Common Errors in Healthcare Failure Mode and Effects Analysis (HFMEA®)

With the burgeoning interest in healthcare failure mode and effects analysis (HFMEA®), systematic errors have unfortunately been injected into the methodology, even in the examples promulgated by leaders in the patient safety movement. The errors lie in the design and application of the HFMEA methodology as it has been transitioned from its non-healthcare origins to its current healthcare applications. The problems we have noted typically fall into the following areas:


Analysis of System versus Process. As with a root cause analysis, the first and one of the most important steps in conducting an effective failure mode and effects analysis is the selection of a suitable topic or problem. In both of these analytic techniques we have observed a strong tendency to select a system rather than a process for analysis. Conducting an analysis of a system can certainly be done, but the best way to do so is to perform a separate analysis of each process within the system and then integrate the results. The sheer complexity and workload involved in a comprehensive, systems-level analysis is virtually guaranteed to lead to inadequate analysis as well as staff burnout and anger. Exactly what is the distinction between a process and a system? Essentially, it is a matter of size and complexity. Any system is composed of multiple processes: two or more, and generally many more. As a rule of thumb, when you flowchart a sequence of events (or list the sequence, if you dislike flowcharting), any time you have more than 25 or so items you are likely dealing with a system, not a process. We have found this to be an effective guide for appropriately limiting the scope of analysis.


Low Prioritization of Low-Frequency, High-Severity Outcomes. This problem results directly from the origin of FMEA – the automotive industry and its attempts to improve product development. In its original form FMEA typically did not address outcomes that would be considered calamitous, though they might be severe – the problems addressed were usually of the nature of increased costs due to products being out of specification, production line slowdowns, lost productivity, etc. Even very severe outcomes (Effects) in that environment were “acceptable” if they were sufficiently rare. In healthcare, however, an ever-present potential outcome is death and/or disability. Even an infrequent but avoidable death is not acceptable. Hence the problem… Criticality is the arithmetic product of Severity and Occurrence ratings. This is true whether one uses the traditional scoring range of 1-10, the simplified scoring system used by the VA National Center for Patient Safety (1-4) or any other scoring schema. It is, however, easiest to explain when using the standard scoring range as an example. Consider an analysis in which there are two Effects of different Failure Modes. One Effect’s severity is minor (it has been scored “2”) and occurs very frequently (Occurrence has been scored “10”). The other Effect occurs very rarely (Occurrence has been scored “1”), but it can lead to patient death (its Severity is therefore scored “10”). The Criticality of the first Failure Mode is “20” and that of the second is “10.” In terms of prioritization of intervention, one would therefore address the first Failure Mode and its frequent but minor adverse Effect before one addressed the Failure Mode which might lead to patient death. This is not a failure of the methodology. 
It reflects only a failure to recognize that the original methodology should be modified slightly so that the prioritization of intervention matches the actual importance of the potential adverse outcomes – and this is, after all, the entire purpose of conducting an FMEA. The compensation? Simply place ALL Failure Modes that lead to the highest-scoring Severity outcome high in the prioritization of issues to be addressed, irrespective of the numerical scoring.
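The compensation described above can be sketched in a few lines. This is an illustrative Python fragment only: the scores are those from the example in the text, and the severity-override rule is the modification proposed here, not part of classic FMEA scoring.

```python
# Criticality = Severity x Occurrence on the traditional 1-10 scales.
# The compensation: any Failure Mode whose Effect carries the maximum
# Severity score is forced to the top of the list, regardless of its
# numeric Criticality.

MAX_SEVERITY = 10  # top of the traditional 1-10 severity scale

failure_modes = [
    # (description, severity, occurrence) -- scores from the example above
    ("frequent but minor effect", 2, 10),   # Criticality 20
    ("rare but fatal effect", 10, 1),       # Criticality 10
]

def priority_key(fm):
    _, severity, occurrence = fm
    criticality = severity * occurrence
    # Highest-severity outcomes outrank everything; Criticality breaks ties.
    return (severity == MAX_SEVERITY, criticality)

ranked = sorted(failure_modes, key=priority_key, reverse=True)
print([name for name, _, _ in ranked])
# The rare fatal effect now ranks first despite its lower Criticality.
```

Without the override, plain Criticality sorting would put the frequent minor effect first – precisely the inversion the text describes.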


Prioritization by Criticality versus Risk Priority Number (RPN). Several organizations have elected to base their prioritization of intervention upon Criticality rather than Risk Priority Number (RPN). This ignores the value of the detection ratings which constitute an integral part of the HFMEA methodology. Even the decision tree technique developed by the VA does not allow quantitative comparison in the way that the traditional Detection scoring system permits. It is a simplification that carries with it a regrettable loss of robustness in the decision-making process.
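The loss of robustness is easy to see numerically. In this hedged sketch (the scores are hypothetical, not drawn from any organization's tables), two Failure Modes tie on Criticality but separate cleanly once a Detection rating (1 = almost certain detection, 10 = almost no chance of detection) enters the RPN:

```python
def criticality(severity, occurrence):
    return severity * occurrence

def rpn(severity, occurrence, detection):
    # Risk Priority Number = Severity x Occurrence x Detection
    return severity * occurrence * detection

# Hypothetical scores for two Failure Modes:
fm_a = {"severity": 6, "occurrence": 4, "detection": 2}  # easily detected
fm_b = {"severity": 6, "occurrence": 4, "detection": 9}  # rarely detected

print(criticality(fm_a["severity"], fm_a["occurrence"]),
      criticality(fm_b["severity"], fm_b["occurrence"]))  # 24 24 -- a tie
print(rpn(**fm_a), rpn(**fm_b))  # 48 216 -- a clear ordering
```

Prioritizing by Criticality alone treats these two Failure Modes as equal; RPN correctly ranks the poorly detected one far higher.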


Analysis by Failure Mode versus by Contributory Factor. This is the error with the most far-reaching consequences, because it is a failure of underlying philosophy as well as of application. It is also made by some in the non-healthcare arena who use an over-simplified variant of FMEA. It is important to emphasize here that the only reason for conducting an FMEA or HFMEA is to identify potential process failures before they occur and to prioritize interventions in accordance with the resources available and the importance of the possible consequences of those failures. In conducting an FMEA, one identifies (among other tasks) the following basic elements: Functions, Failure Modes, Effects and Causes. A Function is how something is supposed to work (e.g., IV flow rate at exactly 20 cc per minute). A Failure Mode is how that Function might go awry (e.g., higher flow rate, lower flow rate, no flow, inconsistent flow – each of these is a Failure Mode). An Effect is the consequence of the Failure Mode (e.g., dehydration, edema…). A Cause, or Contributory Factor, is something that leads to its super-ordinate Failure Mode (e.g., equipment malfunction, setup error by staff, etc.). It is possible to calculate Criticality (and therefore RPN) based upon the Occurrence of either the Failure Mode or its subordinate Contributory Factors. These calculations in turn determine which problem areas are identified for intervention (or the order in which they are addressed). Herein lies the problem.
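These elements can be modeled as a small data structure. The following Python sketch is purely illustrative – the class and field names, and the scores, are our own, not part of any FMEA standard:

```python
from dataclasses import dataclass, field

@dataclass
class Cause:                  # a Contributory Factor
    description: str
    occurrence: int           # occurrence rating, 1-10

@dataclass
class FailureMode:
    description: str          # how the Function might go awry
    effect: str               # the consequence of the failure
    severity: int             # severity of the Effect, 1-10
    causes: list = field(default_factory=list)

fm = FailureMode(
    description="IV flow rate higher than set",
    effect="fluid overload / edema",
    severity=8,               # hypothetical score
    causes=[Cause("pump malfunction", 2), Cause("setup error by staff", 5)],
)

# Criticality can be computed per Cause -- the distinction this
# section is about: each Cause gets its own Occurrence x Severity.
per_cause = [c.occurrence * fm.severity for c in fm.causes]
print(per_cause)  # [16, 40]
```

Note that scoring the Failure Mode as a whole would collapse these two Causes into a single number, hiding the fact that the setup error dominates.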


There can often be multiple Causes of a single Failure Mode, and in any process being examined there will usually be several different Failure Modes. Consider such a situation. One Failure Mode has three independent Causes; one of these occurs fairly frequently, accounting for 80% of the occurrences of its super-ordinate Failure Mode. A second Failure Mode, whose Effect is equal in Severity to the first, has four Causes, which result with varying frequency in their super-ordinate Failure Mode. A third Failure Mode with an equally severe Effect has two Causes, each of which leads to the Failure Mode with a different frequency. All three Failure Modes occur with equal frequency overall – an Occurrence rating of “8.” For simplicity, let us assume that there are no detection systems, so that Criticality equals RPN, and let us use the same Severity rating – a score of “6” – for all Failure Modes. In an analysis by Failure Mode, all three Failure Modes would be seen as equally important, and no distinction could be made as to what to try to “fix” first.


Failure Mode      Occurrence Score   Severity Score   Criticality
FM1 (3 Causes)    8                  6                48
FM2 (4 Causes)    8                  6                48
FM3 (2 Causes)    8                  6                48


There is no way to prioritize issues to address based upon the above. However, in an analysis by Cause (or Contributory Factor), we find significant differences, and clear guidance in terms of which issues to address first.



Failure Mode – Cause   Occurrence Score   Severity Score   Criticality
FM1 – Cause 1          6                  6                36
FM1 – Cause 2          1                  6                6
FM1 – Cause 3          1                  6                6
FM2 – Cause 1          6                  6                36
FM2 – Cause 2          4                  6                24
FM2 – Cause 3          2                  6                12
FM2 – Cause 4          1                  6                6
FM3 – Cause 1          7                  6                42
FM3 – Cause 2          1                  6                6


In the real world of limited resources, which method tells you how to expend your resources most effectively: “Address all Causes for all Failure Modes” or “Address the Causes that account for 80% of the problem”?
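The two analyses above can be reproduced in a short script. This sketch uses the scores from the worked example (Severity 6 throughout, no detection systems, and FM2’s first Cause taken at Occurrence 6, consistent with its Criticality of 36):

```python
SEVERITY = 6        # same Severity for every Failure Mode, as above
FM_OCCURRENCE = 8   # every Failure Mode's overall Occurrence rating

# Occurrence scores of each Cause, grouped by Failure Mode.
causes = {
    "FM1": [6, 1, 1],
    "FM2": [6, 4, 2, 1],
    "FM3": [7, 1],
}

# Analysis by Failure Mode: every Criticality is 8 x 6 = 48 -- a
# three-way tie, with no basis for prioritization.
by_fm = {fm: FM_OCCURRENCE * SEVERITY for fm in causes}
print(by_fm)

# Analysis by Cause: Criticalities differ, so the ordering is clear.
by_cause = {
    f"{fm} - Cause {i}": occ * SEVERITY
    for fm, occs in causes.items()
    for i, occ in enumerate(occs, start=1)
}
ranked = sorted(by_cause.items(), key=lambda kv: kv[1], reverse=True)
print(ranked[:3])  # the few Causes that account for most of the problem
```

The top three entries of the by-Cause ranking identify the dominant Cause of each Failure Mode, which is exactly the guidance the by-Failure-Mode analysis cannot provide.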


HFMEA® is a registered trademark of CCD Health Systems.
