
HTM ComDoc 13.

From HTMcommunityDB.org


This page has been updated. Go to HTM ComDoc 13

The application of Failure modes and effects analysis to medical device-related processes

This material was originally presented by Malcolm G. Ridgway, Ph.D., CCE as Chapter 10 of the book "A Practicum for Biomedical Engineering & Technology Management Issues" Edited by Leslie R. Atles, CCE, CBET and published by the Kendall/ Hunt Publishing Company, Dubuque, Iowa in 2008. This book is now out of print.

(This material was last revised on 2-16-15)

13.1 Introduction

Ever since the Institute of Medicine’s 1999 report (HTM ComRef 20.) revealed the huge number of medical errors made in U.S. hospitals, reducing the risk of what the Joint Commission (JC) calls “unanticipated adverse events” (UAEs) has been an extremely important part of the performance improvement agenda of all healthcare organizations. Failure modes and effects analysis (FMEA) is currently one of the more popular methods employed in healthcare to reduce the occurrence of UAEs, and the Joint Commission has specifically cited it as a useful tool since it was first mentioned in the JC standards in 2002. Although it is not the only method of analyzing risk, it is a relatively simple, commonsense approach that has been validated and widely used for many years in many other industries. The VA (Department of Veterans Affairs) National Center for Patient Safety (NCPS) has done extensive work in this area and has developed a specially adapted version of FMEA, known as Healthcare Failure Modes and Effects Analysis (HFMEA), for use throughout the VA’s healthcare facilities. (HTM ComRef 21.)

13.2 What is FMEA?

In essence, FMEA provides a systematic, step-by-step method for identifying system vulnerabilities, that is, ways in which things could go wrong in processes such as patient care (including medical device-related processes), and for prioritizing those vulnerabilities according to the likelihood that the failures will happen and the potential criticality or seriousness of their consequences. It often reveals unexpected opportunities to anticipate errors and put in place effective preventive or mitigation measures. The analysis is very similar to the investigative technique known as root cause analysis (RCA), which the Joint Commission (JC) requires hospitals to use when investigating a serious patient care accident or incident, which it calls a sentinel event. The primary difference between the two is that FMEA is proactive and is conducted prospectively, before there have been any accidents or near misses, whereas RCA is reactive and is conducted retrospectively, after an accident or incident has occurred. Both techniques are based on similar systems analysis concepts, and they are generally considered to be equally powerful.

The use of modern investigative methods such as RCA and FMEA signals an important departure from the “blame-game” approach that has traditionally dominated the investigation of patient injuries in healthcare institutions. In contrast to the traditional hunt to find someone to blame, these methods look for what are generally described in reliability engineering terminology as root causes—underlying factors that can predispose existing processes to adverse outcomes.

The FMEA methodology consists of eight steps.

FMEA Step 1. Mapping the Process

The first step in the analysis is to create a functional description of the process under investigation by breaking it down into its component functional steps. In FMEA jargon, this is called mapping the process. This first step is extremely important with respect to generating a crystal-clear understanding of exactly how the process works. It can be a long and painstaking part of the analysis, but without this initial exercise, it is more difficult to identify the many ways in which the process might go wrong.

This systematic description should represent what actually happens—as described first-hand by the people involved — rather than what is supposed to happen according to some procedure in an “official” work practice manual or similar document. This first, basic step can be very helpful to the investigation because it frequently reveals discrepancies—actual practices that are at odds with the theoretically prescribed, “official” practices — and sometimes it also reveals that actual practices vary significantly within different parts of the organization.

If the process consists of a linear sequence of functional steps, then a simple list of those steps is sufficient to describe the process. If the process turns out to have decision points and subsequent branches, forming a network rather than a linear sequence, it is best described through the use of a process flowchart.
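The distinction between a linear process and a branching one can be illustrated with a small sketch. The Python below is not from the original chapter. It assumes a process map stored as a dictionary that maps each step to the steps that can follow it; the step names and the helpers `is_linear` and `describe` are hypothetical.

```python
# Hypothetical process maps: each step maps to the list of steps that may follow it.
linear_process = {
    "power button is pressed": ["power light is verified"],
    "power light is verified": [],
}

branching_process = {
    "power button is pressed": ["is the power light on?"],
    # A decision point has more than one possible next step.
    "is the power light on?": ["continue to next step", "plug in the pump"],
    "plug in the pump": ["is the power light on?"],
    "continue to next step": [],
}


def is_linear(process):
    """Return True when no step has more than one possible next step."""
    return all(len(next_steps) <= 1 for next_steps in process.values())


def describe(process):
    """Suggest the representation recommended in the text for this process."""
    return "simple list of steps" if is_linear(process) else "process flowchart"
```

Under these assumptions, a map with any decision point is flagged as needing a flowchart, while a map without one can be written as a plain list.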

In either case, although the initial mapping need only include the minimum number of steps required to describe the entire process completely, it can be helpful at the outset to break each of the basic, high-level steps into its constituent substeps. Either way, each of the steps should be described as completely as possible, including any quantitative performance requirements. An article published in Health Devices in July 2004 titled “Failure Mode and Effects Analysis; A Hands-On Guide for Healthcare Facilities” (HTM ComRef 22.) provides an excellent model for performing this type of analysis. It addresses the process of using a modern infusion pump to deliver a drug to a patient. The article starts out by describing the process in the form of a list of six high-level steps. It then breaks down each of those high-level steps into the more than 20 substeps shown here:

1. The patient and the drug are verified against the order.

a. The order is brought to the bedside.
b. The patient is identified.
c. The drug in the IV fluid container is identified.
d. The patient ID is checked against the order.
e. The drug name and concentration are checked against the order.
f. The delivery route is verified against the order.

2. The pump is turned on.

a. The power button is pressed.
b. It is verified that the pump is plugged in and the power light is on.

3. The infusion set is connected and primed.

a. The infusion set is located and verified.
b. The set is connected to the IV fluid container.
c. The IV fluid container is installed and prepared.
d. The set is primed.
e. The set clamps are closed.
f. The set is connected to the patient’s port.

4. The pump is programmed.

a. The programming mode is identified.
b. The drug name and clinical location are identified.
c. The patient’s weight is entered.
d. The solution concentration or drug content and diluent volume are entered.
e. The rate or dose is entered.
f. The volume to be infused (VTBI) is entered.

5. The programming is verified against the order.

a. The drug name and concentration on the container are checked vs. the order.
b. The rate or dose is checked vs. the order.
c. The VTBI is checked vs. the order.
d. The Confirm/Accept button is pressed.

6. Delivery is started (and the drug is delivered to the patient at the programmed rate).

a. The Start button is pressed.
b. Initiation of pumping is verified.
c. The pump delivers the solution or drug at the programmed rate +/– 5%.

In the exercise described in the Health Devices article, the team does not list the very last substep (6c). However, it is instructive for our purpose here to include it. As we will see, the function embodied in a step such as this should be described as completely as possible in precise, quantitative terms: in this case, “pump delivers the solution or drug at the programmed rate +/– 5%” rather than a simpler qualitative description such as “pump delivers solution or drug.” The more complete description allows the investigating team to capture more easily all of the possible ways in which each step (and thus the entire process) could fail.
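For teams that keep their process map in software rather than on paper, the step and substep structure above can be held in a simple nested data structure. The following is a minimal Python sketch, not part of the original chapter. The names `Step`, `process_map`, and `label_substeps` are hypothetical, and only steps 2 and 6 of the infusion pump example are reproduced to keep it short.

```python
from dataclasses import dataclass, field
from string import ascii_lowercase


@dataclass
class Step:
    """One high-level step of the process and its ordered substeps."""
    description: str
    substeps: list = field(default_factory=list)


# Steps 2 and 6 of the infusion pump example, keyed by step number.
process_map = {
    2: Step("The pump is turned on.", [
        "The power button is pressed.",
        "It is verified that the pump is plugged in and the power light is on.",
    ]),
    6: Step("Delivery is started.", [
        "The Start button is pressed.",
        "Initiation of pumping is verified.",
        "The pump delivers the solution or drug at the programmed rate +/- 5%.",
    ]),
}


def label_substeps(steps):
    """Return (label, text) pairs such as ("6c", "...") in process order."""
    labeled = []
    for number in sorted(steps):
        for letter, text in zip(ascii_lowercase, steps[number].substeps):
            labeled.append((f"{number}{letter}", text))
    return labeled
```

Labels produced this way ("2a", "6c", and so on) match the numbering used in the list above, which makes it easier to attach failure modes to a specific substep later in the analysis.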

FMEA Step 2. Identifying the Process’s Failure Modes

In FMEA jargon, the ways in which the process can go wrong and fail to accomplish its primary objective(s) are called failure modes. It is important when conducting an FMEA to distinguish between failure modes and failure causes. A failure mode is identified by answering the question, “How (in what manner, in what way) can each of the identified steps or subprocesses fail?” rather than “Why (what could cause each of the identified potential failures to occur)?” A failure mode is usually described by a phrase containing a noun and a verb, as in: the clinician misreads the ID; the wrong button is pressed; or the wrong concentration is selected from the pick list.

A complete listing of a process’s failure modes identifies all of the possible ways in which the process could fail — that is, be stopped from progressing through to successful completion of the process.

Each substep included in the complete process description is therefore analyzed to identify ways in which it could go wrong. In the Health Devices article previously cited, the investigating team gives as an example seven potential failure modes for substep 4e, “The rate or dose is entered.” The team determined that the following things could go wrong:

1. The rate or dose cannot be located or read properly.
2. The wrong dose is entered.
3. The wrong units are selected.
4. A calculation error is made.
5. A rate or dose error is detected and corrected but the “start” key is not pressed.
6. A wrong order is read.
7. No value is entered.

The article also points out that many failure modes fall into one or more of the following three general categories. This short list may prove helpful to analysts trying to identify potential failure modes.

A step or substep was performed incompletely or wrong (as in #2 and #6).
A step or substep was attempted correctly but some defect in a tool or accessory caused a problem (as in #1).
A step or substep was omitted (as in #7).

It is also helpful to think in terms of potential failed states that would result from the possible functional failures. For example, for the very last substep that we added as 6c, “Pump delivers the solution or drug at the programmed rate +/– 5%,” suppose that the investigating team confirms that the total process would fail if any one of the following failed states occurred:

1. The pump transferred no fluid at all.
2. The pump transferred the fluid at too low a rate.
3. The pump transferred the fluid at too high a rate.

There would then be three specific failure modes, one corresponding to each of these three failed states:

1. The pump transfers no fluid at all.
2. The pump does not transfer the fluid quickly enough.
3. The pump transfers the fluid too quickly.

Note that each of the identified failure modes can have many different underlying causes, but these are not addressed until later in the analysis, in Step 7.
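The value of the quantitative description in substep 6c can be seen in a short sketch. The Python below is illustrative only and is not from the original chapter: the function name `failed_state` and the example rates are assumptions, and the 5% tolerance is taken from the substep description. It classifies a measured delivery rate into one of the three failed states, or returns `None` when the rate is within tolerance.

```python
def failed_state(programmed_rate, measured_rate, tolerance=0.05):
    """Classify a measured delivery rate against the programmed rate.

    Returns one of the three failed states described in the text, or None
    when the measured rate is within the stated tolerance.
    """
    if programmed_rate <= 0:
        raise ValueError("programmed_rate must be positive")
    if measured_rate <= 0:
        return "The pump transferred no fluid at all."
    if measured_rate < programmed_rate * (1 - tolerance):
        return "The pump transferred the fluid at too low a rate."
    if measured_rate > programmed_rate * (1 + tolerance):
        return "The pump transferred the fluid at too high a rate."
    return None
```

With a purely qualitative description ("pump delivers solution or drug"), only the first of these states would be recognizable as a failure; the too-low and too-high states only become definable once the tolerance is written down.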

Next we must narrow down the analysis by:

  • identifying the possible severity of the adverse consequences of each of the possible failure modes (in Step 3),
  • estimating the relative likelihood of each of these adverse “effects” actually occurring (in Step 4), and then
  • assessing the detectability of each adverse occurrence (in Step 5).

A problem or threat that is likely to be obvious to the staff users or attendants is considered to be less critical than a threat that is unlikely to be detected in time for some kind of preventive intervention or accident-avoidance action.

The investigating team must then decide how deeply it wishes to extend the analysis beyond the most critical adverse consequences. After each of these factors has been evaluated and quantified, the failure modes are ranked according to their relative criticality or seriousness (in Step 6).

Several different kinds of adverse consequences can be taken into consideration. Which ones are selected depends on the objectives of the analysis. In applying FMEA to the use of medical devices, one very important area to consider is patient safety. Another safety-related area that could be considered is potential injury of the device operators, and yet another is possible adverse impact on the immediate and less-immediate environment surrounding the patient. Sometimes this type of analysis is also used to investigate possible adverse economic impacts on the institution resulting from a slowing down of the operational process itself—for example, the adverse impact on the productivity of the facility if an important resource such as the facility’s only CT scanner or blood chemistry analyzer breaks down.

A further economic area that might be considered is the cost of repairs resulting from preventable equipment failures. These two economic areas (lost productivity and repair costs) are more likely to be considered if the objective of the analysis is to select optimum equipment maintenance strategies.

FMEA Step 3. Identifying and Quantifying the Severity of the Possible Adverse Consequences (Effects) of the Process’s Failure Modes

With respect to estimating the severity of the possible adverse consequences of a particular failure mode, or the corresponding failed state, such as the failure of the infusion pump to turn on, it is only necessary to allocate the potential consequence to one of several relatively broad categories. Several scales have been developed, but two appear to have been used most frequently. Juran’s classic quantitative model uses a 10-point scale to rate the severity of the possible adverse effect. (HTM ComRef 23.) The other frequently used scale, which is the one adopted in the VA’s HFMEA model, has just four levels.

The use of broadly defined scales such as these has been criticized as being imprecise or even arbitrary, but it is important to recognize that these numerical scales are used only to provide a simple, practical tool to enable the investigation team to rank the relative criticality or seriousness of the potential adverse outcomes. Events with similar criticality rankings are not necessarily identical, but they are broadly comparable in terms of their potential overall impact. The use of these tools provides a simple and useful way to perform this kind of analysis, and they have been widely accepted within the reliability engineering community. The Joint Commission does not require the use of any particular scale. Its guideline publication on FMEA (HTM ComRef 24.) provides sample 10-point scales for severity rating (as well as for the probability of occurrence and detectability). The VA’s four-point rating scale for severity, as it relates to patient safety, is shown below. The FMEA investigation team will be called upon to use its collective professional judgment to determine where to place the various findings or estimates on these numerical scales.

Level 4. Catastrophic event: Could be life-threatening or cause a major injury
Level 3. Major event: Could cause injury with permanent lessening of bodily function
Level 2. Moderate event: Could cause increased level of care or length of stay
Level 1. Minor event: No resulting injury or increase in level of care or length of stay

In our example of the infusion pump analysis, some of the potential failure modes have a consequence (such as the wrong drug being administered) that is clearly in the category of potentially catastrophic. This would rate at Level 4 on the VA’s HFMEA scale.

FMEA Step 4. Identifying and Quantifying the Probability That the Failure Modes Will Occur

With respect to estimating how likely it is that this kind of event will happen, it is helpful to try to find some documented statistics on the occurrence of this type of incident in a reasonably reliable data repository, such as the FDA’s Manufacturer and User Facility Device Experience (MAUDE) database, or in a fairly recent report in one of the industry’s professional journals. Failing this, it will be up to the technical members of the FMEA team to provide their professional judgment on the most likely probability of this type of event occurring. The VA’s four-point rating scale for probability of occurrence is shown next:

Level 4. Frequent: Likely to occur immediately or within a short period (several occurrences in 1 year)
Level 3. Occasional: Probably will occur (several occurrences in 1–2 years)
Level 2. Uncommon: Possible that it will occur (one occurrence sometime in 2–5 years)
Level 1. Remote: Unlikely to occur (one occurrence sometime in 5–30 years)

FMEA Step 5. Identifying and Quantifying the Detectability of These Possible Failure Modes; and

FMEA Step 6. Combining These Factors into a Composite Measure of Seriousness or Criticality

The classic 10-point rating of detectability is used in conjunction with a criticality index (CI) or risk priority number (RPN), which is obtained by multiplying together all three of the 10-point ratings (severity, probability of occurrence, and detectability). Thus, the classic RPN or CI is a number between 1 and 1,000. The VA’s system instead combines its two 4-level variables into a numerical hazard score, which can vary between 1 and 16, by multiplying together the two 4-point ratings for severity and probability of occurrence. An event that is rated at Level 4 in either category is given a quantitative rating of 4 points, an event that is rated at Level 3 is given 3 points, and so on. (See Figure 13.1.)

This quantitative representation of the level of risk associated with a particular failure mode, in the form of either the HFMEA hazard score or the classic RPN, is an attempt to provide an objective measure of the relative seriousness of the various hazardous outcomes, so that the facility’s scarce resources can be applied where they will achieve the greatest improvement in patient safety. Hazards with higher scores represent higher priority targets for mitigation.
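The arithmetic behind the two composite measures is simple enough to sketch in a few lines of Python. This example is not from the original chapter. The function names are hypothetical, and the ratings assigned to the three failure modes are invented purely to illustrate the ranking; they do not come from the Health Devices article or from any real analysis.

```python
def risk_priority_number(severity, occurrence, detectability):
    """Classic RPN (or CI): product of three ratings, each on a 1-10 scale."""
    for rating in (severity, occurrence, detectability):
        if not 1 <= rating <= 10:
            raise ValueError("classic ratings must be between 1 and 10")
    return severity * occurrence * detectability


def hazard_score(severity, probability):
    """HFMEA hazard score: product of two ratings, each on a 1-4 scale."""
    for rating in (severity, probability):
        if not 1 <= rating <= 4:
            raise ValueError("HFMEA ratings must be between 1 and 4")
    return severity * probability


# Illustrative (severity, probability) ratings only; not from any real analysis.
example_ratings = {
    "The wrong dose is entered.": (4, 2),
    "No value is entered.": (2, 3),
    "The wrong units are selected.": (3, 1),
}

# Rank failure modes from highest to lowest hazard score.
ranked = sorted(
    example_ratings,
    key=lambda mode: hazard_score(*example_ratings[mode]),
    reverse=True,
)
```

The range checks reflect the limits stated above: an RPN can never fall outside 1 to 1,000, and a hazard score can never fall outside 1 to 16.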

The VA uses a different method of factoring in detectability. Rather than create a compound index by multiplying together three scales, including one for detectability, it chooses to use a separate decision tree (reproduced here as Figure 13.2 - Healthcare failure mode and effects analysis detectability decision tree) to determine whether the failure mode in question warrants further consideration. The algorithm built into the decision tree indicates that the outcome in question will continue to be investigated if all three of the following statements are true:

1. The product of the severity and probability scores (the hazard score) is 8 or greater.
2. There are no control measures already in place.
3. The hazard cannot be detected before it causes damage.

Let us suppose that the failure mode under consideration has a potentially catastrophic consequence (Level 4 severity), and that the team estimates this type of failure to be uncommon (i.e., it could be expected to occur once in two to five years), which is a Level 2 probability of occurrence on the HFMEA rating scale. In the VA’s hazard scoring matrix, the combined rating would be a hazard score of 4 × 2 = 8. Consulting the HFMEA decision tree, in the first box the hazard would qualify as warranting control (because of its hazard score of 8). If it also receives a finding of “No” in Box #3, because there is no known-to-be-effective control measure, and in Box #4, because the hazard is not readily detectable, then this particular failure mode becomes a candidate for some kind of control measure.
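The three conditions listed above can be written as a single test. The sketch below is a simplified Python rendering of those three statements only, as they are worded in this chapter; it is not a reproduction of Figure 13.2, and the VA's published decision tree may contain branches that are not captured here. The function and parameter names are hypothetical.

```python
def needs_control_measure(severity, probability, control_in_place, detectable):
    """Apply the three conditions stated in the text.

    severity, probability: HFMEA ratings on the 1-4 scales.
    control_in_place: True if an effective control measure already exists.
    detectable: True if the hazard can be detected before it causes damage.
    """
    score = severity * probability  # the hazard score
    return score >= 8 and not control_in_place and not detectable


# The worked example from the text: Level 4 severity, Level 2 probability,
# no effective control measure, hazard not readily detectable.
worked_example = needs_control_measure(4, 2, control_in_place=False, detectable=False)
```

Changing any one of the three inputs in the worked example (a lower hazard score, an existing control measure, or a readily detectable hazard) makes the function return `False`, which corresponds to the failure mode dropping out of further consideration.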

FMEA Step 7. Identifying the Possible Causes of Failure

Identifying each of the chains of causes underlying each failure mode, by repeatedly asking the question, “How could this occur?” is often referred to as “peeling the onion,” and it should be continued until there are no further answers that identify causes that are within the ability of the investigating team to control. When there are no further answers to the question down one particular chain, those particular underlying causes are considered to be root causes.

In our previous example, one of the causes underlying the failure mode identified as #5 might be that the clinician is unfamiliar with the change procedure for this particular infusion pump. Drilling down further and asking how this could happen might bring the investigating team to a consideration of whether it is reasonable to rely completely on periodic in-service user training on the change procedure, particularly if the training program is not carefully designed or given a reasonably high priority by the facility because of a shortage of training resources. Underlying resource-allocation-type causes are fundamental issues that could depend on decisions made at a fairly high level in the organization. It is for the team to decide whether these types of causes are true root causes.

For example, it could be argued that there are further causes underlying these issues, such as the institution having very limited funds available. These causes are, however, probably beyond the jurisdiction and control of the investigating team.
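The "peeling the onion" procedure amounts to walking down each chain of causes until the next layer lies outside the team's control. A minimal Python sketch of that idea follows. It is not from the original chapter: the `Cause` class, the `root_causes` function, and the wording of each cause are hypothetical, and marking the funding issue as outside the team's control is an assumption made for illustration. As the text notes, that judgment belongs to the team.

```python
from dataclasses import dataclass, field


@dataclass
class Cause:
    """One answer to "How could this occur?" plus the causes beneath it."""
    description: str
    within_team_control: bool = True
    underlying: list = field(default_factory=list)


def root_causes(cause):
    """Return the deepest causes on each chain that the team can still control."""
    controllable = [c for c in cause.underlying if c.within_team_control]
    if not controllable:
        # Nothing further down this chain is within the team's control.
        return [cause.description]
    found = []
    for child in controllable:
        found.extend(root_causes(child))
    return found


# The chain discussed in the text for failure mode #5 (wording is illustrative).
chain = Cause(
    "The clinician is unfamiliar with the change procedure for this pump.",
    underlying=[Cause(
        "Familiarity relies entirely on periodic in-service user training.",
        underlying=[Cause(
            "Training is given low priority because training resources are short.",
            underlying=[Cause(
                "The institution has very limited funds available.",
                within_team_control=False,
            )],
        )],
    )],
)
```

Calling `root_causes(chain)` on this example stops at the training-priority cause, because the only cause beneath it has been marked as beyond the team's control.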

When brainstorming possible causes of particular failure modes, one widely recommended technique is to explore each of the following five general areas for sources of possible problems:

1. Equipment-related issues (e.g., availability of equipment; human factors design of the equipment; maintenance of the equipment; etc.)
2. Other people-related issues (e.g., staffing levels; scheduling; competence assessment; training; communication; etc.)
3. Materials-related issues (e.g., availability of supplies; misplaced supplies; etc.)
4. Process design-related issues (e.g., relevant policies of the institution and procedures used by the staff; etc.)
5. Environment-related issues (e.g., lighting levels; noise levels and other possible sources of distraction; the supply of power and other sources of support that may be required; etc.)

The diversity of these five areas is one of the reasons that an FMEA is best undertaken by a multidisciplinary team. Of course, clinical engineers and biomedical equipment technicians are particularly qualified to address any equipment-related issues. And it is probably also fair to observe that whenever there is equipment involved in a process under investigation, there is a tendency on the part of the personnel involved in the process to point first to the possibility that the problem lies with the equipment.

FMEA Step 8. Identifying How the Vulnerabilities Revealed in the Selected Process Can Be Eliminated or Reduced through Control Measures, and at What Relative Cost

There are four general guidelines when it comes to choosing the best control measures:

1. Try to place the control measure at the earliest possible point in the process.
2. Multiple control measures are better than relying on just one.
3. There are different types of control measures, and some types are generally more effective than others. The list (in Figure 13.3 - The relative effectiveness of various control and mitigation strategies) presents several different types of mitigation strategies and control measures in their approximate order of effectiveness (number 1 being more effective than number 10).
4. If there appear to be no effective control measures, consideration should be given to completely redesigning the entire process.

13.3 Testing Process Modifications Prior to Full Implementation

A word of caution is in order here. Once the analysis and subsequent corrective actions have been completed, it is important to reanalyze the new process. Sometimes process modifications introduce other, unintended and unexpected hazards. It is a very good idea prior to going forward with full-scale implementation of the modifications to conduct one or more pilot tests to confirm that the intended improvements have indeed been achieved.

13.4 Evidence-based Design

One final word: a very promising byproduct of the increasing adoption of FMEA practices in healthcare is the emergence of consulting organizations that specialize in collaborating with architects and engineers to review designs for new medical facilities and equipment while they are still on the drawing board. Conducting an FMEA on proposed designs at this stage can expose flaws that would generate potential hazards. Making mitigating changes at this point in the production cycle helps assure that an improved level of safety is literally built into the design of these new resources. One particularly interesting new development is the growing use of independent, third-party human factors laboratories to evaluate medical devices while they are still in the prototype stage.

13.5 Summary

Failure Modes and Effects Analysis (FMEA) is a powerful but relatively simple and practical tool for analyzing patient care processes, particularly those involving medical devices. Its increasing use in healthcare facilities, encouraged by The Joint Commission, promises to shift the focus of healthcare professionals charged with improving patient safety away from the traditional “culture of blame” and toward the systematic identification of root causes that predispose circumstances surrounding the patient to a greater likelihood of accidents. In addition to the references cited in this chapter, many additional resources can be found by entering “failure mode and effects analysis in healthcare” into any of the popular online search engines. It is a rapidly expanding area of interest. When this inquiry was entered into the Google search engine in August 2007, it returned about 1,760,000 hits. No doubt this trend will continue.

This page was last modified 22:37, 2 October 2018.