
HTM ComDoc 4

From HTMcommunityDB.org


Consideration of the device's PM-related reliability

(This document was last revised on 9-5-18)

4.1 Introduction

The analysis described in HTM ComDoc 3 addresses whether a device could pose a risk of serious injury or death to a patient or staff member if the device should fail from a PM-preventable cause. In effect, this analysis separated the entire universe of medical devices into two categories. One category, which is similar to what the Centers for Medicare & Medicaid Services (CMS) characterizes as “critical equipment,” (HTM ComRef 28) makes up a sub-inventory of devices that, to be compliant with the intent of the CMS regulation, should continue to be maintained according to manufacturer recommendations. With the exception of four CMS-specified subcategories (HTM ComRef 33), the second category of devices can be incorporated into what the Task Force calls a Phase 1 AEM program. This categorization was achieved by using the first of the Task Force's two recommended risk criteria - the "severity of PM-related harm" criterion.

In HTM ComDoc 3, we mentioned a second AEM program inclusion criterion, which the Task Force calls the “likelihood of PM-preventable failures” criterion. It identifies manufacturer-model versions of "potential high PM-risk devices" that have been shown (and documented in Table 13) to be highly unlikely to fail from a PM-preventable cause. Although no specific language in the CMS regulation addresses this possibility, the Task Force believes that a good case can be made for making these particular versions eligible for an AEM program because of their substantial, documented record of acceptable PM-related reliability. In this article, we describe the process for identifying, by manufacturer-model, these additional devices that are then considered eligible for inclusion in the AEM program.

By combining the Task Force’s “likelihood of PM-preventable failures” criterion with the “severity of PM-related harm” criterion (see Figure 4.2 AEM eligibility based on outcome severity and risk of failing), the number of devices that need to be maintained according to manufacturer recommendations can be reduced even further than is achieved with just a basic Phase 1 AEM program (see Figure 4.1 AEM eligibility based on outcome severity of failure). This more comprehensive analysis divides the various manufacturer-model versions of each of the different device types into seven categories of PM risk, ranging from high PM risk to zero PM risk. Using three levels of likelihood that the device will fail from a PM-preventable cause (quite likely, unlikely, and very unlikely) increases the number of categories of PM-related risk from the five achieved with a Phase 1 AEM program to the seven seen in the left column of Figure 4.2 AEM eligibility based on outcome severity and risk of failing.

4.2 Determining a device's PM-related reliability

The basic measure of the likelihood that a device will fail from a PM-preventable cause (its PM-related reliability) is the frequency with which PM-preventable device failures are encountered during everyday use. In addition to noting the frequency of PM-preventable total failures of the devices during everyday use, tallying the frequency with which hidden failures are discovered during routine PM inspections is equally important. The failure count should also include hidden failures such as when a device does not pass one or more critical, manufacturer-recommended performance or safety checks or when one or more critical, non-durable parts for which the manufacturer recommends restoration are found to be already past their optimum restoration point. The Task Force’s recommended codes for these findings are described below in Section 4.7.

The PM-related reliability of each make-model version of a particular device type can be expressed either as its PM-related failure rate (i.e., how many PM-related failures are encountered over a certain time period, both during everyday use and when PMs are performed) or as the corresponding mean time between failures (MTBF). Using the MTBF metric is preferred because the failure rates usually will be fractional, whereas the corresponding MTBF is a larger, more readily comprehended number. (See Section 1.1.1 in HTM ComDoc 1.)
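As a minimal sketch of how the two metrics relate, assume that failures are simply counted over an experience base measured in device-years. The function names and the sample figures below are illustrative assumptions, not part of any Task Force tool or data set.

```python
from typing import Optional


def pm_failure_rate(failures: int, device_years: float) -> float:
    """PM-related failures per device-year of experience."""
    if device_years <= 0:
        raise ValueError("device_years must be positive")
    return failures / device_years


def pm_mtbf_years(failures: int, device_years: float) -> Optional[float]:
    """MTBF in years; None when no failures were observed (undetermined)."""
    if device_years <= 0:
        raise ValueError("device_years must be positive")
    if failures == 0:
        return None
    return device_years / failures


# Hypothetical example: 4 PM-related failures in 600 device-years.
rate = pm_failure_rate(4, 600)   # a small fraction of a failure per device-year
mtbf = pm_mtbf_years(4, 600)     # 150.0 years
```

The fractional rate and the 150-year MTBF describe the same experience; the MTBF is simply the reciprocal of the rate.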

4.3 Acceptable levels of PM-related reliability

Although CMS (HTM ComRef 28) appears to accept the premise that device risk is a combination of the worst-case outcome severity of the device failure and the likelihood that such a failure will occur, no debate has appeared in the published literature about measuring PM-related reliability or, more importantly, about which levels should be considered acceptable and which unacceptable.

Based on a relatively small amount of initial data collected for just a few manufacturer-models of defibrillators (Table 5.4 Defibrillator/monitors), the Task Force has set an initial placeholder for the threshold for an acceptable level of PM-related reliability for potential PM-critical devices, such as defibrillators, at not more than one failure every 75 years. In other words, if a particular manufacturer-model defibrillator demonstrates that it experiences a PM-preventable failure no more frequently than once every 75 years, it should be considered sufficiently reliable to be included in an AEM program. Defibrillators are in the category of devices that potentially have the most serious (LOS 3) adverse outcomes when they fail. The Task Force believes that it is reasonable to set the thresholds for devices with less serious levels of adverse outcome severity at somewhat lower levels. Accordingly, we have set the MTBF threshold placeholder for devices with less serious (LOS 2) levels of outcome severity at not more than one failure every 50 years and, for devices with even less serious (LOS 1) levels of outcome severity, at not more than one failure every 25 years. (See Table 4.1 Defining three levels of PM-related reliability (i.e., the device's likelihood of failing), below.)

PM-related reliability                       For LOS 3      For LOS 2      For LOS 1
(device's likelihood of failing              devices        devices        devices
from a PM-preventable cause)

(device quite likely to fail)                < 75 yrs       < 50 yrs       < 25 yrs
(device unlikely to fail)                    75-150 yrs     50-100 yrs     25-50 yrs
Very good (device very unlikely to fail)     > 150 yrs      > 100 yrs      > 50 yrs

To define the seven levels of PM-preventable risk shown in Figure 4.2 AEM eligibility based on outcome severity and risk of failing, the Task Force used the three ranges shown in Table 4.1, above, for the likelihood (probability) of a device failing from a PM-preventable cause: quite likely, unlikely, and very unlikely. We also defined tentative ranges of MTBF values corresponding to each of those three levels. Implicit in these threshold values is the idea that the transition point between “quite likely” and “unlikely” for a critical (level of severity [LOS] 3) device is a value beyond which the “critical” device should be considered sufficiently reliable that it can be included in an AEM program. As noted below, in Section 4.4, the Task Force is planning to use actual maintenance data as a more rational basis for determining what the threshold levels should be.
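The placeholder thresholds in Table 4.1 can be sketched as a simple lookup. This is only an illustration under stated assumptions: the function and dictionary names are hypothetical, and values that fall exactly on a boundary (for example, 75 or 150 years for LOS 3) are treated as belonging to the middle range, as the table's range labels suggest.

```python
# Lower and upper MTBF bounds (in years) for the middle "unlikely to fail"
# range, keyed by level of severity (LOS), taken from Table 4.1.
MTBF_BOUNDS_BY_LOS = {
    3: (75, 150),
    2: (50, 100),
    1: (25, 50),
}


def reliability_level(mtbf_years: float, los: int) -> str:
    """Map a PM-related MTBF to one of the three likelihood-of-failure levels."""
    if los not in MTBF_BOUNDS_BY_LOS:
        raise ValueError("los must be 1, 2, or 3")
    lower, upper = MTBF_BOUNDS_BY_LOS[los]
    if mtbf_years < lower:
        return "quite likely to fail"
    if mtbf_years <= upper:
        return "unlikely to fail"
    return "very unlikely to fail"


# The same hypothetical 60-year MTBF is rated differently at each severity level.
for los in (3, 2, 1):
    print(los, reliability_level(60, los))
```

The loop illustrates why the thresholds are tied to outcome severity: an MTBF that falls short for an LOS 3 device can be acceptable for an LOS 2 or LOS 1 device.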

4.4 What levels of acceptable PM-related reliability do the manufacturer’s recommendations imply?

It seems reasonable to presume that a device maintained according to its manufacturer's recommendations will demonstrate a level of PM-related reliability that the manufacturer considers to be safe and acceptable. Further, because the Food and Drug Administration (FDA) has approved the device as safe and effective, it also seems reasonable to assert that the FDA has tacitly approved this same level. Therefore, the Task Force is planning to explore what actual levels are found for various devices maintained according to their manufacturer-recommended procedures.

We expect to find that the actual levels will vary over a range. If the range is broad, we propose to adopt either the average value or an average that is weighted according to the relative amounts of data in the sampled experience bases.
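A minimal sketch of the weighted option, assuming that "amount of data" means the number of device-years behind each reported MTBF. The function name and the sample values are hypothetical.

```python
from typing import List, Tuple


def weighted_average_mtbf(samples: List[Tuple[float, float]]) -> float:
    """Average MTBF weighted by the device-years behind each sample.

    Each sample is a (mtbf_years, device_years) pair.
    """
    total_device_years = sum(device_years for _, device_years in samples)
    if total_device_years <= 0:
        raise ValueError("total device-years must be positive")
    weighted_sum = sum(mtbf * device_years for mtbf, device_years in samples)
    return weighted_sum / total_device_years


# Hypothetical experience bases: (indicated MTBF in years, device-years of data).
samples = [(80.0, 100.0), (120.0, 300.0)]

plain_average = sum(mtbf for mtbf, _ in samples) / len(samples)  # 100.0
weighted_average = weighted_average_mtbf(samples)                # 110.0
```

The weighted figure leans toward the MTBF reported from the larger experience base, which is the intent of weighting by the amount of data.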

4.5 Communitywide database needed

Collecting sufficient data to provide a statistically meaningful body of evidence to support the use of particular alternate maintenance strategies may prove difficult for many individual healthcare facilities, for the following reasons:

  • Because they are designed and constructed by different entities, different manufacturer-model versions of devices with the most severe (LOS 3) outcomes (e.g., defibrillators, critical care ventilators) will likely display different levels of reliability. This means that the maintenance findings for each manufacturer-model version of these device types will need to be analyzed separately.
  • Devices that have the most severe (LOS 3) outcomes are presumably designed to be very reliable; therefore, they will likely demonstrate a correspondingly low PM-related failure rate. This anticipated high reliability will reduce the number of failures that an individual facility will be able to document over a reasonable time period.
  • Many healthcare facilities will have only a small number of different manufacturer-model versions of the device types that have the most severe (LOS 3) outcomes.

To illustrate this quandary, suppose that a facility has three similar (same manufacturer, same model) heart-lung units and only three years of maintenance history for each unit. This amounts to an experience base of only nine device-years. If the actual PM-related MTBF of the units is greater than nine years, then the facility may not have experienced even one PM-preventable failure during the three-year observation period. (The Task Force expects to find that the PM-related MTBF values for typical high-reliability devices will be at least 75 years.)

In this case, the facility would have to report its finding with respect to the devices’ indicated failure rate (zero failures experienced during the nine device-years of exposure) as “undetermined.” Even if the devices experienced one or more failures during this relatively short exposure, the indicated MTBF (reported as “up to nine years”) will appear to be unacceptably short for a device that is potentially a high PM risk (PM Priority 1) device. With an indicated MTBF this low, it would be prudent for the facility to look at the PM-related reliability for this device type in the database on the MPTF website to determine whether or not its experience is typical. For more on this possible situation, see Ridgway and Lipschultz (HTM ComRef 15) and Ridgway and Fennigkoh (HTM ComRef 16).
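The arithmetic behind this example can be sketched as follows. The function names are illustrative, and the expected-failure estimate assumes a constant failure rate, which the text does not state explicitly.

```python
from typing import Optional


def experience_base(units: int, years_per_unit: float) -> float:
    """Total experience in device-years."""
    return units * years_per_unit


def indicated_mtbf(device_years: float, failures: int) -> Optional[float]:
    """Indicated MTBF in years; None means 'undetermined' (no failures seen)."""
    if failures == 0:
        return None
    return device_years / failures


def expected_failures(device_years: float, true_mtbf_years: float) -> float:
    """Failures expected in the experience base, assuming a constant rate."""
    return device_years / true_mtbf_years


base = experience_base(units=3, years_per_unit=3)  # 9 device-years

print(indicated_mtbf(base, 0))      # None: undetermined
print(indicated_mtbf(base, 1))      # 9.0 years at best
print(expected_failures(base, 75))  # far fewer than one failure expected
```

With a true MTBF of 75 years, nine device-years would be expected to yield only about a tenth of a failure, so seeing zero failures says little about the device, and seeing one failure makes it look far worse than it may be.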

The bottom line is that many individual facilities will have difficulty generating enough failure data to get a good indication of each device’s true PM-related failure rate and, therefore, the device’s true level of PM-related safety. To get accurate measures of the true PM-related failure rate of PM Priority 1 devices, creating a pool of maintenance statistics containing a minimum number of device-years of experience for each manufacturer-model of each device type will be necessary.

The Task Force has selected 50 device-years as a reasonable benchmark for the minimum amount of maintenance-related failure data needed in the experience base to properly characterize the PM-related reliability of each particular device. Of course, more data is always better (see Table 4.2 Relationship between the strength of the evidence and the amount of maintenance data in the experience base, below - copied from Table 11).

Strength of the evidence     Amount of maintenance data in the experience base (in device-yrs)
Inadequate                   < 50
Good                         50-200
Very good                    200-500
Substantial                  > 500
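The ranges in Table 4.2 can be expressed as a small classifier. This is a sketch only: the function name is hypothetical, and because the table's ranges share their endpoints (200 and 500), the code assumes that a boundary value belongs to the lower category.

```python
def evidence_strength(device_years: float) -> str:
    """Rate the strength of the evidence from the size of the experience base."""
    if device_years < 0:
        raise ValueError("device_years cannot be negative")
    if device_years < 50:
        return "Inadequate"
    if device_years <= 200:
        return "Good"
    if device_years <= 500:
        return "Very good"
    return "Substantial"


# Hypothetical experience bases, in device-years.
for device_years in (9, 50, 350, 1200):
    print(device_years, evidence_strength(device_years))
```

The nine device-years from the heart-lung unit example above fall well short of the 50 device-year benchmark.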

4.6 Aggregating the data

We are continuing to appeal to the healthcare technology management (HTM) community to provide the Task Force with summaries of findings from the ongoing maintenance of devices that have been classified as potential PM Priority 1. To allow the findings to be properly aggregated, the maintenance, testing, and reporting should be performed in accordance with the following standardization guidelines (see also New Welcome Package for Data Aggregators):

  • For all potential PM Priority 1 device types, the maintenance entity must use a manufacturer-recommended PM procedure or one that includes, at minimum, all of the device restoration and safety verification tasks listed in the manufacturer’s procedure.
  • Although regulatory constraints exist, for the purpose of this project, it is not necessary for the maintenance entity to perform the PM tasks at the same interval as that recommended by the manufacturer. In the absence of regulatory mandates, diversity is welcome because one of the goals of the project is to compare levels of PM-related device reliability achieved at different maintenance intervals.
  • The maintenance entity must use some form of repair call coding similar to that described in Ridgway et al. (HTM ComRef 8) and in Section 1.4 of HTM ComDoc 1. This will allow a separate count of the failures that are judged to be PM preventable.
  • The maintenance entity must also use some form of coding for the PM findings similar to that described in Section 4.7 below. This will allow a separate count of the number of times that a hidden failure was detected (PM code F), as well as the number of times that a nondurable part was found to have deteriorated beyond the optimum restoration point (PM code 9).

4.7 Preferred system for coding repair calls and PM findings

Equipment systems fail for a variety of reasons, and it is important to recognize that only a few of these failures can be prevented by periodic maintenance. Ridgway et al. (HTM ComRef 8) point out that equipment failures can be classified into three general types depending on which part of the equipment system has failed. For a more detailed description of this repair call coding system, see Section 1.3 “What are the causes of medical device failures?” in HTM ComDoc 1.

For PM findings, the MPTF recommends the following codes:

  • PM code A (passed). Safety verification testing to detect hidden failures found the device to be in complete compliance with the relevant specifications, and any other functions tested were all within expectations.
  • PM code B (minor out-of-spec [OOS] condition[s] found). One or more of the tests revealed a slightly OOS condition. The purpose of this rating is to create a watch list to monitor for future adverse trends (particularly performance or safety failures), even though the discrepancy is not considered to be significant at present. A PM code B finding is considered a passing grade.
  • PM code F (failed). One or more of the tests found that one or more of the device’s performance or safety features was considerably OOS. This is a failing grade, and if this is a PM priority 1 device, it should be removed from service immediately.

The service person also should indicate (by circling one of four numbers [1, 5, 9, or 0]) whether the physical condition of any parts of the device that were restored (as called for in the procedure) were:

  • PM code 1 (still good/better than expected). Restored parts showed little or no deterioration.
  • PM code 5 (about as expected). Minor deterioration was observed, but it probably was not affecting the device’s function adversely.
  • PM code 9 (already worn out/serious physical deterioration). One or more of the restored parts was found to be considerably worse than expected. They were worn out and probably having an adverse effect on the device’s function.
  • PM code 0 (no physical restoration required). The device has no parts requiring physical restoration.

Systematically documenting these findings each time a PM is performed, and then aggregating the data, will make it possible to obtain two important pieces of information:

1) An indication of how well the PM interval matches the optimum. The optimum PM interval is when the parts being restored have deteriorated but not to the point where the deterioration has started to affect the functioning of the device. The indicators for how close the interval is to this optimum are as follows. A preponderance of:

  • PM code 1 findings (still very good) is an indicator that the interval is too short.
  • PM code 5 findings (about as expected) is an indicator that the interval is about right.
  • PM code 9 findings (already worn out) is an indicator that the interval is too long.

2) A numerical MTBF indicating the device’s level of PM-related reliability. This indicator is the lesser of the following MTBF values (representing the lower level of PM-related reliability):

  • The MTBF based on the total of any overt failures caused by inadequate device restoration (from the repair cause coding) and any PM code 9 findings (which are immediate precursors of the overt failures caused by inadequate restoration).
  • The MTBF based on the total of any hidden performance and safety degradations detected by the safety verification tasks (PM code F findings).
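The two indicators above can be sketched from aggregated counts. The function names and sample figures are hypothetical, and "preponderance" is interpreted here as the most frequently recorded condition code, which is an assumption rather than a Task Force definition.

```python
from collections import Counter
from typing import Iterable, Optional

INTERVAL_MESSAGES = {
    "1": "interval too short",
    "5": "interval about right",
    "9": "interval too long",
}


def interval_indicator(condition_codes: Iterable[str]) -> str:
    """Judge the PM interval from the most common physical-condition code.

    Code 0 (no restoration required) carries no interval information and is
    ignored. Ties go to whichever code was recorded first.
    """
    counts = Counter(c for c in condition_codes if c in INTERVAL_MESSAGES)
    if not counts:
        return "no restoration findings"
    most_common_code, _ = counts.most_common(1)[0]
    return INTERVAL_MESSAGES[most_common_code]


def _mtbf(device_years: float, events: int) -> Optional[float]:
    """MTBF in years, or None when no events were recorded."""
    return device_years / events if events > 0 else None


def pm_reliability_mtbf(device_years: float, overt_restoration_failures: int,
                        code_9_findings: int, code_f_findings: int) -> Optional[float]:
    """Return the lesser of the restoration-related and hidden-failure MTBFs."""
    restoration = _mtbf(device_years, overt_restoration_failures + code_9_findings)
    hidden = _mtbf(device_years, code_f_findings)
    candidates = [m for m in (restoration, hidden) if m is not None]
    return min(candidates) if candidates else None


# Hypothetical batch: 600 device-years, 1 overt failure, 2 code 9, 4 code F.
print(interval_indicator(["5", "5", "1", "9", "0"]))
print(pm_reliability_mtbf(600, 1, 2, 4))
```

In this hypothetical batch the restoration-related MTBF is 200 years and the hidden-failure MTBF is 150 years, so the lower value is reported as the device's PM-related reliability.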

4.8 Compiling data into organized batches

To streamline the reporting, the Task Force will be asking certain organizations to volunteer to act as data-aggregating intermediaries. Organizations that are candidates for this data aggregator role include independent service organizations, national or regional hospital systems with in-house maintenance services, and computerized maintenance management system companies.

For additional information, see Section 7.5 “Guidelines for compiling the data into organized batches” in HTM ComDoc 7 and the New Welcome Package for Data Aggregators.

4.9 Key database tables

The summary proof tables (Table 5) are the most important part of the community database. These are numbered as subsidiary tables grouped under Table 5. Each table catalogs PM-related failure rates calculated from aggregated maintenance data submitted for each of the potential PM priority 1 device types.

The tables display the accumulated data for each device and the MTBF corresponding to the PM-related failure rate. These data were derived by totaling the number of reported overt failures that were judged to be PM preventable and the number of PM code 9 findings (which are immediate precursors of overt failures) recorded during the reporting period.

Generally speaking, all devices will exhibit different levels of PM-related reliability, and an associated level of PM-related risk, when maintained at different intervals. Devices that exhibit an unacceptably high risk of an adverse outcome when they fail from a PM-preventable cause will usually exhibit a lower, more acceptable level of risk when the PM interval is reduced. After this information becomes available on the website, guessing at what would be a “safe” PM interval for any particular device will no longer be necessary. The answer will be apparent from the numbers in the summary proof tables (Table 5). In time, the results will show whether the manufacturers' recommendations result in a fairly consistent level of PM-related reliability or whether some appear to require adjustment (HTM ComRef 18).

Several of these issues, such as the thresholds for acceptability of the size of the experience base (Table 4.2) and what should be used as the acceptable values for PM-related reliability (Table 4.1), may require further deliberation from the MPTF.

4.10 Improving the efficiency of a medical equipment maintenance program

  • PM priority 1 devices with parts that the manufacturer indicates need periodic restoration.

These are potentially hazardous devices with either overt or hidden PM-preventable failures that could cause a life-threatening injury and that are demonstrating PM-related failure rates greater than the currently acceptable level (not more than one failure every 75 years). For these devices, it would be prudent to continue to follow the manufacturer-recommended PM procedure (for both the interval and the scope of the tasks) and to routinely monitor the levels of patient safety being achieved, as described in ... and HTM ComRef 35. This should be continued until acceptable evidence exists in the national database (Table 13) that some other procedure with more efficient tasks and/or a longer interval is found to demonstrate the same or better level of PM-related reliability or a comparable level of patient safety.

  • PM priority 1 devices with no parts that the manufacturer says need periodic restoration.

These are potentially hazardous devices with hidden PM-preventable failures capable of causing a life-threatening injury that are demonstrating PM-related failure rates greater than the currently acceptable level (not more than one failure every 75 years). For these devices, for which the only “maintenance” that the manufacturer recommends is periodic safety verification, it would be prudent to continue to follow the manufacturer-recommended safety verification testing schedule and routinely monitor the levels of patient safety being achieved, as described in ... and HTM ComRef 35, until evidence exists in the national database (Table 13) that testing at a longer interval results in the same or better level of PM-related reliability or a comparable level of patient safety.

When testing for possible hidden failures with potential high-severity outcomes, there is no optimum interval; shorter is always better. However, it has been shown (see ... ) that for safety verification–related (hidden) failures with MTBF values greater than about 50 years, increasing the testing interval from six months to as long as five years produces only a very small increase in the time that the patient would be exposed to potentially hazardous hidden failures.
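One way to get a feel for this trade-off is a rough model of hidden-failure exposure. The sketch below is not the analysis cited above. It assumes a constant failure rate and that every safety verification test detects and corrects any hidden failure. Under those assumptions, the average fraction of time a device spends in an undetected failed state is 1 - (1 - exp(-T/MTBF)) / (T/MTBF), where T is the testing interval. The function name is hypothetical.

```python
import math


def hidden_failure_exposure(mtbf_years: float, test_interval_years: float) -> float:
    """Average fraction of time spent in an undetected failed state.

    Assumes a constant failure rate (1 / MTBF) and perfect detection and
    correction at each test; these are modeling assumptions.
    """
    if mtbf_years <= 0 or test_interval_years <= 0:
        raise ValueError("both arguments must be positive")
    ratio = test_interval_years / mtbf_years
    return 1.0 - (1.0 - math.exp(-ratio)) / ratio


# Exposure for a 50-year MTBF at six-month and five-year testing intervals.
six_months = hidden_failure_exposure(50, 0.5)
five_years = hidden_failure_exposure(50, 5.0)
print(f"{six_months:.2%} vs {five_years:.2%}")
```

Under these assumptions, the exposure for a 50-year MTBF is roughly half a percent with six-month testing and a little under five percent with five-year testing, and it shrinks further as the MTBF grows.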

  • All PM priority 2–5 devices.

These lower PM-risk devices qualify for inclusion in an AEM program either because of the lower level of severity of the outcomes of potential failures or because they have demonstrated an acceptable level of PM-related reliability. Therefore, they can be maintained using a maintenance procedure or strategy other than that recommended by the manufacturer. They can be transitioned immediately to less stringent PM strategies, such as the cost-efficient light maintenance (run-to-failure) strategy, which is mentioned in Appendix A of the CMS memo (HTM ComRef 28). At the very least, the manufacturer-recommended procedures can be modified, such as by omitting electrical safety checks that the facility has found to be nonproductive or by extending the testing interval to make it coincide with a more convenient or more efficient routine.

The logical rule here is to explore the national database (Table 13) for evidence of more efficient maintenance procedures. It would be prudent to monitor the levels of patient safety (as described in ... and HTM ComRef 35) being achieved by the current procedure (or any of the more efficient procedures, if chosen) for devices categorized as PM priority 2 (moderate PM-risk) devices. Monitoring those in the lower risk categories is much less important but can be undertaken if the facility chooses.

  • All negligible or zero PM-risk devices.

If these devices should fail, there is negligible or zero additional risk to patient safety. Therefore, in the absence of other regulatory mandates, and unless there is a convincing case that periodic PM can be justified through lower maintenance costs, these devices are excellent candidates for the very efficient light maintenance (run-to-failure) strategy. It was by adopting this run-to-failure maintenance strategy in the early 1960s that the civil aviation industry was able to reduce its maintenance costs by 50% while, unexpectedly, also improving the reliability and safety statistics for civilian aircraft by a factor of 200.

4.11 Final Cautionary Note

Patient and staff safety has long been the primary justification in medical equipment maintenance programs for performing routine PM on the hospital’s frontline patient care equipment. Regular PM also has become a deeply rooted symbol of institutional caution and caring. After all, if the equipment doesn’t look well cared for, what does that imply about how well the organization takes care of its patients?

The intent of this effort on the part of the AAMI-sponsored Maintenance Practices Task Force is to address longstanding misunderstandings about how much regular PM contributes to keeping modern medical equipment safe. If this effort is accepted as a way to support a reduction in the amount of PM performed on low PM-risk equipment, we urge that careful thought be given to replacing those services with more efficient or less technically intensive alternative routines (e.g., department rounds) to ensure that clinical staff remain confident in the equipment and that it still looks well cared for and ready to do its job.

This page was last modified 19:43, 6 October 2018.