Voices for Safer Care

Insights from the Armstrong Institute

Health Care Shouldn’t Judge Itself by Flawed Tests

As standardized exam scores increasingly define success for students, teachers and schools, parents worry about the dangers of “teaching to the test” and of their children being judged by tests with low or unknown validity. We want our children to perform well on tests, of course, yet only if the tests measure something that students, parents and teachers believe really matters. We also want the education system to inspire students to develop into well-rounded people, not just skilled exam-takers.

In health care there is a similar danger of focusing on improving our “test scores” at the expense of real improvement in patient safety—and in this case, the exams have serious flaws. The federal government uses a composite measure of patient safety to help determine whether hospitals are penalized under two programs. One of those programs, the Hospital-Acquired Conditions Program, in December reduced Medicare reimbursements by 1 percent for 721 hospitals for their rates of preventable harms, such as serious blood clots, pressure ulcers, and accidental punctures and lacerations.

Serves them right, you might think. These hospitals are unnecessarily harming patients. That might be true if the test of their patient safety performance was scientifically sound. However, these programs have a serious methodological flaw: Many of their component measures are not based on reviews of the clinical record, but are rather derived from billing information, which produces a high rate of false positives. Indeed, for some of these measures, more than half of the incidents identified as preventable harm turn out to be false, once we review the clinical documentation. There can be many reasons for this. For instance, a patient may have had a pressure ulcer before admission that was not documented. Or a clot in a small vein might be mistakenly coded as a more serious clot known as a deep vein thrombosis.

Another reason for some hospitals to appear worse than others is that they actually look harder for preventable harm. For example, researchers found that publicly reported rates of blood clots were mainly a function of how aggressively hospitals screened patients for them. In other words, a hospital might look better if it did not screen patients, and yet that hospital might be sending patients home with undiagnosed and potentially fatal blood clots.
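To make the screening effect concrete, here is a minimal sketch using entirely hypothetical numbers. The function name, the rates and the detection probabilities are illustrative assumptions, not figures from the research mentioned above. Two hospitals have the same true rate of blood clots, but one screens far more of its patients and therefore reports more of them.

```python
def reported_clot_rate(true_rate, screening_rate,
                       detect_if_screened=0.9, detect_if_unscreened=0.2):
    """Expected fraction of patients with a *reported* clot.

    Every parameter is a hypothetical illustration:
      true_rate            - fraction of patients who actually develop a clot
      screening_rate       - fraction of patients the hospital screens
      detect_if_screened   - chance a real clot is found when the patient is screened
      detect_if_unscreened - chance a real clot is found without screening
    """
    detection = (screening_rate * detect_if_screened
                 + (1 - screening_rate) * detect_if_unscreened)
    return true_rate * detection


true_rate = 0.02  # assumed identical at both hospitals

aggressive = reported_clot_rate(true_rate, screening_rate=0.8)
minimal = reported_clot_rate(true_rate, screening_rate=0.1)

print(f"Aggressive screener: reports {aggressive:.2%}, misses {true_rate - aggressive:.2%}")
print(f"Minimal screener:    reports {minimal:.2%}, misses {true_rate - minimal:.2%}")
```

Under these assumptions the hospital that screens less reports the lower rate and looks safer on paper, even though it misses more of the clots its patients actually have.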

Despite these flaws, hospitals have no choice but to pay attention to their performance on these measures, or they risk losing a lot of revenue. In Maryland, hospitals have up to 9 percent of revenue at risk in pay for quality, when the average hospital margin is under 3 percent. This is big money to be at stake for suspect measures.

Hospitals with deep enough resources will devote them to improving their scores. For example, our hospital committed multiple full-time employees, plus the time of physicians, nurses and administrators, to raise our performance on these measures. This is highly technical work that frequently involves getting physicians to more thoroughly document their cases, so that medical billing staff can code more accurately. And we succeeded, reducing the frequency of these preventable harms by 40 percent, at least according to these measures. Yet what we improved was mostly coding, not caring. Well over 80 percent of the improvement was in how we document and code. In fewer than 20 percent of the cases did we identify a clinical improvement opportunity.
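A rough back-of-the-envelope reading of those figures can be sketched as follows. It rests on a simplifying assumption not stated above: that the share of the improvement attributable to clinical change can be multiplied directly against the measured reduction. The variable names are illustrative, and 80 and 20 percent are used as bounds because the text says "well over" and "fewer than."

```python
measured_reduction = 0.40  # reported drop in preventable harms on these measures
coding_share = 0.80        # lower bound: "well over 80 percent" was documentation and coding
clinical_share = 1 - coding_share  # upper bound on the share tied to clinical change

# Split the measured reduction into its two assumed components.
coding_reduction = measured_reduction * coding_share
clinical_reduction = measured_reduction * clinical_share

print(f"Measured reduction:           {measured_reduction:.0%}")
print(f"Attributable to coding:       at least {coding_reduction:.0%}")
print(f"Attributable to patient care: at most {clinical_reduction:.0%}")
```

Read this way, a 40 percent improvement on the scorecard would correspond to a reduction in actual harm of no more than about 8 percentage points.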

Hospitals and health care professionals want and need to do better, and yet measures whose validity is unknown or poor are not where we should focus our efforts. The public deserves real strides in preventing patient harm. Without valid measures, we will have difficulty engaging clinicians and hospitals in improvement work, providing financial incentives for truly high-quality care, or helping patients to use performance data to make choices about where to receive care.


Peter Pronovost

One of the world’s leading authorities on patient safety, Peter Pronovost served as the director of the Armstrong Institute, as well as senior vice president for patient safety and quality, at Johns Hopkins Medicine from 2011 until January 2018.

3 thoughts on “Health Care Shouldn’t Judge Itself by Flawed Tests”

  1. Your discussion highlights one of the well-known consequences of any performance measurement system - unintended consequences. In 2006, Peter Lindenauer highlighted a framework for classifying unintended consequences into "direct harm" and "indirect harm." Direct harm may occur when healthcare providers do inappropriate things to achieve higher performance rates on quality measures (for example, rapid delivery of antibiotics to patients in the emergency department who may have pneumonia before the diagnosis has been confirmed). Indirect harm, which probably occurs more commonly, is often characterized by the diversion of resources to those conditions or metrics for which there are payment incentives - teaching to the test. For example, until recently hospitals all over the country continued to devote substantial resources to collecting core process of care measure data on heart attack, pneumonia, and heart failure patient records even though there was little room for improvement. The measures may not have been flawed, but the resources required to collect and report data on the metrics were substantial and were not likely to result in much improvement in patient clinical outcomes.

    I am convinced that there can be unintended consequences of almost any performance measurement system when the focus of healthcare providers is driven primarily by the financial incentives related to the measure - not the clinical opportunities for improvement. An example was recently highlighted by the Healthcare Infection Control Practices Advisory Committee (HICPAC) about the likely unintended consequences of clinician veto and clinical adjudication panels in hospitals for determination of whether healthcare-associated infections should be submitted to the CDC's National Healthcare Safety Network (NHSN) now that they are being used for public reporting and payment penalties by CMS (Talbot, et al. Ann Intern Med. 2013;159:631-635.).

    There is little question that the accountability tied to performance measurement (public reporting, payment incentives or penalties) has accelerated improvement on a host of nationally standardized measures of quality. The National Quality Forum does consider the potential for unintended consequences during its consensus development process review of performance measures. While I agree that hospitals (and other healthcare providers) have no choice but to pay attention to these national performance metrics that are used for accountability, we need to keep the primary focus of our efforts on improving the clinical quality of care opportunities that are identified, and we need to remain active participants in the national discussions related to the development and implementation of performance measures that are used for accountability.

    1. Dale: Those are great points. We must be mindful of unintended consequences and find the balance between learning and judging. A key to achieving this is developing better measures -- those that truly assess important aspects of health care quality. If we have such measures, there will be little opportunity for gaming the system and they will provide timely feedback to clinicians.
      Thank you for your thoughtful reply.
