Hospital Quality Data: CMS Needs More Rigorous Methods to Ensure
Reliability of Publicly Released Data (31-JAN-06, GAO-06-54).
The Medicare Modernization Act of 2003 directed that hospitals
lose 0.4 percent of their Medicare payment update if they do not
submit clinical data for both Medicare and non-Medicare patients
needed to calculate hospital performance on 10 quality measures.
The Centers for Medicare & Medicaid Services (CMS) instituted the
Annual Payment Update (APU) program to collect these data from
hospitals and report their rates on the measures on its Hospital
Compare Web site. For hospital quality data to be useful to
patients and other users, they need to be reliable, that is,
accurate and complete. GAO was asked to (1) describe the
processes CMS uses to ensure the accuracy and completeness of
data submitted for the APU program, (2) analyze the results of
CMS's audit of the accuracy of data from the program's first two
calendar quarters, and (3) describe processes used by seven other
organizations that assess the accuracy and completeness of
clinical performance data.
-------------------------Indexing Terms-------------------------
REPORTNUM: GAO-06-54
ACCNO: A46078
TITLE: Hospital Quality Data: CMS Needs More Rigorous Methods to
Ensure Reliability of Publicly Released Data
DATE: 01/31/2006
SUBJECT: Data collection
Data integrity
Hospitals
Medical records
Performance measures
Quality assurance
Quality control
Reporting requirements
Statistical data
Program implementation
Annual Payment Update Program
******************************************************************
** This file contains an ASCII representation of the text of a **
** GAO Product. **
** **
** No attempt has been made to display graphic images, although **
** figure captions are reproduced. Tables are included, but **
** may not resemble those in the printed version. **
** **
** Please see the PDF (Portable Document Format) file, when **
** available, for a complete electronic file of the printed **
** document's contents. **
** **
******************************************************************
GAO-06-54
* Results in Brief
* Background
* Selection of Measures
* Collection, Submission, and Reporting of Quality Data
* Implementation of the APU Program
* Other Reporting Systems
* CMS Has Processes for Checking Data Accuracy but Has No Ongoing Process to Check Completeness
* CMS Checks Data Accuracy Electronically and Through an Independent Audit
* CMS Has No Ongoing Process to Ensure Completeness of Data Submitted for the APU Program
* Data Accuracy Baseline Was High Overall, but Statistically Uncertain for Many Hospitals, and Data Completeness Baseline Cannot Be Determined
* Baseline Level of Data Accuracy Was High Overall, and Large
* Passing the 80 Percent Threshold Is Statistically Uncertain
* No Data Were Available to Provide Baseline Assessment of Completeness
* Other Reporting Systems Use Various Methods to Ensure Data Accuracy and Completeness, Notably an Independent Audit
* Other Reporting Systems Use Various Methods to Check Data
* Other Reporting Systems Conduct Independent Audits
* Conclusions
* Recommendations for Executive Action
* Agency Comments
* GAO Contact
* Acknowledgments
* GAO's Mission
* Obtaining Copies of GAO Reports and Testimony
* Order by Mail or Phone
* To Report Fraud, Waste, and Abuse in Federal Programs
* Congressional Relations
* Public Affairs
Report to the Committee on Finance, U.S. Senate
United States Government Accountability Office
GAO
January 2006
HOSPITAL QUALITY DATA
CMS Needs More Rigorous Methods to Ensure Reliability of Publicly Released
Data
GAO-06-54
Contents
Letter
Results in Brief
Background
CMS Has Processes for Checking Data Accuracy but Has No Ongoing Process to
Check Completeness
Data Accuracy Baseline Was High Overall, but Statistically Uncertain for
Many Hospitals, and Data Completeness Baseline Cannot Be Determined
Other Reporting Systems Use Various Methods to Ensure Data Accuracy and
Completeness, Notably an Independent Audit
Conclusions
Recommendations for Executive Action
Agency Comments
Appendix I Scope and Methodology
Appendix II Other Reporting Systems
Appendix III Data Tables on Hospital Accuracy Scores
Appendix IV Comments from the Centers for Medicare & Medicaid Services
Appendix V GAO Contact and Staff Acknowledgments
Tables
Table 1: HQA Hospital Quality Measures
Table 2: Percentage and Number of Hospitals Whose Baseline Accuracy Score
Met or Fell Below the 80 Percent Threshold, by Measure Set and Quarter
Table 3: Background Information on CMS and Other Reporting Systems
Table 4: Processes Used by CMS and Other Reporting Systems to Ensure Data
Accuracy
Table 5: Processes Used by CMS and Other Reporting Systems to Ensure Data
Completeness
Table 6: Median Hospital Baseline Accuracy Scores, by Hospital
Characteristic, Quarter, and Measure Set
Table 7: Proportion of Hospitals with Baseline Accuracy Scores Not Meeting
80 Percent Threshold, by Hospital Characteristic, Quarter, and Measure Set
Table 8: Percentage of Hospitals with Baseline Accuracy Scores Not Meeting
80 Percent Threshold, by JCAHO-Certified Vendor Grouped by Number of
Hospitals Served, Quarter, and Measure Set
Table 9: Breadth of Confidence Intervals in Percentage Points Around the
Hospital Baseline Accuracy Scores at Selected Percentiles, by Measure Set
and Quarter
Table 10: For Hospitals with Confidence Intervals That Included the 80
Percent Threshold, Percentage of Total Hospitals with an Actual Baseline
Accuracy Score That Either Met or Failed to Meet the Threshold, by Measure
Set and Quarter
Figures
Figure 1: Approximate Times for Collection, Submission, and Reporting of
Hospital Quality Data
Figure 2: Baseline Hospital Accuracy Scores at Selected Percentiles, by
Measure Set and Quarter
Figure 3: Percentage of Hospitals Whose Baseline Accuracy Score Confidence
Intervals Clearly Exceed, Fall Below, or Include the 80 Percent Threshold,
by Measure Set and Quarter
Abbreviations
ACC American College of Cardiology
AMI acute myocardial infarction
APU program Annual Payment Update program
CABG coronary artery bypass grafting
CAP community-acquired pneumonia
CDAC Clinical Data Abstraction Center
CMS Centers for Medicare & Medicaid Services
DAVE Data Assessment and Verification Project
HF heart failure
HQA Hospital Quality Alliance
IFMC Iowa Foundation for Medical Care
JCAHO Joint Commission on Accreditation of Healthcare Organizations
MDS Minimum Data Set
MEDPAR Medicare Provider Analysis and Review
MMA Medicare Prescription Drug, Improvement, and Modernization Act
MSA metropolitan statistical area
NCQA National Committee for Quality Assurance
PCI percutaneous coronary intervention
PTCA percutaneous transluminal coronary angioplasty
QIO quality improvement organization
SPARCS Statewide Planning and Research Cooperative System
SSA Social Security Administration
STS Society of Thoracic Surgeons
This is a work of the U.S. government and is not subject to copyright
protection in the United States. It may be reproduced and distributed in
its entirety without further permission from GAO. However, because this
work may contain copyrighted images or other material, permission from the
copyright holder may be necessary if you wish to reproduce this material
separately.
United States Government Accountability Office
Washington, DC 20548
January 31, 2006
The Honorable Charles E. Grassley
Chairman
The Honorable Max Baucus
Ranking Minority Member
Committee on Finance
United States Senate
The Medicare Prescription Drug, Improvement, and Modernization Act (MMA)
of 2003 created a financial incentive for hospitals to submit data to
provide information about their quality of care that could be publicly
reported.1 Under section 501(b) of MMA, acute care hospitals are to submit
the clinical data from the medical records of all Medicare and
non-Medicare patients needed to calculate hospitals' performance on 10
quality measures. If a hospital chooses not to submit the data, it will
lose 0.4 percent of its annual payment update from Medicare for a
subsequent fiscal year.2 The Centers for Medicare & Medicaid Services
(CMS) established the Annual Payment Update program (APU program)3 to
implement this provision of MMA. Participating hospitals submit quality
data that are used to calculate a hospital's performance on the measures
quarterly,4 according to a schedule defined by CMS. MMA affects hospital
annual payment updates for fiscal year 2005 through fiscal year 2007.5 For
fiscal year 2005, the first year of the program, CMS based its annual
payment update on quality data submitted by hospitals for patients
discharged between January 1, 2004, and March 31, 2004.
1Pub. L. No. 108-173, § 501(b), 117 Stat. 2066, 2289-90 (amending section
1886(b)(3)(B) of the Social Security Act, to be codified at 42 U.S.C. §
1395ww(b)(3)(B)).
2The reduction in the annual payment update applies to hospitals paid
under Medicare's inpatient prospective payment system. Critical access,
children's, rehabilitation, psychiatric, and long-term-care hospitals may
elect to submit data for any of the measures, but they are not subject to
a reduction in their payment if they choose not to submit data.
3Throughout this report, we refer to CMS's Reporting Hospital Quality Data
for the Annual Payment Update program as the "APU program".
4Throughout this report, we refer to the clinical data submitted by
hospitals that are used to calculate their performance on the measures as
"quality data".
5Senate Bill 1932 would extend the APU program indefinitely. It would also
increase the penalty for not submitting data to 2 percent and provide for
the Secretary to establish additional measures, beyond the original 10,
for payment purposes.
Under MMA, the 10 quality measures for which hospitals report data are
those established by the Secretary of Health and Human Services as of
November 1, 2003. The measures cover three conditions: heart attack, heart
failure, and pneumonia. Over 3 million patients were admitted to acute
care hospitals in 2002 with these three conditions, representing
approximately 10 percent of total acute care hospital admissions. For
patients over 65, acute care hospital admissions for the three conditions
represented approximately 16 percent of total admissions.
The collection of quality data on the 10 measures is part of a larger
initiative to provide useful and valid information about hospital quality
to the public.6 In April 2005, CMS launched a Web site called "Hospital
Compare" to convey information on these and other hospital quality
measures to consumers. CMS is introducing additional measures,7 and public
reporting of hospital quality measures is expected to continue. Hospitals
may submit quality data on additional
measures for the APU program, but CMS bases any reduction in the annual
payment update on the 10 measures referenced in the MMA. In addition to
this effort, other public and private organizations also administer
reporting systems in which clinical data are collected and may be released
to the public.
In order for publicly released information on the hospital quality
measures to be useful to patients, payers, health professionals, health
care organizations, regulators, and other users, the quality data used to
calculate a hospital's performance on the measures need to be reliable,
that is, both accurate and complete. If a hospital submits complete data,
that is, data on all the cases that meet the specific inclusion criteria
for eligible patients, but the data are not collected, or abstracted, from
the patients' medical records accurately, the data will not be reliable.
Similarly, if a hospital submits accurate data, but those data are
incomplete because the hospital leaves out eligible cases, the data will
not be reliable. Data that are not reliable may present a risk to people
making decisions based on the data, such as a patient choosing a hospital
for treatment. The program's initial, or baseline, data could describe
data reliability at the start of the program and provide a reference point
for any subsequent assessments.
6According to the Secretary of Health and Human Services, the effort is
also intended to provide hospitals with a sense of predictability about
public reporting expectations, to standardize data and data collection
mechanisms, and to foster hospital quality improvement, in addition to
providing information on hospital quality to the public.
7For example, CMS plans to publicly report on the Hospital Compare Web
site measures of patient perspectives on seven aspects of hospital care,
with national implementation scheduled for 2006.
You asked us to provide information on the reliability of publicly
reported information on hospital quality obtained through the APU program.
In this report, we (1) describe the processes CMS uses to ensure that the
quality data submitted by hospitals for the APU program are accurate and
complete and any plans by CMS to modify its processes; (2) determine the
baseline levels of accuracy and completeness for the data for patients
discharged from January 2004 through June 2004, the first two quarters of
data submitted by hospitals under the APU program; and (3) describe the
processes used by seven other organizations that collect clinical
performance data to assess the accuracy and completeness of quality data
for selected reporting systems.
In addressing these objectives, we collected information through
interviews, examination of documents, and data analysis. To describe CMS's
processes for ensuring the accuracy and completeness of the quality data
for the APU program, we interviewed program officials from CMS and its
contractors,8 hospital associations, quality improvement organizations
(QIO), and hospital data vendors.9 In addition, we examined both publicly
available and internal documents from CMS and its contractors. To
determine the baseline accuracy and completeness of data submitted for the
APU program, we drew on available information collected by CMS. In
particular, we analyzed the accuracy of the quality data based on the
reabstraction of patient medical records performed by CMS's Clinical Data
Abstraction Center (CDAC).10 The reabstraction results available at the
time we conducted our analyses pertained to hospital discharges that took
place from January 1, 2004, through June 30, 2004.11 We extracted
additional information about hospitals from the Medicare Provider of
Services database, including the number of Medicare-certified beds and
urban or rural location. After examining the CDAC data and reviewing the
procedures that CMS has put in place to conduct the reabstraction process,
we determined that the data were sufficiently reliable to use in
estimating the baseline level of accuracy characterizing the quality data
submitted by hospitals for those two calendar quarters. Regarding data on
completeness of the quality data, we interviewed CMS officials and
contractors and examined related documents. To examine the methods used by
other reporting systems12 to assess data completeness and accuracy, we
conducted structured interviews with officials from seven organizations,13
including government agencies, that administer such systems. We focused on
reporting systems that collect clinical rather than administrative data.
We selected a mix of systems, in terms of public or private sponsorship,
types of providers assessed, and medical conditions covered, to ensure
variety. We also spoke with individual health professionals with expert
knowledge in the field of hospital quality assessment.
8CMS's contractors for this program are the Iowa Foundation for Medical
Care (IFMC) and DynKePRO, LLC. IFMC is the quality improvement
organization (QIO) for the state of Iowa. (QIOs are independent
organizations that work under contract to CMS to monitor quality of care
for the Medicare program and help providers to improve their clinical
practices.) Under a separate contract, IFMC operates the national database
for hospital quality data known as the QIO clinical warehouse. DynKePRO,
LLC, an independent medical auditing firm, operates CMS's Clinical Data
Abstraction Center (CDAC), which assesses the accuracy of hospital data
submissions.
9Some hospitals contract with data vendors to electronically process,
analyze, and transmit patient information.
Our analysis of the level of accuracy and completeness of the quality data
is based on the procedures developed by CMS to validate the data
submitted; we have not independently compared the data submitted by
hospitals to the original patient clinical records. In addition, we did
not assess the performance of hospitals with respect to the quality
measures themselves (which show how often the hospitals provided a
specified service or treatment when appropriate). We conducted our work
from November 2004 through January 2006 in accordance with generally
accepted government auditing standards. For more details on our scope and
methodology, see appendix I.
10Reabstraction is the re-collection of clinical data for the purpose of
assessing the accuracy of hospital abstractions. In the APU program, CDAC
compares data originally submitted by the hospitals to those it has
reabstracted from the same medical records.
11These were the calendar quarters for which, at the time we conducted our
analysis, hospitals had collected the data and CMS had completed its
process for reabstracting and assessing the data. We analyzed data for all
hospitals affected by section 501(b) of MMA, which were located in 49
states and the District of Columbia. Hospitals in Maryland and Puerto Rico
were excluded because they are paid under different payment systems than
other acute care hospitals.
12Throughout this report, we refer to this group of quality data reporting
systems, each of which collects some type of clinical performance data
from designated providers or health plans, as "other reporting systems".
13The seven organizations were the American College of Cardiology, the
California Office of Statewide Health Planning and Development, CMS (the
units responsible for monitoring nursing home care regarding the Data
Assessment and Verification Project contract), the Joint Commission on
Accreditation of Healthcare Organizations (JCAHO), the National Committee
for Quality Assurance, the New York State Department of Health, and the
Society of Thoracic Surgeons.
Results in Brief
CMS has processes for ensuring the accuracy of the quality data submitted
by hospitals for the APU program, but has no ongoing process for assessing
the completeness of those data. To check accuracy, one CMS contractor
electronically checks the data as they are submitted to the clinical
warehouse, and another operates CMS's CDAC, which conducts an independent
audit by sampling five patient record abstractions from all the quality
data submitted by each hospital in a quarter. CDAC then compares the
quality data originally collected by the hospital from the medical records
for those five patients to the quality data it has reabstracted from the
same medical records. The data are deemed to be accurate if there is 80
percent or greater agreement between these two sets of results. CMS did
not require hospitals to meet the 80 percent threshold for the 10 APU
measures to receive their full annual payment update for fiscal year 2005.
However, for fiscal year 2006, CMS reduced the payment update by 0.4
percentage points for hospitals whose data on the APU measures did not
meet the 80 percent threshold. To assess completeness, CMS has twice compared
the number of cases submitted by each hospital for the APU program for a
given period to the number of claims each hospital submitted to Medicare,
once for the fiscal year 2005 update and once for the fiscal year 2006
update. However, these analyses did not address non-Medicare patient
records, and the approach that CMS took in these analyses was not capable
of detecting incomplete data for all hospitals. For example, to determine
which hospitals could receive the full fiscal year 2006 update, CMS
limited its analysis to hospitals that submitted no patient data at all to
the clinical warehouse in a given quarter. CMS has not put in place an
ongoing process for checking the completeness of the data that hospitals
submit for the APU program that would provide accurate and consistent
information for all patients and all hospitals. Nor has CMS required
hospitals to certify that they submitted data for all eligible patients or
a representative sample thereof.
We could determine a baseline level of accuracy for the quality data
submitted by hospitals for the APU program but not a baseline level of
completeness. We found a high overall baseline level of accuracy when we
examined CMS's assessment of the data from the first two calendar quarters
of 2004. Overall, the median accuracy score exceeded 90 percent, which was
well above the 80 percent accuracy threshold set by CMS, and about 90
percent of hospitals met or exceeded that threshold for both the first and
the second calendar quarters of 2004. For most hospitals whose accuracy
score was well above the threshold, the results based on the reabstraction
of five cases were statistically certain. However, for approximately
one-fourth to one-third of all the hospitals that CMS assessed for
accuracy, the statistical margin of error for their accuracy score
included both passing and failing accuracy levels. Consequently, for these
hospitals, the five cases that CMS examined were not sufficient to
establish with statistical certainty whether the hospital met the
threshold level of data accuracy. Accuracy did not vary between rural and
urban hospitals, and small hospitals provided data as accurate as those
from larger hospitals. The completeness baseline could not be determined
because CMS did not assess the extent to which all hospitals submitted
data on all eligible patients, or a representative sample thereof, for the
first two calendar quarters of 2004, and consequently there were no data
from which to derive such an assessment.
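To illustrate why five reabstracted cases can leave passing statistically
uncertain, the following sketch computes an approximate confidence interval
around an accuracy score. It is illustrative only: it treats the roughly
100 data elements drawn from five records as independent observations,
which overstates precision, and the report does not specify the exact
interval method used.

    import math

    def accuracy_confidence_interval(score, n_elements=100, z=1.96):
        # Approximate 95 percent normal-approximation interval, in
        # percentage points, around an observed accuracy score.
        p = score / 100.0
        half_width = z * math.sqrt(p * (1 - p) / n_elements) * 100
        return score - half_width, score + half_width

    # A hospital scoring 85 percent on about 100 elements gets an
    # interval of roughly 78 to 92 percent, which straddles the 80
    # percent threshold: passing is statistically uncertain.
    low, high = accuracy_confidence_interval(85.0)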
Other reporting systems that collect clinical performance data have
adopted various methods to ensure data accuracy and completeness. Some of
these methods are used by all of these other reporting systems, such as
checking the data electronically to identify missing data. Officials from
some of the other systems and an expert in the field stressed the
importance of including an independent audit in the methods used by
organizations to check data accuracy and completeness. Most other
reporting systems that conduct independent audits incorporate three
methods as part of their process that CMS does not use in its independent
audit. Specifically, most include an on-site visit, focus their audits on
a selected number of facilities or reporting entities, and review a
minimum of 50 patient medical records per reporting entity during the
audit.
In order for CMS to ensure that the hospital quality data are accurate and
complete, we recommend that the CMS Administrator increase the number of
patient records reabstracted by CDAC, focusing on the subset of hospitals
for which it is statistically uncertain whether they met CMS's accuracy
threshold in one or more previous quarters. We further recommend that CMS
require hospitals to certify that they took steps to ensure that they
submitted data on all eligible patients, or a representative sample
thereof, and that the agency assess the level of incomplete data submitted
by hospitals for the APU program to determine the magnitude of
underreporting, if any, in order to refine how completeness assessments
may be done in future reporting efforts. In commenting on a draft of this
report, CMS agreed to implement steps to improve the quality and
completeness of the data.
Background
Medicare spends over $136 billion annually on inpatient hospital care for
its beneficiaries. To help ensure the quality of the care it purchases
through Medicare, CMS launched the Hospital Quality Initiative in 2003.
This initiative aims to refine and standardize hospital data, data
transmission, and performance measures as part of an effort to stimulate
and support significant improvement in the quality of hospital care.
One component of this broader initiative is CMS's participation in the
Hospital Quality Alliance (HQA), a public-private collaboration that seeks
to make hospital performance information more accessible to the public,
payers, and providers of care.14 Before the enactment of MMA, HQA had
organized a voluntary program for hospitals to submit data on quality of
care measures intended for public reporting. For its part as a participant
in HQA, CMS set up a central database to receive the data submitted by
hospitals and initiated plans for a Web site to post information on
hospital quality of care measures. Thus, CMS had a data collection
infrastructure in place when MMA established the financial incentive for
hospitals to submit quality data.
Selection of Measures
The 10 measures chosen by the Secretary of Health and Human Services for
the APU program are the original 10 measures that were adopted by HQA. HQA
subsequently adopted additional measures that relate to the same three
conditions-heart attacks, heart failure, and pneumonia-and others that
relate to surgical infection prevention. (See table 1 for a listing of the
APU-measure set and the expanded-measure set.15) Hospitals participating
in HQA were encouraged to submit data on the additional measures, but data
submitted on the additional measures did not affect whether a hospital
received its full payment update under the APU program. CMS and the QIOs
have tested these measures for validity and reliability, and all measures
have been endorsed by the National Quality Forum, which fosters agreement
on national standards for measurement and public reporting of health care
performance data.16
14HQA (formerly called the National Voluntary Hospital Reporting
Initiative) was initiated by the American Hospital Association, the
Federation of American Hospitals, and the Association of American Medical
Colleges. It is supported by CMS, as well as the Joint Commission on
Accreditation of Healthcare Organizations, National Quality Forum,
American Medical Association, Consumer-Purchaser Disclosure Project, AARP,
AFL-CIO, and Agency for Healthcare Research and Quality. Its aim is to
provide a single standard quality measure set for hospitals to support
public reporting and pay-for-performance efforts.
15Throughout this report, we refer to the 10 measures on which reductions
in the annual payment update are based as the "APU-measure set" and to the
combination of those 10 with the additional measures adopted by HQA as the
"expanded-measure set". HQA added 7 measures for discharges beginning
April 1, 2004, and another 5 measures for discharges beginning July 1,
2004, for a total of 22 measures on which hospitals may currently submit
data. Thus, the expanded-measure set includes different numbers of
measures for different quarters of data.
16The National Quality Forum is a voluntary standard-setting,
consensus-building organization representing providers, consumers,
purchasers, and researchers.
Table 1: HQA Hospital Quality Measures

APU-measure set (for discharges beginning January 1, 2004)

Heart attack:
1. Aspirin at arrival
2. Aspirin prescribed at discharge
3. ACE (angiotensin-converting enzyme) inhibitor for left ventricular
   systolic dysfunction
4. Beta blocker at arrival
5. Beta blocker prescribed at discharge

Heart failure:
6. Left ventricular function assessment
7. ACE inhibitor for left ventricular systolic dysfunction

Pneumonia:
8. Initial antibiotic received within 4 hours of hospital arrival
9. Oxygenation assessment
10. Pneumococcal vaccination status

Surgical infection prevention: (none)

Expanded-measure set (for discharges beginning April 1, 2004)

Heart attack: measures 1-5 above, plus
11. Thrombolytic agent received within 30 minutes of hospital arrival
12. PTCA (percutaneous transluminal coronary angioplasty) received
    within 90 minutes of hospital arrival
13. Adult smoking cessation advice/counseling

Heart failure: measures 6-7 above, plus
14. Discharge instructions
15. Adult smoking cessation advice/counseling

Pneumonia: measures 8-10 above, plus
16. Blood culture performed before first antibiotic received in hospital
17. Adult smoking cessation advice/counseling

Surgical infection prevention: (none)

Expanded-measure set (for discharges beginning July 1, 2004)

Heart attack: measures 1-5 and 11-13 above

Heart failure: measures 6-7 and 14-15 above

Pneumonia: measures 8-10 and 16-17 above, plus
18. Initial antibiotic selection for CAP (community-acquired pneumonia)
    in immunocompetent patient
19. Influenza vaccination (a)

Surgical infection prevention:
20. Prophylactic antibiotic received within 1 hour prior to surgical
    incision
21. Prophylactic antibiotic selection for surgical patients (a)
22. Prophylactic antibiotics discontinued within 24 hours after
    surgery end

Source: CMS, as of August 4, 2005.
Note: Measures are worded as CMS posted them on www.qnetexchange.org.
(a) Hospitals are collecting data for these measures, but public reporting
of hospital performance on these measures has been postponed.
To minimize the APU program's data collection burden on hospitals, CMS and
the Joint Commission on Accreditation of Healthcare Organizations (JCAHO)
have worked to align their procedures and protocols for collecting and
reporting the specific clinical information that is used to score
hospitals on the measures. JCAHO-accredited hospitals (approximately 82
percent of hospitals that participate in Medicare) have since 2002
submitted data to JCAHO on the same measures as those in the APU-measure
set, as well as many of those in the expanded-measure set. Beginning with
the first calendar quarter of data submitted by hospitals for the APU
program, hospitals had the option of submitting the same data to CMS that
many of them were already collecting for JCAHO. In November 2004, CMS and
JCAHO jointly issued a manual laying out the aligned procedures and
protocols for discharges beginning January 1, 2005.
Collection, Submission, and Reporting of Quality Data
Hospitals use CMS's definition of the eligible patient population to
identify the patients for whom they should collect and submit quality data
for each measure. The definition is based on the primary diagnosis and,
for the two cardiac conditions, the age of the patient.17 Specifically,
hospitals use diagnostic codes and demographic information from the
patients' medical and administrative records to determine eligibility
based on protocols established by CMS.
Once the eligible patients have been identified, hospitals extract from
their patients' medical records the specific data items needed for the
Iowa Foundation for Medical Care (IFMC) to calculate a hospital's
performance, following detailed data abstraction guidelines developed by
CMS. Hospitals may submit data for all eligible patients for a given
condition, or if they have more than a specified number of eligible
patients, they may draw a random sample according to a formula,18 and
submit data for those patients only. These data are put into a
standardized data format and submitted quarterly through a secure Internet
connection to the QIO "clinical warehouse" administered by IFMC. IFMC
accepts into the clinical warehouse only the data that meet the formatting
and other specifications established by CMS19 and that are submitted
before the specified deadline for that quarter. About 80 percent of
hospitals rely on data vendors (which typically are collecting the same
data for JCAHO) to submit the data for them.
17Patients under 18 years of age are excluded from the eligible patient
population for the two cardiac conditions.
18Before hospitals can consider sampling, rather than submitting all of
their eligible cases, the number of eligible cases must exceed a minimum
sample size that ranges from 60 per quarter for pneumonia cases to 76 for
heart failure cases and 78 for heart attack cases. Once hospitals reach
that threshold for a given condition, they can submit a random sample of
their cases as long as the minimum sample size is met and it includes at
least 20 percent of their eligible cases, up to a maximum sample size
requirement of 241 for pneumonia, 304 for heart failure, and 311 for heart
attacks. For discharges that occurred prior to January 1, 2005, CMS
applied a different formula to hospitals not accredited by JCAHO that
called for a minimum sample size of 7 for each of the three conditions and
a sampling rate of at least 20 percent until a maximum sample size
requirement of 70 cases was reached.
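As a minimal sketch of how the current sampling parameters combine for a
given condition and quarter (assuming the 20 percent rate is applied to
the eligible-case count and rounded up, which the report does not
specify; the names are illustrative):

    import math

    # Quarterly sampling parameters per condition, as described in the
    # note above.
    SAMPLING_RULES = {
        "pneumonia":     {"minimum": 60, "maximum": 241},
        "heart failure": {"minimum": 76, "maximum": 304},
        "heart attack":  {"minimum": 78, "maximum": 311},
    }

    def required_sample_size(condition, eligible_cases):
        # Smallest number of cases a hospital must submit for a quarter.
        rule = SAMPLING_RULES[condition]
        if eligible_cases <= rule["minimum"]:
            # Too few cases to sample: submit every eligible case.
            return eligible_cases
        # The sample must meet the minimum size and cover at least 20
        # percent of eligible cases, but need not exceed the maximum.
        required = max(rule["minimum"], math.ceil(0.20 * eligible_cases))
        return min(required, rule["maximum"])

    # A hospital with 500 eligible pneumonia cases must submit at least
    # max(60, 100) = 100 cases; with 2,000 cases the 241-case cap applies.
    assert required_sample_size("pneumonia", 500) == 100
    assert required_sample_size("pneumonia", 2000) == 241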
IFMC aggregates the information from the individual patient records to
generate a rate for each hospital on each of the measures for which the
hospital submitted relevant clinical data. These rates show how often a
hospital provided the specific service or activity designated in the
measures to patients for whom that service or activity was appropriate.
Hospitals also collect information on each patient that identifies
patients for whom the particular service or activity would not be called
for, such as patients with a condition that would make prescribing aspirin
or beta blockers medically inappropriate.
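A minimal sketch of the per-measure rate computation just described, with
illustrative names (the report does not give IFMC's exact formula):
patients for whom the service was not called for are removed from the
denominator.

    def measure_rate(received, eligible, excluded):
        # received: submitted cases where the designated service or
        #   activity was provided
        # eligible: all submitted cases for the measure's condition
        # excluded: cases where the service would be medically
        #   inappropriate or otherwise not called for
        denominator = eligible - excluded
        return None if denominator == 0 else 100.0 * received / denominator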
CMS posts on its Hospital Compare Web site each hospital's rates for all
the APU and expanded measures for which it submitted data.20 In November
2004, CMS first posted these rates, based on data from the first quarter
of calendar year 2004. It subsequently posted new rates in March 2005,
based on the first two quarters of calendar year 2004 data, and again in
September and December 2005 with additional quarters of data. CMS
continues to update these rates quarterly, using the four most recent
quarters of data available. There can be up to a 14-month time lag between
when patients are treated by the hospital and when the resulting rates are
posted on the CMS Web site. (See fig. 1.)
19IFMC statistics show that a majority of hospitals ultimately succeed in
gaining acceptance for all the cases they have submitted and that less
than 10 percent of hospitals have had more than 5 percent of their cases
rejected in a given quarter.
20For two measures, influenza vaccination and prophylactic antibiotic
selection for surgical patients, CMS has postponed public reporting.
Figure 1: Approximate Times for Collection, Submission, and Reporting of
Hospital Quality Data
(a) CMS had to make its determination of hospital eligibility for the
fiscal year 2005 annual payment update approximately 1 month after
hospitals submitted their data for the first quarter.
Implementation of the APU Program
In implementing the APU program, CMS uses the same policies and procedures
for collecting and submitting quality data as are used for HQA. For the
first annual payment update determined by the APU program, which applied
to fiscal year 2005, hospitals were required to begin submitting data by
July 1, 2004, for the patients discharged during the first calendar
quarter of 2004 (January through March 2004). Data were received from
3,839 hospitals, over 98 percent of those affected by the MMA provision.
These figures include 150 hospitals that certified to CMS that they had no
eligible patients with the three conditions during the first calendar
quarter of 2004. Hospitals that have no eligible patients are not
penalized and receive the full annual payment update. For the second
annual payment update determined by the APU program, which applied to
fiscal year 2006, participating hospitals were required to continue to
submit data in accordance with the quarterly deadlines set by CMS. Failure
to meet the requirements of the program and qualify for the full annual
payment update in one year does not affect a hospital's ability to
participate in and qualify for the full update in the succeeding year.
CMS has assigned primary responsibility to the 53 QIOs to inform hospitals
about the APU program's requirements and to provide technical assistance
to hospitals in meeting those requirements. This includes assistance to
hospitals in submitting their data to the clinical warehouse provided by
IFMC.
Other Reporting Systems
There are several organizations that administer reporting systems that
collect clinical data, some of which also release their data to the
public. Some of these organizations are in the public sector, such as
state health departments, and some are in the private sector, such as
accreditation bodies. Several of these systems have been in existence for
a number of years, including one for as long as 16 years. Hospitals,
health plans, nursing homes, and other external organizations submit data
to these systems on a range of medical conditions, which for most of these
systems includes at least one cardiac condition (e.g., percutaneous
coronary intervention, coronary artery bypass grafting, heart attack,
heart failure). Many of these systems make the results of the data they
have collected available for public use. For example, one public
organization has been collecting individual, patient-level data on cardiac
surgeries from hospitals for the past 16 years and creates reports based
on the data collected, which it subsequently posts on its Web site.
Additionally, data collected by these reporting systems can also be used
for quality improvement efforts and to track performance over time. (For
more background information on other reporting systems, see app. II, table
3.)
CMS Has Processes for Checking Data Accuracy but Has No Ongoing Process to Check
Completeness
CMS has processes for ensuring the accuracy of the quality data submitted
by hospitals for the APU program, but has no ongoing process to assess
whether hospitals are submitting complete data. To check accuracy, IFMC, a
CMS contractor, electronically checks the data as they are submitted to
the clinical warehouse. In addition, CDAC independently audits the data
submitted by hospitals. Specifically, it reabstracts the quality data from
medical records for a sample of five patients per quarter for each
hospital and compares its results to the quality data submitted by
hospitals. The data are deemed to be accurate if there is 80 percent or
greater agreement between these two sets of results, a standard that
hospitals had to meet for the APU-measure set to qualify for their full
annual payment update for fiscal year 2006. To check completeness, CMS has
twice compared the number of cases submitted by each hospital for the APU
program for a given period to the number of claims the hospital submitted
to Medicare, once for the fiscal year 2005 update and once for the fiscal
year 2006 update. However, these analyses did not address non-Medicare
patient records and the approach that CMS took in these analyses was not
capable of detecting incomplete data for all hospitals. CMS has not put in
place an ongoing process for checking the completeness of the data that
hospitals submit for the APU program that would provide accurate and
consistent information for all patients and all hospitals. Moreover, CMS
has not required hospitals to certify that they submitted data for all
eligible patients or a representative sample thereof.
CMS Checks Data Accuracy Electronically and Through an Independent Audit
CMS employs two processes to check and ensure the accuracy of the quality
data submitted by hospitals for the APU program. First, at the time that
data are submitted to the clinical warehouse, IFMC, a CMS contractor,
electronically checks the data for inconsistencies and missing values. The
results are shared with hospitals. After the allotted time for review and
correction of the submissions, no more data or corrections may be
submitted by hospitals for that quarter. These checks are done whether the
hospital submits its data directly to the warehouse or through a data
vendor.
Second, CDAC conducts quarterly independent audits to verify that the data
submitted by hospitals to the clinical warehouse accurately reflect the
information in their patients' medical records.21 From among all the
patient records submitted to the clinical warehouse each quarter, CMS
randomly selects for CDAC's reabstraction five patient records from each
participating hospital.22 CDAC sends a request for these patients' medical
records to the hospitals, and they send photocopies of the records to CDAC
for reabstraction. A CDAC abstractor reviews the medical record,
determines if or when a specific action occurred-such as the time when a
patient arrived at the hospital-and records that data field accordingly.
Once the CDAC reabstraction is complete, the response previously entered
into that field by the hospital is compared to that entered by the CDAC
abstractor, and CDAC notes whether the two responses match. If they do not
match, a second CDAC abstractor reviews the medical record to make a final
determination. The results of the CDAC reabstraction are sent to the
clinical warehouse, where the individual data matches and mismatches are
summed to produce an accuracy score for each hospital. The accuracy score
represents the overall percentage of agreement between data submitted by
the hospital and data reabstracted by CDAC across all five cases.23 It is
based on all the APU and expanded measures for which the hospital
submitted data.24 The score, along with information from CDAC on where the
mismatches occurred and why, is shared with the hospital and the
hospital's local QIO. CMS considers hospitals achieving an accuracy score
of 80 percent or better to have provided accurate data. Hospitals with
accuracy scores below 80 percent have the opportunity to appeal their
reabstraction results.25
21DynKePRO, LLC, has operated CDAC since 1994. For 10 years it shared this
function with a second firm, but in September 2004 DynKePRO negotiated a
new contract with CMS that made it the sole CDAC contractor. In April
2005, DynKePRO became CSC York.
22To be included in the reabstraction process, hospitals must have
submitted data on at least six patients across all three conditions in
that quarter.
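The scoring step just described reduces to a pooled agreement rate across
all counted data elements from the five sampled cases. A minimal sketch,
assuming the hospital and CDAC entries can be paired by case and data
element (the helper names are hypothetical):

    def accuracy_score(hospital_entries, cdac_entries):
        # Both arguments map (case_id, data_element) -> recorded value
        # for the subset of elements CMS counts toward the score.
        matches = sum(
            1 for key, value in hospital_entries.items()
            if cdac_entries.get(key) == value
        )
        return 100.0 * matches / len(hospital_entries)

    def provided_accurate_data(score):
        # CMS deems the data accurate at 80 percent agreement or better.
        return score >= 80.0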
In applying these processes for the fiscal year 2005 annual payment
update, CMS did not require hospitals to meet the 80 percent accuracy
threshold for the 10 APU measures to qualify for the full update. Rather,
to receive their full payment update, hospitals only had to pass the
electronic data checking performed when they submitted their data to the
clinical warehouse for the first calendar quarter of the APU program-for
discharges that occurred from January 2004 through March 2004. Although
the accuracy scores were not considered for the payment update, CMS
calculated an accuracy score for each quarter in which the hospital
submitted at least six cases to the clinical warehouse. Each quarter the
accuracy score was based on data for all the measures submitted by the
hospital in that quarter and was derived from five randomly selected
patient records. Along with the accuracy score, hospitals received
information on where mismatches occurred and the reasons for the
mismatches.
23The accuracy score is not based on all the data submitted by a hospital.
Rather, CMS has identified a specific subset of the data elements that
should be counted in computing the accuracy score. In general, CMS
included in this subset the clinical data elements needed to calculate the
hospital's rate for each of the measures and left out other administrative
and demographic information about the patients. CMS estimates that five
patient records usually contain about 100 data elements for calculation of
the accuracy score, but the actual number of data elements depends on
which conditions were involved and the number of measures for which a
hospital submitted data.
24Although CMS computes accuracy scores based on data for all measures
submitted to the clinical warehouse, it recognizes that the MMA provision
affecting hospital payments applies only to data for the 10 measures
specified for the APU program. See 69 Fed. Reg. 49080 (Aug. 11, 2004).
25CMS created an appeal process that allows a hospital to challenge the
reabstraction results through its local QIO. For data from the first two
calendar quarters of 2004, if the QIO agreed with the hospital's
interpretation, the appeal was forwarded to CDAC for review and
correction, if appropriate. CDAC's decision on the appeal was final.
Beginning with data from the third calendar quarter of 2004, appealed
cases no longer go back to CDAC. Instead, QIOs make the final decision to
uphold either CDAC's or the hospital's interpretation. During this
process, hospitals are not allowed to supplement the submitted patient
medical records.
In contrast to the prior year, CMS applied the 80 percent threshold for
accuracy as a requirement for hospitals to qualify for their full fiscal
year 2006 annual payment update.26 IFMC continued to check electronically
all of the data as they were submitted for each quarter and calculated
accuracy scores quarterly for each hospital. CMS decided to base its
payment update decision on the accuracy score that hospitals obtained for
the third calendar quarter of 2004-for discharges that occurred from July
2004 through September 2004.27 This meant that the payment decision rested
on the reabstraction results obtained from five randomly selected patient
records. If a hospital met the 80 percent accuracy threshold based on all
of the quality data it submitted, it received the full payment update.
However, if a hospital failed to meet the 80 percent threshold, CMS
recomputed the accuracy score using only the data elements required for
the APU-measure set. For hospitals that failed again, CMS combined the
CDAC reabstraction results from the third calendar quarter of 2004 with
the CDAC results from the fourth calendar quarter of 2004 to produce an
accuracy score derived from 10 patient medical records.28 CMS then
computed accuracy scores first for all the quality data submitted by the
hospital and finally for the APU-measure set, if needed to reach the 80
percent threshold. As a result, even though CMS assessed hospital accuracy
primarily on the basis of data that exceeded those required for the
APU-measure set, hospitals were not denied the full annual payment update
except on the basis of the APU-measure set. A possibility does exist,
however, that a hospital could have qualified for the full update based on
its results for all the data it submitted, even if it would have failed
using the APU-measure set. This could happen if the hospital submitted
data that matched the CDAC abstractors' entries more consistently for the
data entries used exclusively in computing the expanded measures, such as
those relating to smoking cessation counseling, than for the data required
by the APU-measure set.
2670 Fed. Reg. 47420-47428 (Aug. 12, 2005).
27CMS decided not to use accuracy scores from the first two quarters of
the APU program because those data were collected before the alignment of
CMS and JCAHO data collection specifications had begun to come into
effect. Given the time needed to conduct all the steps in the process (see
fig. 1), CMS was left with the third calendar quarter of 2004 as the
latest full quarter of data that could be used for determining the fiscal
year 2006 update. The third calendar quarter also marked HQA's expansion
to 22 measures.
28Hospitals had to submit their patient medical records to CDAC for the
fourth calendar quarter 2004 reabstractions no later than August 1, 2005,
to take advantage of this additional opportunity to pass the 80 percent
threshold.
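The fiscal year 2006 determination just described amounts to a cascading
sequence of accuracy computations. The following sketch captures the order
of the checks; the scoring functions stand in for the CDAC comparison, and
the names are illustrative:

    def qualifies_fy2006(q3_records, q4_records, score_all,
                         score_apu_only, threshold=80.0):
        # score_all / score_apu_only compute an accuracy score for a
        # set of reabstracted records, over all submitted data elements
        # or over only those required for the APU-measure set.
        if score_all(q3_records) >= threshold:
            return True
        if score_apu_only(q3_records) >= threshold:
            return True
        # Final opportunity: pool the third and fourth calendar
        # quarters of 2004 (10 records) and repeat both computations.
        pooled = q3_records + q4_records
        return (score_all(pooled) >= threshold
                or score_apu_only(pooled) >= threshold)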
In the future, CMS intends to base its decisions on hospital eligibility
for full annual payment updates on accuracy assessments from more than one
quarter. Although its concerns about potential alignment issues affecting
data for the first two quarters of the APU program led the agency to rely
primarily on data from the third calendar quarter for the fiscal year 2006
update, CMS stated that its goal was to use accuracy assessments from four
consecutive quarters when it determines hospital eligibility for the
fiscal year 2007 full annual payment update.
CMS uses the accuracy scores in making decisions on payment updates, but
the scores do not affect the information posted on the Hospital Compare
Web site. The Web site transmits to the public the rates on the APU and
expanded measures that derive from the data that the hospitals submitted
to the clinical warehouse. CMS does not post the accuracy scores generated
from the CDAC reabstraction process on the Web site or indicate if the
hospital rates are based on data that met CMS's 80 percent threshold for
accuracy.29
CMS Has No Ongoing Process to Ensure Completeness of Data Submitted for the APU
Program
Although CMS has recognized the importance of obtaining quality data for
the APU program on all eligible patients, or a representative sample if
appropriate, it has not put in place an ongoing process to ensure that
this occurs. For the fiscal year 2005 annual payment update, CMS checked
that hospitals submitted data for at least a minimum number of patients by
using Medicare claims data to estimate the number of "expected cases" that
each hospital should have submitted to the clinical warehouse. To do this,
it first calculated the average number of patients for each of the three
conditions that each hospital had billed Medicare for over the previous
eight calendar quarters (January 2002 through December 2003). Then, if the
average number of Medicare claims for a condition was large enough to
entitle the hospital to draw a sample instead of submitting data for all
the eligible patients to the clinical warehouse, CMS reduced the number of
"expected cases" based on the size of the sample.30 CMS told each hospital
what its expected numbers of heart attack, heart failure, and pneumonia
patients were. If the actual number of patients for whom hospitals
submitted data for the APU program was lower, the hospitals were
instructed to send a letter to their local QIO, signed by the hospital's
CEO or administrator, stating that the hospital had fewer discharged
patients for that condition than CMS had estimated. If such a letter was
filed, the hospital qualified for the full annual payment update. In the
end, no hospital participating in the APU program was denied a full annual
payment update for fiscal year 2005 for submitting data on an insufficient
number of patients or any other reason.
29The Hospital Compare Web site identifies instances where rates for a
measure were based on fewer than 25 cases and where data were suppressed
due to inaccuracies. However, the latter indication reflects situations
where a hospital had problems with transmission of its data by a data
vendor, not the outcome of the CDAC reabstractions.
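A rough sketch of the fiscal year 2005 expected-case estimate described
above, using the CMS sampling formula that the notes indicate was applied
across the board (a minimum sample of 7, at least 20 percent of cases,
capped at 70); the function name and rounding are illustrative:

    import math

    def expected_cases(quarterly_claim_counts):
        # quarterly_claim_counts: the hospital's Medicare claim counts
        # for one condition over the eight calendar quarters from
        # January 2002 through December 2003.
        average = sum(quarterly_claim_counts) / len(quarterly_claim_counts)
        if average <= 7:
            # Too few cases to permit sampling: expect the full caseload.
            return math.ceil(average)
        # Otherwise expect only the required sample: at least 7, at
        # least 20 percent of cases, and no more than 70.
        return math.ceil(min(max(7, 0.20 * average), 70))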
For the fiscal year 2006 update decision, CMS took a different approach to
using Medicare claims data to address the issue of completeness. CMS used
Medicare claims data to check whether hospitals that billed Medicare for
any cases with one of the three conditions submitted at least one case to
the clinical warehouse. To do this, CMS compared each hospital's Medicare
claims for the three conditions for the four calendar quarters of 2004 to
the hospital's submissions to the clinical warehouse for those same
quarters. CMS identified instances where hospitals had submitted one or
more claims for payment to Medicare for any of the three conditions for a
quarter when they had not submitted any cases with one of those conditions
to the clinical warehouse. On this basis, CMS determined that 110
hospitals would not qualify for the full payment update for fiscal year
2006.
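For a given quarter, the screen just described reduces to a simple
comparison of two per-hospital counts. A minimal sketch with illustrative
names:

    def flag_nonsubmitters(medicare_claim_counts, warehouse_case_counts):
        # Both arguments map hospital_id -> count for one quarter. A
        # hospital is flagged when it billed Medicare for any of the
        # three conditions but sent no cases to the clinical warehouse.
        return {
            hospital
            for hospital, claims in medicare_claim_counts.items()
            if claims > 0 and warehouse_case_counts.get(hospital, 0) == 0
        }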
CMS conducted two additional analyses involving a comparison of the same
Medicare claims data and quality data submissions to identify hospitals
that may have submitted incomplete data for the APU program, but these
analyses did not affect hospital eligibility for the full fiscal year 2006
payment update. The additional analyses identified (1) a set of hospitals
that may have submitted samples of their eligible cases to the clinical
warehouse when, according to the applicable sampling rules, they should
have submitted data on all their cases; and (2) another set of hospitals
that failed to submit cases to the clinical warehouse for all of the three
conditions for which they filed Medicare claims in that quarter. However,
in contrast to the hospitals that did not qualify for their full payment
update, the hospitals in the second set submitted to the clinical
warehouse at least one case for one of the three conditions. A CMS
official stated that the agency plans to educate the hospitals identified
by these additional analyses on the data submission and sampling
requirements for the APU program.
30Originally, CMS intended to apply JCAHO's sampling rules to
JCAHO-accredited hospitals, and its own sampling rules to the other
hospitals, in computing their "expected cases". JCAHO's sampling
procedures called for submitting larger samples to the clinical warehouse
than CMS's did. However, when CMS officials determined that they could not
reliably identify every hospital that belonged in the JCAHO group, they
decided to apply the CMS rules across the board to all hospitals.
Therefore, for many JCAHO-accredited hospitals, the number of "expected
cases" computed by CMS underestimated the number of Medicare cases for
which these hospitals should have submitted data, because JCAHO-accredited
hospitals were to submit cases according to the JCAHO sampling rules.
The analysis that CMS conducted using Medicare claims data for its fiscal
year 2005 update decision and the three analyses it conducted in
conjunction with its fiscal year 2006 update decision shared two
limitations: none addressed the completeness of data submissions for
non-Medicare patients, and none could detect incomplete data for all
hospitals. Given that non-Medicare patients represent a substantial
proportion of the patients treated for heart attacks, heart failure, and
pneumonia,31 any minimum number of "expected cases" based on Medicare
claims inherently underestimates the total number of patients for which
hospitals should have submitted quality data for the APU program.
Moreover, the approaches taken in the analyses conducted for both fiscal
year updates could not detect incomplete data for many hospitals. For
example, in the fiscal year 2005 analysis, CMS's expected-case counts
rested on its own sampling rules, which called for smaller samples than
the rules that applied to JCAHO-accredited hospitals. As a result, a
JCAHO-accredited hospital treating more patients than the minimum CMS
sample of seven could have failed to submit data on most of the cases
exceeding that minimum and still have met the
number of expected cases set by CMS.32 The analysis that CMS conducted to
determine hospital eligibility for the full fiscal year 2006 update also
could identify only certain hospitals that submitted incomplete data, in
this case limited to hospitals that submitted no patient data at all to
the clinical warehouse in a given quarter.
31Non-Medicare patients account for about 40 to 50 percent of all patients
hospitalized for heart attacks and pneumonia and 20 to 32 percent of those
hospitalized for heart failure. For individual hospitals, these
percentages could be higher or lower.
32See appendix I for more detailed information on the limitations that
applied to CMS's effort to estimate a minimum number of expected cases for
each hospital.
CMS officials acknowledged that the lack of information on non-Medicare
patients and the imprecise adjustments that CMS made to take account of
the varying sampling procedures that hospitals could have followed limited
the conclusions that CMS could draw from its Medicare claims data analysis
for the fiscal year 2005 update. Because of these limitations, CMS
officials described their effort as a rough check for inconsistencies
between data submitted by hospitals to the clinical warehouse and the
cases that the hospitals had billed to Medicare.
CMS has not combined these limited efforts to monitor the completeness of
hospital quality data submissions with efforts to clearly inform hospital
officials of their obligation to submit complete data. For example, CMS
has not explicitly listed submission of complete data as a requirement for
participating in the APU program on the "Notice of Participation" that the
hospital CEO or administrator must sign when hospitals enroll. The notice
states requirements for participating hospitals-including that they must
register with the QualityNet Exchange Web site33 and that they must submit
data for all measures specified in the APU-measure set by established
deadlines. The notice indicates that the submitted data will undergo
validation, a reference to the CDAC reabstraction process. However, the
notice does not stipulate that hospitals must submit data for all eligible
cases, or for a representative sample if appropriate.
We interviewed health professionals familiar with the APU program, several
of whom raised concerns about data completeness. One expert in the area of
outcomes research noted the potential for systematic underreporting by
hospitals. He suggested that, as one approach to detect systematic
underreporting, CMS could compare not only the number of patients for whom
data were submitted and Medicare claims filed, but also the
characteristics of patients for cases submitted to the APU program to the
patient characteristics of comparable cases submitted to Medicare for
payment. Another expert in the area of clinical quality improvement
expressed his concern that the APU program did not verify the completeness
of the data. He observed that hospitals have flexibility in determining
which patients are included through their assignment of the patient's
primary diagnosis. A QIO official echoed this concern, noting the risk
that hospitals could decide not to submit cases in which patients had not
received the services or activities assessed by the APU measures.
33The QualityNet Exchange Web site is the secure Internet connection used
to transmit hospital quality data to the clinical warehouse.
Data Accuracy Baseline Was High Overall, but Statistically Uncertain for Many
Hospitals, and Data Completeness Baseline Cannot Be Determined
We could determine a baseline level of accuracy for the quality data
submitted for the APU program but not a baseline level of completeness. We
found a high overall baseline level of accuracy when we examined CMS's
assessment of the data submitted by hospitals for the first two calendar
quarters of 2004. The median accuracy score exceeded 90 percent, which was
well above the 80 percent accuracy threshold set by CMS, and about 90
percent of hospitals met or exceeded that threshold for both the first and
the second calendar quarters of 2004. For most hospitals whose accuracy
scores were well above the threshold, the results were statistically
certain. However, for approximately one-fourth to one-third of all the
hospitals that CMS assessed for accuracy, the statistical margin of error
for their accuracy score included both passing and failing accuracy
levels. Consequently, for these hospitals, the small number of cases that
CMS examined was not sufficient to establish with statistical certainty
whether the hospital met the threshold level of data accuracy. Accuracy
did not vary between rural and urban hospitals, and small hospitals
provided data as accurate as those from larger hospitals. The completeness
baseline could not be determined because CMS did not assess the extent to
which all hospitals submitted data on all eligible patients, or a
representative sample thereof, for the first two calendar quarters of
2004, and consequently there were no data from which to derive such an
assessment.
Baseline Level of Data Accuracy Was High Overall, and Large Majority of
Hospitals Met Accuracy Threshold
Overall, the baseline level of data accuracy for the first two quarters of
the APU program was high. The median accuracy score achieved by hospitals
ranged between 90 and 94 percent, with slightly higher values in the
second quarter and for the APU-measure set. (See fig. 2.) In addition,
with at least half the hospitals receiving accuracy scores above 90,
relatively few failed to reach the 80 percent threshold set by CMS.
Figure 2: Baseline Hospital Accuracy Scores at Selected Percentiles, by
Measure Set and Quarter
Note: Figure reflects accuracy scores for hospitals covered by the APU
program. Hospitals that submitted fewer than six cases to the clinical
warehouse in a quarter did not undergo CDAC reabstraction and therefore
did not receive an accuracy score for that quarter. Calculation of
accuracy scores for the expanded-measure set was based on all the measures
for which a hospital submitted data, which could range from the APU
measures alone to a maximum of 17-the APU measures plus as many as 7
additional measures.
In both quarters, 90 to 92 percent of hospitals obtained accuracy scores
meeting the threshold using the APU-measure set, and 87 to 90 percent met
the threshold using the expanded-measure set (see table 2).34 The 8 to 13
percent of hospitals that did not meet the accuracy threshold represented
approximately 300 to 500 hospitals across the country.
Table 2: Percentage and Number of Hospitals Whose Baseline Accuracy Score
Met or Fell Below the 80 Percent Threshold, by Measure Set and Quarter
                               APU-measure set                     Expanded-measure set
                        Jan.-Mar. 2004   Apr.-June 2004      Jan.-Mar. 2004   Apr.-June 2004
                          discharges       discharges          discharges       discharges
                        Pct.    Number   Pct.    Number      Pct.    Number   Pct.    Number
Hospitals whose
accuracy score met
80 percent threshold    90.2     3,290   91.8     3,282      86.8     3,165   90.0     3,217
Hospitals whose
accuracy score fell
below 80 percent
threshold                9.8       359    8.2       292      13.2       483   10.0       357
Total                  100       3,649  100       3,574     100       3,648  100       3,574
Source: GAO analysis of CMS data.
Note: Calculation of accuracy scores for the expanded-measure set was
based on all the measures for which a hospital submitted data, which could
range from the APU measures alone to a maximum of 17-the APU measures plus
as many as 7 additional measures.
There were minimal differences in baseline accuracy scores among hospitals
characterized by urban or rural location and small or large capacity,35
but variation across hospitals served by different data vendors was more
substantial. Rural hospitals and smaller hospitals generally received
accuracy scores similar to those of urban hospitals and larger
hospitals.36 Among the hospitals that used JCAHO-certified data vendors to
submit their quality data to the clinical warehouse, a higher percentage
of hospitals served by certain data vendors met the 80 percent threshold
than did the hospitals served by other data vendors (see app. III, table
8).37
34For our analysis of baseline accuracy, the expanded-measure set includes
the seven additional quality measures beyond the APU-measure set that HQA
adopted for discharges after March 31, 2004. We found that some hospitals
submitted data on the additional measures to the clinical warehouse for
discharges occurring before that date, possibly because the hospitals were
already collecting those data for JCAHO.
35We assessed hospital capacity in terms of the number of patient beds.
36For more detailed information on the relation of data accuracy to
hospital characteristics and use of data vendors, see the tables in
appendix III.
Passing the 80 Percent Threshold Is Statistically Uncertain for One-Fourth to
One-Third of Hospitals
While the baseline level of data accuracy achieved by hospitals in the
aggregate was well above the 80 percent threshold, for approximately
one-fourth to one-third of hospitals the determination that a particular
hospital met the 80 percent threshold was statistically uncertain. This
uncertainty stems primarily from the small number of cases examined for
accuracy from each hospital. Because CDAC's reabstraction of the data is
limited to five patient records per quarter, the greater sampling
variability found in small samples leads to relatively large confidence
intervals, reflecting low statistical precision, for the accuracy score of
any specific hospital.38 Across all hospitals, the median difference
between the upper and lower limits of the confidence interval was 14.0
percentage points using the APU-measure set for first-quarter discharges,
dropping to 11.8 percentage points in the second quarter.39 For the
expanded-measure set, the median width was 14.6 percentage points in the
first quarter and 13.0 percentage points in the second.
37The data that we obtained from CMS specifically identified data vendors
that JCAHO had certified for its own performance reporting system. These
data vendors submitted data to the clinical warehouse for 78 to 79 percent
of the hospitals we analyzed for the two baseline quarters, while another
13 to 14 percent of hospitals directly submitted their own data.
38Statistical uncertainty occurs because different samples generally
produce different results, due to variation among the individual patients
selected for different samples. With larger samples, differences in the
results obtained from one sample to another decrease. Calculating a
confidence interval provides a way to assess the effect of sample
variation on the results obtained. Confidence intervals are usually
computed at the 95 percent level. So if 100 samples were selected, the
result produced by 95 of them would likely fall between the low and high
ends of the confidence interval. For example, one 300-plus-bed hospital in
Virginia had an accuracy score of 83.3 for the second calendar quarter of
2004 using the expanded-measure set, with a confidence interval that
ranged from 76.8 to 89.9. There is a 95 percent likelihood that any
sample selected for that hospital would generate an accuracy score
between 76.8 and 89.9.
39The formula used to generate these confidence intervals takes into
account variation in the number of individual data elements that were
available in the five selected cases to compare the hospital's and CDAC's
results. This is the same formula that is used by CMS, with one
modification. Whereas CMS applied a one-tailed test at a 95 percent
confidence level to protect against hospitals receiving a failing score
due to sampling error, we applied a two-tailed test at the 95 percent
confidence level to identify both failing and passing scores that were
statistically uncertain. (See app. I.)
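To illustrate the arithmetic behind these intervals, the sketch below
treats an accuracy score as a simple proportion of matched data elements
and applies a normal-approximation (Wald) confidence interval. This is a
simplification for illustration only; as noted above, the formula CMS
used also adjusts for the number of data elements available in the five
sampled cases:
    import math

    def accuracy_ci(matched, total_elements, z=1.96):
        """Two-tailed 95 percent Wald confidence interval for an accuracy
        score treated as a simple proportion of data elements on which the
        hospital and CDAC agreed (a simplifying assumption)."""
        p = matched / total_elements
        half_width = z * math.sqrt(p * (1 - p) / total_elements)
        return 100 * (p - half_width), 100 * (p + half_width)

    def classify(matched, total_elements, threshold=80.0):
        """A result is statistically uncertain when its confidence
        interval brackets the threshold -- the two-tailed reading."""
        low, high = accuracy_ci(matched, total_elements)
        if low >= threshold:
            return "certain pass"
        if high < threshold:
            return "certain fail"
        return "statistically uncertain"

    # Hypothetical hospital: 85 of 100 scored elements matched CDAC's.
    print(accuracy_ci(85, 100))  # approximately (78.0, 92.0)
    print(classify(85, 100))     # statistically uncertain: brackets 80
    print(classify(95, 100))     # certain pass: whole interval above 80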
The wide confidence intervals meant that for a substantial number of
hospitals it was statistically uncertain whether a different sample of
cases would have altered their result from passing the 80 percent
threshold to failing, or vice versa.40 For most hospitals there was
statistical certainty that their baseline accuracy score met CMS's 80
percent accuracy threshold. However, other hospitals had confidence
intervals for their accuracy scores where the upper limit was 80 or above
and the lower limit was less than 80. Because the confidence interval
around the accuracy score computed for each of these hospitals bracketed
the accuracy threshold set by CMS, their results were statistically
uncertain.41 Consequently, for these hospitals, the small number of cases
that CMS examined was not sufficient to establish whether the hospital met
the threshold level for data accuracy. One-third of all the hospitals that
CMS assessed for accuracy fell into this uncertain category for
first-quarter 2004 discharges using the APU-measure set. (See fig. 3.)
This proportion declined to about one-fourth of the hospitals for the
second quarter. When the expanded-measure set was used-as CMS has done
when calculating its quarterly accuracy scores-the proportion of hospitals
whose accuracy scores were statistically uncertain increased compared to
the APU-measure set for both the first and the second quarter.
40Most, but not all, of the hospitals with statistically uncertain results
had accuracy scores of 80 or above. See table 10 in appendix III.
41For example, if a hospital had a confidence interval that ranged from 77
to 90, taking multiple samples would lead to some samples generating
accuracy scores at or above 80 and other samples generating scores of less
than 80. Whether that hospital passed the 80 percent accuracy threshold
would depend on which of those samples was actually selected.
Figure 3: Percentage of Hospitals Whose Baseline Accuracy Score Confidence
Intervals Clearly Exceed, Fall Below, or Include the 80 Percent Threshold,
by Measure Set and Quarter
Note: The confidence interval is based on a 95 percent confidence level.
Calculation of the accuracy scores and confidence intervals for the
expanded-measure set was based on all the measures for which a hospital
submitted data, which could range from the APU measures alone to a maximum
of 17-the APU measures plus as many as 7 additional measures.
These confidence intervals would narrow if CMS drew on multiple quarters
of data to bring more cases into the computation of the accuracy scores.
CMS has stated its intention to base this accuracy assessment on four
quarters of hospital quality data, but so far every accuracy score it has
generated and reported to hospitals has been based on a single quarter of
data. Moreover, its implementation of the fiscal year 2006 payment update
called for using only one quarter of data, with the possibility of adding
one more quarter of data for hospitals that failed to meet the accuracy
threshold based on the single quarter of data.42
42See 70 Fed. Reg. at 47422.
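The narrowing effect of pooling quarters follows from the fact that the
width of such an interval shrinks in proportion to the square root of the
number of comparisons. A brief sketch, again assuming a simple Wald
interval and a hypothetical 100 scored data elements per quarter of five
cases:
    import math

    def ci_width(p, n_elements, z=1.96):
        """Width, in percentage points, of a 95 percent Wald confidence
        interval for an accuracy score based on n_elements
        hospital-versus-CDAC comparisons (a simplifying assumption)."""
        return 200 * z * math.sqrt(p * (1 - p) / n_elements)

    # Assume a true accuracy rate of 90 percent.
    for quarters in (1, 2, 4):
        width = ci_width(0.90, 100 * quarters)
        print(f"{quarters} quarter(s): interval width {width:.1f} points")
    # 1 quarter: 11.8 points; 2 quarters: 8.3; 4 quarters: 5.9 --
    # quadrupling the number of cases halves the width of the interval.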
No Data Were Available to Provide Baseline Assessment of Completeness of
Hospital Quality Data
There were no data available from which to estimate a baseline level of
completeness for the first two calendar quarters of data submitted for the
APU program. In contrast to the system of quarterly reabstractions
performed by CDAC to check the accuracy of quality data submitted by
hospitals, CMS did not conduct any corresponding assessment of the extent
to which all hospitals submitted data on all the cases, or a
representative sample of such cases, that met CMS's eligibility criteria
for the first two calendar quarters of 2004.
The information that CMS did collect was not suitable for estimating the
baseline level of data completeness. The Medicare claims data analysis
conducted by CMS on the first calendar quarter of data submitted for the
APU program was not designed to provide valid information on the magnitude
of data incompleteness for each hospital, which is what is needed to
estimate a baseline level of data completeness. Although CMS could
identify instances where certain hospitals failed to provide quality data
on all eligible cases, CMS's analysis did not produce comparable
information on data completeness for every hospital. As noted above, it
lacked information on non-Medicare patients and could not adjust properly
for the sample sizes that JCAHO-accredited hospitals would have drawn if
they followed JCAHO's sampling rules rather than CMS's. The limitations in
the CMS analysis would affect some hospitals more than others, depending
on how many non-Medicare patients a hospital treated and whether it
applied the JCAHO sampling rules. Consequently, had we used information
from this analysis to estimate baseline data completeness, our results
would have been distorted by the uneven impact of those factors on the
information produced for different hospitals.43
In addition, we found no data for assessing the baseline completeness of
the quality data provided by hospitals that submitted samples of their
eligible cases to the clinical warehouse. Even if such a hospital
submitted the expected number of cases, its quality data could be
incomplete if it did not follow appropriate procedures for random
selection, because the resulting sample might not be representative of
all its eligible patients. Because the available information from CMS
focused on the number of cases submitted, and not on how they were
selected, we could not address this aspect of data completeness.
43See appendix I for a more detailed description of this assessment.
Other Reporting Systems Use Various Methods to Ensure Data Accuracy and
Completeness, Notably an Independent Audit
Other reporting systems that collect clinical performance data have
adopted various methods to ensure data accuracy and completeness, and
officials from these systems stressed the importance of including an
independent audit in these activities. Most of the other reporting
systems that conduct independent audits incorporate into their audit
process three methods that CMS does not use. Specifically, these systems
include an on-site visit, focus their audit on a selected number of
facilities or reporting entities, and review a minimum of 50 patient
medical records per reporting entity.
Other Reporting Systems Use Various Methods to Check Data
Other reporting systems that collect clinical performance data have
adopted various methods to ensure data accuracy and completeness. To check
data accuracy, all the other reporting systems we examined assess the data
when they are submitted, typically using computers to detect missing or
out-of-range data. (See app. II, tables 4 and 5.) In addition, all the
other systems have developed standardized data collection processes and
measures. When checking data completeness, all the other systems compare
submitted data with data from another source, whether inside the facility,
such as pharmacy or laboratory records, or outside the facility, such as
state hospital discharge data or Medicare claims data. Officials reported
that these analyses were done annually or once, and one official said
that additional studies were planned.44 Officials from these systems
also cite various other methods to consider when ensuring data accuracy
and completeness, including reviewing established measures annually,
identifying a point person at each facility to provide consistency,
establishing channels for ongoing communication, and providing training on
a continuous basis.45
44For example, on-site auditors from one reporting system compare the data
submitted against catheterization laboratory schedules and hospital
billing records for the previous 12 months. Another reporting system hired
a contractor to perform a one-time study comparing patient assessment data
submitted by a facility against its total Medicare claims to identify
instances where patient assessments were missing.
Other Reporting Systems Conduct Independent Audits
Most of the other reporting systems whose officials we interviewed
conduct independent audits that include a comparison of submitted data to
medical records. Most of these systems incorporate into their audit
process three methods that CMS does not use in its independent audit.
Specifically, they (1) include an on-site visit as part
of their independent audit, (2) focus their audits on a selected number of
facilities or reporting entities, and (3) review a minimum of 50 patient
medical records per reporting entity during the auditing process. During
an on-site visit, auditors are able to review patient medical records for
accuracy and interview staff when additional information is needed.
Auditors are also able to check the data submitted to their system against
other data sources at the facilities, including physician notes, patient
or resident rosters, billing records, laboratory records, and pharmacy
records. In addition, because auditors from other reporting systems may
not visit every facility,46 the systems use various methods to focus the
auditing process when selecting which facilities to visit. These include
auditing a percentage of all eligible facilities, auditing facilities that
did particularly well or poorly, and auditing a subset of facilities each
year. Furthermore, most of the other reporting systems that conduct
independent audits review a minimum of 50 patient medical records per
audited entity as part of their independent auditing process. When
selecting which patient medical records to review, some systems take a
random sample of the patient population, one system reviews all deaths at
the selected facility, and another reviews all instances where the patient
died from shock as a result of percutaneous coronary intervention.
45We have also published a document that describes a flexible framework
for assessing the reliability, including both the accuracy and
completeness, of computer-processed data. This document offers procedures
that can be adapted to varying circumstances. These procedures include
conducting electronic data testing, such as logic tests; ensuring internal
control systems are in place that check the data when they are entered
into the system and limit access to the system; checking for missing data
elements as well as missing case records; and reviewing related
documentation, which may include tracing a sample of records large enough
to estimate an error rate back to their source documents. See GAO,
Assessing the Reliability of Computer-Processed Data, GAO-03-273G
(Washington, D.C.: October 2002) External Version 1.
46An official from one reporting system said that budgetary constraints
limit the number of on-site audits that the system can perform. As a
result, auditors from that system focus their review on hospitals with
outcomes that fall above and below the systemwide average.
Officials at other reporting systems we interviewed and an expert in the
field stressed the importance of the independent audit. For example, an
official from one of the other reporting systems said that audits
conducted by an independent third party are "the best way" to ensure data
accuracy and completeness. An official from another reporting system said
that having someone independently check the data is "one of the most
important things" that an organization can do to check data accuracy and
completeness. Additionally, an expert we interviewed said that
independent, external audits are "essential." Though most of the other
reporting systems employ an independent auditing process, officials from
one system that has yet to implement such a process said their
organization recognizes its importance and is currently designing and
implementing one.
Conclusions
Data collected for the APU program affect the payment received by
hospitals from Medicare and are used to inform the public about hospital
quality. For both these purposes, it is important that CMS be able to
ensure that the data are reliable in terms of both accuracy and
completeness.
CMS has put in place an ongoing process for assessing the accuracy of
quality data submitted by hospitals, but the process has limitations.
Although CMS checks the accuracy of data electronically as they are
submitted and through an independent audit conducted by CDAC, the latter
process is limited by the selection of only five cases per quarter per
hospital, regardless of the hospital's size. Most hospitals had high
baseline accuracy scores that were statistically certain. However, for
about one-fourth to one-third of all the hospitals that CMS assessed for
the first two calendar quarters of 2004, CMS's determination as to whether
the hospital met its accuracy standard was statistically uncertain. This
was due primarily to the small number of cases selected for an audit.
Although CMS has stated its intention to look at more cases by pooling
reabstraction results from more than one calendar quarter, all of the
hospital accuracy reports that it has generated to date have been based on
a single quarter of data. Officials from other reporting systems that
collect clinical performance data told us that they also use an
independent audit to check data accuracy, but generally sample a larger
number of patient medical records, either by sampling a percentage of
total cases submitted or by identifying a minimum number of cases in the
sample. In addition, most other reporting systems focused their audits on
a selected number of facilities.
In contrast to CMS's establishment of an ongoing process for assessing
data accuracy, the agency has not put in place an ongoing process to check
the completeness of the data that hospitals submit. Because of the
purposes for which these data may be used, there could be an incentive for
hospitals to selectively report data on cases that score well on the
quality measures. With no ongoing way to check completeness, CMS does not
know whether or how often hospitals submit incomplete data. We believe
this is a significant gap in oversight. The process used for the fiscal
year 2005 annual payment update compared hospital submissions to Medicare
claims data, but as CMS has noted, this did not provide a comparable
assessment of each hospital's data, even for Medicare patients alone.
Moreover, in its comparison of hospital quality data submissions with
Medicare claims for the fiscal year 2006 update, CMS identified more than
100 hospitals that had treated eligible patients in a given quarter but
had not submitted data on a single case for that quarter to the clinical
warehouse. Yet CMS has not asked hospitals to certify that the data they
have submitted constitute all, or a representative sample, of the eligible
patient population. The various methods used by other reporting systems to
check the completeness of data illustrate the variety of approaches that
are available. These include conducting on-site visits as part of their
independent audit, comparing data submissions to data from another source
maintained by the facility or external to it, and performing such checks
annually or at other specified intervals.
Given CMS's plans to continue public reporting efforts after the APU
program ends, we believe that processes for checking the reliability of
data should continue to be refined in order for the individuals and
organizations that use the data to have confidence in the information.
Recommendations for Executive Action
In order for CMS to help ensure the reliability of the quality data it
uses to produce information on hospital performance, we recommend that the
CMS Administrator undertake the following three actions:
o focusing on the subset of hospitals for which it is
statistically uncertain whether they met CMS's accuracy threshold in
one or more previous quarters, increase the number of patient
records reabstracted by CDAC in a subsequent quarter so that the
proportion of hospitals with statistically uncertain results is
reduced;
o require hospitals to certify that they took steps to ensure
that they submitted data on all eligible patients, or a
representative sample thereof; and
o assess the level of incomplete data submitted by hospitals for
the APU program to determine the magnitude of underreporting, if
any, in order to refine how completeness assessments may be done
in future reporting efforts.
Agency Comments
In commenting on a draft of this report, CMS stated it appreciated
our analysis and recommendations. (CMS's comments appear in app.
IV.) The agency noted that the APU program led to a dramatic
increase in the number of hospitals that submitted data on the
designated 10 quality measures, resulting in public reporting of
quality data for about 3,600 hospitals on the agency's Web site.
In addition, CMS described the steps it had taken to ensure the
accuracy and completeness of the quality data submitted by
hospitals for the APU program. It said that the methods it had
used were sound, but it agreed that the quality and completeness
of the data must be improved.
With respect to reducing the statistical uncertainty of its
assessments of the accuracy of hospital quality data submissions,
CMS agreed that the quarterly accuracy assessments based on five
patient charts can have considerable sampling error and stated
that it would improve the stability of its accuracy assessments by
using data from four calendar quarters when it assessed hospital
eligibility for the fiscal year 2007 annual payment update. CMS
expressed concern about having sufficient time within the current
data submission schedule to increase the number of patient records
reabstracted. However, we recommended in the draft report that
hospitals with statistically uncertain results in one or more
previous quarters have an increased number of records
reabstracted. The assessment of statistical uncertainty for a
hospital and the reabstraction of additional records do not need
to occur within the same quarter. We have modified slightly the
wording of the recommendation to clarify the intended timing of
these additional reabstractions.
With respect to ensuring the completeness of quality data
submitted by hospitals, CMS agreed that it needs to improve its
methods. CMS noted that its comparison of hospital quality data
submissions to the claims those hospitals filed for payment for
treating Medicare beneficiaries uncovered numerous discrepancies.
The agency agreed with our recommendation to require hospitals to
formally attest to the completeness of the quality data that they
submit quarterly. In addition, CMS stated that it would also
require each hospital to report the total number of Medicare and
non-Medicare patients who were eligible for quality assessment
under the APU program.
In terms of assessing the level of incomplete data for the APU
program, CMS said it had a process in place to accomplish this,
but as we stated in the draft report, CMS's process did not cover
all patients and all hospitals because it lacked information on
non-Medicare patients even though hospitals were required to
submit data on both Medicare and non-Medicare patients.
Additionally, the tests that CMS applied could detect incomplete
data for only a limited subset of hospitals, in contrast to its
assessment of data accuracy, which covered all hospitals that
submitted data on six or more cases in a quarter. CMS acknowledged
it could assess completeness only for Medicare patients, but said
that by requiring hospitals to report an aggregate count of all
eligible patients, it would henceforth have the data needed to
assess the completeness of both Medicare and non-Medicare quality
data submissions. The agency stated it will use these data to
provide quarterly feedback to hospitals about the accuracy and
completeness of their data submissions, and require them to
explain discrepancies between the data they have submitted for the
APU program and the aggregate count of eligible patients they have
reported. CMS has not said that it will determine the magnitude of
underreporting for the program as a whole, as we recommended.
Additionally, by relying on the hospitals themselves to supply
data on the number of non-Medicare patients, CMS's proposed
approach lacks an independent verification of the completeness of
submitted data. This contrasts with the practice of most of the
other reporting systems we contacted, as well as experts in the
field, who generally underscored the importance of independently
checking both the accuracy and the completeness of the quality
data.
As arranged with your offices, unless you publicly announce its
contents earlier, we plan no further distribution of this report
until 30 days after its issue date. At that time, we will send
copies of this report to the Administrator of CMS and other
interested parties. We will also make copies available to others
on request. In addition, the report will be available at no charge
on GAO's Web site at http://www.gao.gov .
If you or your staffs have any questions about this report, please
contact me at (202) 512-7101 or [email protected]. Contact points
for our Offices of Congressional Relations and Public Affairs may
be found on the last page of this report. GAO staff who made major
contributions to this report are listed in appendix V.
Cynthia A. Bascetta
Director, Health Care
Appendix I: Scope and Methodology
To determine the processes used by the Centers for Medicare &
Medicaid Services (CMS) to ensure the accuracy and completeness of
data submitted by hospitals for the Annual Payment Update program
(APU program), we interviewed both CMS officials and staff at
DynKePRO-which operates the Clinical Data Abstraction Center
(CDAC)-and the Iowa Foundation for Medical Care (IFMC), two
contractors that perform data collection and data quality
monitoring tasks for the APU program. In addition, we reviewed
documentation on the program available publicly on the QualityNet
Exchange Web site1 and the Web sites of several quality
improvement organizations (QIO)-contractors to CMS that provide
technical assistance to hospitals on the APU program-as well as
documents on the APU program provided to us at our request by CMS.
We also obtained access to CMS's intranet system and searched for
relevant memorandums and other documents regarding CMS's policies
and requirements for hospitals that participated in the APU
program. To gain insights from other groups involved in the APU
program, we interviewed officials from two or more QIOs, state
hospital associations, and hospital data vendors that submitted
data to the IFMC-operated database for their hospital clients.
Our assessment of the baseline accuracy of the initial APU program
data depended on the availability of suitable information from
CMS. We examined CMS's reabstraction process to determine if the
CDAC assessments of data accuracy would be appropriate for that
purpose. Reabstraction is the re-collection of clinical data for
the purpose of assessing the accuracy of data abstractions
performed by hospitals. In the APU program, CDAC compares data
reported by the hospitals to those it has independently obtained
from the same medical records. CDAC has instituted a range of
procedures, including training of its abstractors and continuous
monitoring of interrater reliability, intended to ensure that its
abstractors understand and follow its detailed guidance for
arriving at abstraction determinations that are correct in terms
of CMS's data specifications. We interviewed CDAC staff and
observed the implementation of these procedures during a site
visit at the CDAC facility. On the basis of this information we
concluded that it would be appropriate for us to use the results
of the CDAC reabstractions to estimate baseline data accuracy for
the APU program.
We obtained the results of the reabstractions that CDAC had
conducted on samples of the patients for whom hospitals had
submitted data from the first two quarters of 2004. These two
quarters were the first two data submissions made by hospitals
under the APU program and the most recent available when we
conducted these analyses. They constituted 20,465 patient records
for the first quarter and 20,259 for the second. These files
showed, for each data element that CMS used in assessing
abstraction accuracy, the correct entry as determined by the CDAC
abstractors and whether this matched the value that the hospital
had reported. We applied CMS's algorithms for computing hospital
scores on the expanded-measure set in order to determine the
extent of missing or invalid data. We found that approximately 2
to 3 percent of patient records could not be scored on any given
APU measure due to missing data. We excluded from the analysis
records from critical access hospitals and acute care hospitals in
Maryland and Puerto Rico (which are paid under different payment
systems than other acute care hospitals and therefore are not
subject to a reduced annual payment update under the APU program2)
and a small number of records not related to the three medical
conditions covered by the APU program.3
Next we applied the scoring rules developed by CMS to assess the
accuracy of hospital abstractions. We calculated the accuracy
score for each hospital in each quarter, using the data elements
needed for the APU-measure set and, separately, for the
expanded-measure set. Accuracy scores for the expanded-measure set
are based on all the measures for which a hospital submitted data,
which could range from the APU measures alone to a maximum of
17-the 10 measures in the APU-measure set plus the 7 additional
measures adopted by the Hospital Quality Alliance for hospital
discharges through the second calendar quarter of 2004. These
scores represented the proportion of data elements where CDAC and
the hospital agreed, summing across all the assessed data elements
for the five sampled cases. We then calculated the distribution of
those scores, and the proportion of hospitals that met or exceeded
the 80 percent accuracy threshold that CMS had set. Next we calculated the
confidence interval for each of those accuracy scores, using the
formula that CMS had selected for that purpose. However, whereas
CMS applied a one-tailed test-passing any hospital that had a
confidence interval whose upper bound reached 80 or above-we
applied a two-tailed test to assess the statistical uncertainty
attached to both passing and failing the threshold. The one-tailed
test that CMS applied prevented hospitals from losing their full
annual payment update on the basis of their accuracy score if
there was less than a 95 percent probability that a score below 80
would have remained below 80 in another sample. This meant that
hospitals with large confidence intervals could have accuracy
scores well below 80 and still pass the CMS accuracy requirement.
Our analysis focused instead on assessing the level of statistical
certainty for all the accuracy scores, both above and below the 80
percent threshold. We sought to identify passing as well as
failing scores that could have changed with another sample. To do
so, we applied a two-tailed test and observed whether a hospital's
confidence interval bracketed the 80 percent threshold.
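The scoring and testing logic just described can be summarized in a short
sketch. The per-case agreement counts below are hypothetical, and the
Wald interval stands in for the more detailed formula described earlier:
    import math

    def accuracy_score(case_matches):
        """Accuracy score: percentage of assessed data elements on which
        the hospital's abstraction matched CDAC's, summed across the
        sampled cases. case_matches holds (matched, assessed) pairs."""
        matched = sum(m for m, _ in case_matches)
        assessed = sum(a for _, a in case_matches)
        return matched, assessed, 100.0 * matched / assessed

    # Five hypothetical cases: (elements matching CDAC, elements assessed).
    cases = [(18, 22), (15, 19), (20, 21), (14, 20), (17, 18)]
    matched, assessed, score = accuracy_score(cases)

    # Wald interval as a stand-in for the report's formula (an assumption).
    p = matched / assessed
    half_width = 100 * 1.96 * math.sqrt(p * (1 - p) / assessed)
    low, high = score - half_width, score + half_width
    print(f"score {score:.1f}, 95% CI ({low:.1f}, {high:.1f})")
    # score 84.0, 95% CI (76.8, 91.2)

    # CMS's one-tailed rule: a hospital fails only if its whole interval
    # is below 80, i.e., it passes whenever the upper bound reaches 80.
    print("passes CMS one-tailed rule:", high >= 80.0)      # True
    # GAO's two-tailed reading: the result is statistically uncertain
    # whenever the interval brackets 80, passing or failing.
    print("statistically uncertain:", low < 80.0 <= high)   # True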
To provide descriptive information about variation in the accuracy
scores obtained by hospitals in different situations, we collected
additional information about the hospitals from other sources.
From the Medicare Provider of Services file we obtained the Social
Security Administration metropolitan statistical area code
(referred to as the SSA MSA code) and Social Security
Administration metropolitan statistical area size code (referred
to as the SSA MSA size code) to distinguish between urban and
rural hospitals. We also obtained from that source the total
number of Medicare-certified beds in order to categorize hospitals
by size. To compare the accuracy scores of hospitals that employed
different data vendors, we obtained from IFMC the identification
codes (but not the names) of the various data vendors certified by
the Joint Commission on Accreditation of Healthcare Organizations
(JCAHO) that had submitted to the clinical warehouse data for the
APU program on behalf of hospitals they served. Those codes were
also available in the case tracking information for the patient
records in the CDAC database. We then identified for each CDAC
reabstraction whether the case had originally been submitted by a
JCAHO-certified data vendor, and if so, which one. These data were
aggregated to generate accuracy scores for each hospital that
consistently submitted its quality data through one data vendor in
a given quarter. This allowed us to determine the proportion of
hospitals served by each JCAHO data vendor that met CMS's 80
percent accuracy threshold. We also calculated the proportion of
hospitals that submitted their own quality data to CMS (identified
in the CDAC case tracking information by the hospital's Medicare
provider ID number) that met the accuracy threshold. Although this
analysis was limited to data vendors that were JCAHO-certified,
those vendors collectively submitted data to the clinical
warehouse for 78 to 79 percent of the hospitals we analyzed in the
two baseline quarters. Another 13 to 14 percent of hospitals
directly submitted their own data, and we do not have information
on how the remaining hospitals submitted data to the clinical
warehouse.
As was the case for our baseline accuracy assessment, our
assessment of the baseline completeness of the data submitted for
the APU program depended on the availability of suitable data from
CMS. Specifically, we considered using CMS's estimates of minimum
expected cases derived from Medicare claims data to arrive at
estimates of baseline completeness. The CMS officials we spoke
with noted that there were numerous reasons why the two data
sources-quality data submissions for the APU program and cases
billed to Medicare-would be expected to diverge, apart from any
underreporting of quality data by hospitals. The claims data were
limited to Medicare fee-for-service patients, whereas the
hospitals were obliged to submit quality data on all patients over
18 years of age (over 28 days old for most pneumonia measures),
including patients belonging to Medicare health maintenance
organizations. In addition, hospitals with large numbers of cases
could draw samples for the quality data, but would bill for all
patients. In making adjustments to its number of "expected cases"
to take account of sampling, CMS found that it could not reliably
identify the hospitals that should have followed the JCAHO
sampling rules, which would result in larger-sized samples.
Therefore, in calculating the number of cases it expected
hospitals to have submitted to the clinical warehouse, CMS applied
to all hospitals across the board the expectation of smaller
samples based on rules that pertained to hospitals not accredited
by JCAHO. Finally, the Medicare figure used for the comparison was
an average volume recorded over the previous 2 years, not the claims
filed for the quarter to which the quality data applied.
We found that these limitations precluded our using information
from CMS's Medicare claims analysis to assess the baseline
completeness of the data submitted by hospitals for the APU
program. CMS's comparison of hospital quality data submissions to
the clinical warehouse to its estimated number of "expected cases"
might have served CMS's purposes, by identifying at least some
instances of significant discrepancy between the number of cases
for which quality data were submitted and claims filed. However,
we determined that it would not provide a reasonable estimate of
the magnitude of data completeness for all hospitals. Because the
limitations in the CMS analysis would affect some hospitals more
than others, depending on how many non-Medicare patients a
hospital treated and whether it applied the JCAHO sampling rules,
we concluded that using information from this analysis to estimate
baseline data completeness would lead to results that were
distorted by the uneven impact of those factors on the information
produced for different hospitals.
To obtain information on other processes that could be used to
check data accuracy and completeness, we interviewed officials
from organizations that administer reporting systems that collect
clinical performance data. To select these organizations, we took
several steps. We reviewed reports on reporting systems, including
two issued by QIOs: IPRO's 2003 Review of Hospital Quality Reports
and Delmarva Foundation's The State-of-the-Art of Online Hospital
Public Reporting: A Review of Forty-Seven Websites.4 We solicited
input from the authors of each report and interviewed academic
researchers who have researched methods of assessing the
reliability of performance data. We used on-line resources to
obtain information on federal- and state-administered surveillance
efforts. Our selection criteria focused on systems that collected
clinical data, as opposed to administrative or claims data, and
that were mentioned most often in the reports and interviews cited
above. To ensure variation, we selected a mix of systems,
including those run by public and private organizations, those
receiving data from hospitals and those receiving data from other
types of providers, and those collecting data across a range of
medical conditions and those collecting data on specific medical
conditions. Using a structured protocol, we interviewed officials
from the following organizations: JCAHO, National Committee for
Quality Assurance, Society of Thoracic Surgeons, California Office
of Statewide Health Planning and Development, New York State
Department of Health, CMS (the units responsible for monitoring
nursing home care regarding the Data Assessment and Verification
Project (DAVE) contract), and the American College of Cardiology.
Each organization reviewed and confirmed the accuracy of the
information presented in appendix II.
Our analysis is based on the quality measures established for the
APU program and the information available as of September 2005 on
the accuracy and completeness of data submitted by hospitals for
that program. We did not evaluate the appropriateness of these
quality measures relative to others that could have been selected.
Nor did we examine the actual performance by hospitals on the
measures (e.g., how often they provide a particular service or
treatment). Our analysis of the baseline level of accuracy and
completeness of data submitted for the APU program is based on the
procedures developed by CMS to validate the data submitted. We
have not independently compared the data submitted by hospitals to
the original patient clinical records.
We conducted our work from November 2004 through January 2006 in
accordance with generally accepted government auditing standards.
1We downloaded various documents from the www.qnetexchange.org Web site
between December 21, 2004, and January 10, 2006.
2CMS included hospitals in Puerto Rico in its list of hospitals qualifying
for the full fiscal year 2005 update, but determined in conjunction with
the fiscal year 2006 payment update decision that Puerto Rico's hospitals
were exempt from the APU program requirements. Hospitals in Puerto Rico
receive prospective payments from Medicare, but under a different system
than other hospitals.
3The records we excluded were 536 surgery cases for the first quarter and
604 surgery cases for the second quarter, from hospitals providing data on
surgical infection prevention measures.
4IPRO, 2003 Review of Hospital Quality Reports for Health Care Consumers,
Purchasers and Providers (Lake Success, N.Y.: October 2003); Delmarva
Foundation and the Joint Commission on Accreditation of Healthcare
Organizations, The State-of-the-Art of Online Hospital Public Reporting: A
Review of Forty-Seven Websites (Easton, Md.: September 2004).
Appendix II: Other Reporting Systems
Table 3: Background Information on CMS and Other Reporting Systems
Centers for Medicare & Medicaid Services (CMS)
  Organization status: Public
  Data submitted by: Hospitals paid under the Inpatient Prospective
    Payment System
  Reporting requirement: (c)
  Data publicly reported: Yes
  Conditions covered: Cardiac - acute myocardial infarction (AMI), heart
    failure (HF); pneumonia
  Number of facilities reporting: 3,839(g)
  Approximate program duration: 2 years

American College of Cardiology (ACC)
  Organization status: Private, nonprofit
  Data submitted by: Facilities with at least one catheterization
    laboratory (includes in-hospital, freestanding, and/or mobile
    catheterization laboratories)
  Reporting requirement: Voluntary(d)
  Data publicly reported: No
  Conditions covered: Cardiac - diagnostic cardiac catheterization,
    percutaneous coronary intervention (PCI)
  Number of facilities reporting: 611(h)
  Approximate program duration: 7 years

California Office of Statewide Health Planning and Development
  Organization status: Public
  Data submitted by: Hospitals where cardiac surgeries are performed
  Reporting requirement: Mandatory
  Data publicly reported: Yes
  Conditions covered: Cardiac - coronary artery bypass grafting (CABG)
  Number of facilities reporting: 120
  Approximate program duration: 2 years(j)

Data Assessment and Verification Project (DAVE)(a)
  Organization status: Public
  Data submitted by: Nursing homes
  Reporting requirement: Mandatory
  Data publicly reported: Yes
  Conditions covered: Resident health care; resident health status
  Number of facilities reporting: 16,266(i)
  Approximate program duration: 1 year

Joint Commission on Accreditation of Healthcare Organizations (JCAHO)(b)
  Organization status: Private, nonprofit
  Data submitted by: JCAHO-accredited hospitals
  Reporting requirement: Mandatory(e)
  Data publicly reported: Yes
  Conditions covered: Cardiac - AMI, HF; pneumonia; pregnancy; surgical
    infection prevention
  Number of facilities reporting: ~3,350
  Approximate program duration: 3 years

National Committee for Quality Assurance (NCQA)
  Organization status: Private, nonprofit
  Data submitted by: Health plans
  Reporting requirement: Mandatory(e)
  Data publicly reported: Yes(f)
  Conditions covered: Preventive care; acute and chronic conditions
  Number of facilities reporting: 560
  Approximate program duration: 14 years

New York State Department of Health
  Organization status: Public
  Data submitted by: Hospitals that perform cardiac surgery and/or
    percutaneous coronary intervention (PCI)
  Reporting requirement: Mandatory
  Data publicly reported: Yes
  Conditions covered: Cardiac - CABG, PCI, and valve surgery
  Number of facilities reporting: 49
  Approximate program duration: 16 years

Society of Thoracic Surgeons (STS)
  Organization status: Private, nonprofit
  Data submitted by: Hospitals, surgeons
  Reporting requirement: Voluntary
  Data publicly reported: No
  Conditions covered: Cardiac - CABG, aortic and mitral valve surgery;
    general thoracic surgery; congenital heart surgery
  Number of facilities reporting: 700
  Approximate program duration: 16 years
Sources: CMS, ACC, California Office of Statewide Health Planning and
Development, JCAHO, NCQA, New York State Department of Health, and STS.
aDAVE is a CMS contract to assess the reliability of minimum data set
assessment data that are submitted by nursing homes. Minimum data set
assessments are a minimum data set of core elements to use in conducting
comprehensive assessments of patient conditions and care needs. These
assessments are collected for all residents in nursing homes that serve
Medicare and Medicaid beneficiaries.
bJCAHO provided information about its ORYX(R) initiative, which integrates
outcome and other performance measurement data into the accreditation
process.
cUnder Section 501(b) of the Medicare Prescription Drug, Improvement, and
Modernization Act of 2003, hospitals shall submit data for a set of
indicators established by the Department of Health and Human Services
(HHS) as of November 1, 2003, related to the quality of inpatient care.
Section 501 (b) also provides that any hospital that does not submit data
on the 10 quality measures specified by the Secretary of Health and Human
Services will have its annual payment update reduced by 0.4 percentage
points for each fiscal year from 2005 through 2007.
dSome states and insurance companies have started to require hospital
participation.
eData submission is mandatory to maintain accreditation.
fOnly audited data are publicly reported.
gThe number of hospitals that submitted data to receive their annual
payment update for fiscal year 2005.
hThe number of facilities enrolled in ACC's National Cardiovascular Data
Registry(R) as of July 13, 2005.
iThis number represents the number of nursing homes that submitted minimum
data set assessments between January 1, 2004, and December 31, 2004.
Accuracy estimates are made by selecting a random sample of records for
off-site and on-site medical record review.
jMandatory reporting of performance data began in 2003.
Table 4: Processes Used by CMS and Other Reporting Systems to Ensure Data
Accuracy
Columns, in order: Centers for Medicare & Medicaid Services (CMS);
American College of Cardiology (ACC); California Office of Statewide
Health Planning and Development (Calif.); Data Assessment and
Verification Project (DAVE)(a); Joint Commission on Accreditation of
Healthcare Organizations (JCAHO)(b); National Committee for Quality
Assurance (NCQA); New York State Department of Health (N.Y.); Society of
Thoracic Surgeons (STS). An "X" indicates that the system uses the
process.

Processes                           CMS   ACC   Calif. DAVE  JCAHO  NCQA  N.Y.  STS
Training                            X     X     X      X     X      X     X     X
Standardized measures or
  definitions                       X(c)  X     X      X     X(c)   X     X     X
Standardized processes for data
  collection                        X     X     X      X     X      X     X     X
Automated data edits when the
  data come in (identify missing
  or out-of-range data)             X     X     X      X(d)  X      X     X     X
Independent audits                  X     X     X      X     (e)    X     X     (f)
On-site audits                            X     X      X            X     X     (f)
Medical record review               X     X     X      X            X     X     (f)

Sample size
Patient records: CMS, 5 records; ACC, 10 percent random sample of medical
records, 50-record minimum(g); Calif., 70 records; DAVE, 13-16
records(h); JCAHO, not applicable; NCQA, 60 records; N.Y., 50 records(i);
STS, (j).
Facilities: CMS, all; ACC, 10 percent random sample of eligible sites(k);
Calif., outliers and near-outliers for mortality; DAVE, 69; JCAHO, not
applicable; NCQA, all; N.Y., 20 programs per year(l); STS, (m).
Sources: CMS, ACC, California Office of Statewide Health Planning and
Development, JCAHO, NCQA, New York State Department of Health, and STS.
aDAVE is a CMS contract to assess the reliability of minimum data set
assessment data that are submitted by nursing homes. Minimum data set
assessments are a minimum data set of core elements to use in conducting
comprehensive assessments of patient conditions and care needs. These
assessments are collected for all residents in nursing homes that serve
Medicare and Medicaid beneficiaries.
bJCAHO provided information about its ORYX(R) initiative, which integrates
outcome and other performance measurement data into the accreditation
process.
cCMS and JCAHO have worked to align their measures. A common set of
measures took effect for discharges occurring on or after January 1, 2005.
dData checks occur at the state level, for example, the state health
department, before the data are accessed by DAVE.
eJCAHO performs independent audits of data vendors.
fSTS is planning to incorporate an independent audit into its system. STS
officials plan on including an on-site audit and medical record review as
part of their audit system.
gThe 10 percent random sample of medical records is based on annual
percutaneous coronary intervention volume.
hThe number of cases and facilities identified are limited to on-site
audits. Additional cases are reviewed as part of the off-site medical
record review process.
iAuditors review 100 percent of records when significant discrepancies are
identified between the chart and what the hospital reported on specific
risk factors. In addition, medical record documentation is reviewed for
100 percent of cases with the risk factors "shock" or "stent thrombosis".
jSTS plans to review a minimum of 30 records as a part of its independent
auditing process.
kACC defines eligible sites as those facilities with a minimum of 50
records to be abstracted over a specified number of quarters.
lNew York State Department of Health typically reviews 20 programs per
year. In some instances that can mean percutaneous coronary intervention
and cardiac surgery at the same hospital, which would count as two
programs.
mSTS plans on visiting 24 facilities per year as a part of its independent
auditing process.
Table 5: Processes Used by CMS and Other Reporting Systems to Ensure Data
Completeness
Columns cover the same eight systems, in the same order, as table 4. An
"X" indicates that a system uses the process.

Processes
  Training: six of the eight systems
  Concurrent review(c): three systems
  Independent audits: five systems; JCAHO audits its data vendors(d), and
    STS is implementing an audit(e)
  On-site audits: five systems(e)
  Comparison to another data source: all eight systems

Data sources
  Billing: Medicare claims data (CMS); hospital billing records (ACC);
    ICD-9 codes(f) (California and one other system); Medicare claims
    data (DAVE); statewide planning and research cooperative system
    (SPARCS) data (New York); Medicare provider analysis and review
    (MEDPAR) data (STS)
  Other: patient medical records, catheterization laboratory logs, and
    physician notes (ACC); state death files (California); resident
    rosters (DAVE); pharmacy and laboratory records (NCQA)
  Frequency of data completeness review: CMS, twice(g); ACC, annually(h);
    California, annually; DAVE, once(i); JCAHO, annually(j); NCQA,
    annually; New York, annually; STS, once(k)
Sources: CMS, ACC, California Office of Statewide Health Planning and
Development, JCAHO, NCQA, New York State Department of Health, and STS.
aDAVE is a CMS contract to assess the reliability of minimum data set
assessment data submitted by nursing homes. A minimum data set
assessment is a core set of data elements used to conduct comprehensive
assessments of patient conditions and care needs. These assessments are
collected for all residents in nursing homes that serve Medicare and
Medicaid beneficiaries.
bJCAHO provided information about its ORYX(R) initiative, which integrates
outcome and other performance measurement data into the accreditation
process.
cUnder concurrent review, auditors assess data as they are being
collected.
dJCAHO performs independent audits of data vendors.
eSTS is planning to incorporate an independent audit into its system. STS
officials plan to include an on-site audit as part of their audit
process.
fThe International Classification of Diseases, Ninth Revision (ICD-9)
codes were designed to promote international comparability in the
collection, processing, classification, and presentation of mortality
statistics.
gCMS conducted two separate one-time studies that compared Medicare claims
data to submitted data.
hData completeness reviews are conducted annually for randomly selected
sites as part of the on-site audit process and quarterly for data
submissions.
iA one-time study was conducted; additional studies are planned.
jAt a minimum, data completeness reviews are conducted annually.
kA one-time study was conducted.
Appendix III: Data Tables on Hospital Accuracy Scores
Rural hospitals and smaller hospitals generally received accuracy scores
that differed minimally from those of urban hospitals and larger
hospitals. (See tables 6 and 7.) To the extent there are small differences
across categories, they do not show a consistent pattern based on
geographic location or size.
Table 6: Median Hospital Baseline Accuracy Scores, by Hospital
Characteristic, Quarter, and Measure Set
                      January-March 2004           April-June 2004
                      discharges                   discharges
                      Median         Median        Median         Median
                      accuracy       accuracy      accuracy       accuracy
                      score,         score,        score,         score,
Hospital              APU-measure    expanded-     APU-measure    expanded-
characteristic        set            measure set   set            measure set
Urban 92.7 90.0 94.2 91.5
Rural 93.0 91.1 93.8 91.7
< 50 beds 93.0 91.2 93.9 91.8
50-99 beds 93.2 91.1 94.2 92.2
100-199 beds 92.9 90.5 94.1 91.3
200-299 beds 93.0 90.1 94.2 91.7
300-399 beds 92.7 89.8 93.9 91.0
400-499 beds 92.0 89.5 93.8 91.1
500+ beds 92.0 89.0 94.1 91.0
All hospitals 92.9 90.4 94.1 91.6
Source: GAO analysis of CMS data.
Note: Calculation of accuracy scores for the expanded-measure set was
based on all the measures for which a hospital submitted data, which could
range from the APU measures alone to a maximum of 17 (the APU measures
plus as many as 7 additional measures).
Table 7: Proportion of Hospitals with Baseline Accuracy Scores Not Meeting
80 Percent Threshold, by Hospital Characteristic, Quarter, and Measure Set
                      January-March 2004           April-June 2004
                      discharges                   discharges
                      Percentage     Percentage    Percentage     Percentage
                      not meeting    not meeting   not meeting    not meeting
                      threshold,     threshold,    threshold,     threshold,
Hospital              APU-measure    expanded-     APU-measure    expanded-
characteristic        set            measure set   set            measure set
Urban 10.3 14.4 7.7 10.3
Rural 9.1 11.6 8.9 9.6
< 50 beds 9.4 12.8 10.3 12.0
50-99 beds 9.6 12.4 8.3 8.5
100-199 beds 8.7 12.3 8.6 9.8
200-299 beds 9.5 12.8 6.0 9.3
300-399 beds 11.8 15.0 6.5 8.6
400-499 beds 10.6 14.1 8.1 11.1
500+ beds 12.2 16.6 8.6 12.2
All hospitals 9.8 13.2 8.2 10.0
Source: GAO analysis of CMS data.
Note: Calculation of accuracy scores for the expanded-measure set was
based on all the measures for which a hospital submitted data, which could
range from the APU measures alone to a maximum of 17-the APU measures plus
as many as 7 additional measures. CMS deems hospitals that achieve an
accuracy score of 80 or better as having met its requirement to submit
accurate data.
Accuracy scores among hospitals whose data were submitted to CMS by
different JCAHO-certified vendors varied more, especially in the
percentage of the hospitals that failed to meet the 80 percent threshold.
(See table 8.) Collectively, these data vendors submitted data to the
clinical warehouse for approximately 78 to 79 percent of hospitals
affected by the APU program in the two baseline quarters, while another 13
to 14 percent of hospitals directly submitted their own data. For large
data vendors (serving more than 100 hospitals), medium vendors (serving
between 20 and 100 hospitals), and small vendors (serving fewer than 20
hospitals), there was marked variation within each size grouping in the
proportion of the vendors' hospitals that did not meet the accuracy
threshold. Such variation could reflect differences in the hospitals
served by different vendors as well as differences in the services
provided by those vendors.
Table 8: Percentage of Hospitals with Baseline Accuracy Scores Not Meeting
80 Percent Threshold, by JCAHO-Certified Vendor Grouped by Number of
Hospitals Served, Quarter, and Measure Set
Vendors,              Percentage not meeting        Percentage not meeting
grouped by            threshold for                 threshold for
number of             APU-measure set               expanded-measure set
hospitals             January-March   April-June    January-March   April-June
served                2004            2004          2004            2004
                      discharges      discharges    discharges      discharges
Large vendors
Vendor 1 2.6 2.6 3.9 2.6
Vendor 2 7.1 7.2 9.3 7.2
Vendor 3 7.7 9.5 14.0 11.3
Vendor 4 10.1 9.8 11.1 10.2
Vendor 5 11.1 8.4 14.4 10.4
Vendor 6 12.2 10.4 16.5 11.3
Vendor 7 12.4 9.0 12.4 13.6
Vendor 8 13.3 5.8 15.8 7.9
Medium
vendors
Vendor 9 2.4 4.5 2.4 2.3
Vendor 10 3.4 3.1 3.4 6.3
Vendor 11 4.2 6.8 6.9 6.8
Vendor 12 4.8 4.8 4.8 6.5
Vendor 13 4.9 2.8 4.9 2.8
Vendor 14 6.4 4.3 8.5 6.4
Vendor 15 7.1 6.0 7.1 7.5
Vendor 16 7.6 5.0 19.0 13.8
Vendor 17 7.9 2.6 9.2 2.6
Vendor 18 8.0 3.4 12.0 6.9
Vendor 19 8.8 2.9 26.5 8.8
Vendor 20 12.1 5.5 17.6 7.7
Vendor 21 13.5 5.6 13.5 8.3
Vendor 22 15.2 13.9 17.7 17.7
Vendor 23 18.4 10.0 28.6 12.0
Small vendors
Vendor 24 0.0 11.8 0.0 11.8
Vendor 25 0.0 7.1 0.0 7.1
Vendor 26 0.0 0.0 0.0 0.0
Vendor 27 0.0 16.7 0.0 16.7
Vendor 28 0.0 0.0 0.0 0.0
Vendor 29 0.0 0.0 0.0 0.0
Vendor 30 0.0 0.0 0.0 0.0
Vendor 31 8.3 0.0 16.7 0.0
Vendor 32 9.1 8.3 9.1 16.7
Vendor 33 9.1 0.0 27.3 0.0
Vendor 34 10.0 9.1 10.0 9.1
Vendor 35 11.1 11.1 11.1 11.1
Vendor 36 20.0 33.3 60.0 33.3
Vendor 37 33.3 0.0 33.3 0.0
Vendor 38 33.3 0.0 33.3 0.0
No vendor 10.2 12.5 11.6 13.2
Source: GAO analysis of CMS data.
Note: Large vendors served more than 100 hospitals, medium vendors served
20 to 100 hospitals, and small vendors served fewer than 20 hospitals.
Calculation of accuracy scores for the expanded-measure set was based on
all the measures for which a hospital submitted data, which could range
from the APU measures alone to a maximum of 17 (the APU measures plus as
many as 7 additional measures). CMS deems hospitals that achieve an
accuracy score of 80 or better as having met its requirement to submit
accurate data.
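As a rough illustration of the marked variation within each vendor size
grouping noted above, the short Python sketch below computes the minimum,
maximum, and spread of the percentages in the first data column of table
8 (January-March 2004 discharges, APU-measure set). This is a simple
recalculation from the published figures, not the method GAO used in its
analysis.

# Spread of "percentage not meeting threshold" within each vendor size
# group, using the January-March 2004 APU-measure-set column of table 8.
# A simple recalculation from published figures, not GAO's analysis code.

groups = {
    "Large vendors":  [2.6, 7.1, 7.7, 10.1, 11.1, 12.2, 12.4, 13.3],
    "Medium vendors": [2.4, 3.4, 4.2, 4.8, 4.9, 6.4, 7.1, 7.6, 7.9,
                       8.0, 8.8, 12.1, 13.5, 15.2, 18.4],
    "Small vendors":  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.3, 9.1,
                       9.1, 10.0, 11.1, 20.0, 33.3, 33.3],
}

for name, rates in groups.items():
    spread = max(rates) - min(rates)
    print(f"{name}: min {min(rates):.1f}, max {max(rates):.1f}, "
          f"spread {spread:.1f} percentage points")

Run as written, the sketch shows spreads of roughly 11 to 33 percentage
points within the size groupings, which is the variation the text
describes.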
Rank ordering hospitals by the breadth of the confidence intervals around
their accuracy scores, from the narrowest to the widest intervals, shows
the large variation that we found across both quarters and measure sets.
Hospitals with the narrowest confidence intervals, shown in table 9 as the
10th percentile, had a range of no more than 6 percentage points between
the lower and upper limits of their confidence interval. That meant that
their accuracy scores from one sample to the next were likely to vary by
no more than plus or minus 3 percentage points from the accuracy score
obtained in the sample drawn by CMS. By contrast, hospitals with the
widest confidence intervals, shown in table 9 as the 90th percentile,
exceeded 36 percentage points from the lower limit to the upper limit of
their confidence interval. The accuracy scores for these hospitals would
likely vary from one sample to the next by 18 percentage points or more,
up or down, relative to the accuracy score derived from the CMS sample.
For hospitals whose confidence interval included the 80 percent threshold,
it was statistically uncertain whether a different sample of cases would
have altered their result from passing the 80 percent threshold to
failing, or vice versa.
Table 9: Breadth of Confidence Intervals in Percentage Points Around the
Hospital Baseline Accuracy Scores at Selected Percentiles, by Measure Set
and Quarter
Hospital percentiles    APU-measure set               Expanded-measure set
from narrowest to       January-March   April-June    January-March   April-June
widest confidence       2004            2004          2004            2004
intervals               discharges      discharges    discharges      discharges
10th percentile 5.4 0.0 6.0 5.6
25th percentile 8.1 7.3 9.3 8.2
Median 14.0 11.8 14.6 13.0
75th percentile 24.2 21.5 23.6 21.3
90th percentile 40.3 41.0 37.9 36.8
Source: GAO analysis of CMS data.
Note: Confidence intervals are based on a 95 percent confidence level.
Calculation of accuracy scores and confidence intervals for the
expanded-measure set was based on all the measures for which a hospital
submitted data, which could range from the APU measures alone to a maximum
of 17 (the APU measures plus as many as 7 additional measures).
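The relationship between the size of an audited sample and the interval
breadths shown in table 9 can be illustrated with a simplified model in
which an accuracy score is treated as a binomial proportion of matching
data elements. The Python sketch below computes a normal-approximation
95 percent confidence interval under that assumption; the sample sizes
and the score of 85 are hypothetical, and this is not the estimation
method GAO used.

import math

def accuracy_ci(score, n, z=1.96):
    # Normal-approximation 95 percent confidence interval for an
    # accuracy score (in percent) based on n audited data elements.
    # Simplified binomial model; not GAO's estimation method.
    p = score / 100.0
    half_width = z * math.sqrt(p * (1.0 - p) / n) * 100.0
    return max(0.0, score - half_width), min(100.0, score + half_width)

# Hypothetical hospital with an accuracy score of 85 at several audit sizes.
for n in (25, 50, 100, 400):
    low, high = accuracy_ci(85.0, n)
    note = "  (straddles the 80 percent threshold)" if low < 80.0 < high else ""
    print(f"n = {n:3d}: 95% CI ({low:.1f}, {high:.1f}){note}")

Under this model, a score of 85 based on a small audit straddles the 80
percent threshold, while the same score based on several hundred audited
elements does not, which mirrors the pattern of statistical uncertainty
discussed below.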
One-fourth to one-third of hospitals had statistically uncertain results
because their confidence interval extended both above and below the 80
percent threshold. Some of these hospitals had accuracy scores of 80 or
above and some had scores of less than 80. Table 10 separates these
hospitals into (1) those that had accuracy scores equal to 80 or above and
were statistically uncertain and (2) those that had accuracy scores below
80 and were statistically uncertain. The table shows that most of the
statistical uncertainty involved hospitals that passed CMS's accuracy
threshold, but if a different sample of cases had been reabstracted by
CDAC, there was a substantial possibility that they would not have passed.
Table 10: For Hospitals with Confidence Intervals That Included the 80
Percent Threshold, Percentage of Total Hospitals with an Actual Baseline
Accuracy Score That Either Met or Failed to Meet the Threshold, by Measure
Set and Quarter
                       APU-measure set               Expanded-measure set
                       January-March   April-June    January-March   April-June
                       2004            2004          2004            2004
                       discharges      discharges    discharges      discharges
Percentage of hospitals whose actual
accuracy score equals 80 or better      23.9      19.2      28.0      24.0
Percentage of hospitals whose actual
accuracy score equals less than 80       8.3       7.0      11.3       8.7
Total                                   32.2      26.3      39.2      32.7
Source: GAO analysis of CMS data.
Note: Confidence intervals are based on a 95 percent confidence level.
Calculation of accuracy scores for the expanded-measure set was based on
all the measures for which a hospital submitted data, which could range
from the APU measures alone to a maximum of 17 (the APU measures plus as
many as 7 additional measures). CMS deems hospitals that achieve an
accuracy score of 80 or better as having met its requirement to submit
accurate data.
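The grouping in table 10 follows a simple rule: a hospital's result is
statistically uncertain when its confidence interval includes the 80
percent threshold, and uncertain hospitals are then split by whether
their point score passed. The self-contained Python sketch below
expresses that rule; the example scores and interval limits are
hypothetical, not actual APU audit results.

THRESHOLD = 80.0  # CMS's accuracy threshold

def classify(score, ci_low, ci_high):
    # Classify an audit result as in table 10: statistically uncertain
    # when the confidence interval includes the threshold, then split
    # by whether the point score passed. Illustrative rule only.
    if ci_low <= THRESHOLD <= ci_high:
        if score >= THRESHOLD:
            return "met threshold, statistically uncertain"
        return "failed threshold, statistically uncertain"
    return "met threshold" if score >= THRESHOLD else "failed threshold"

# Hypothetical hospitals: (accuracy score, CI lower limit, CI upper limit).
examples = [(92.0, 86.0, 98.0), (85.0, 71.0, 99.0),
            (78.0, 64.0, 92.0), (70.0, 58.0, 79.5)]
for score, low, high in examples:
    print(f"score {score:.1f}, CI ({low:.1f}, {high:.1f}): "
          f"{classify(score, low, high)}")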
Appendix IV: Comments from the Centers for Medicare & Medicaid Services
Appendix V: GAO Contact and Staff Acknowledgments
GAO Contact
Cynthia A. Bascetta (202) 512-7101 or [email protected]
Acknowledgments
In addition to the contact named above, Linda T. Kohn, Assistant Director;
Ba Lin; Nkeruka Okonmah; Eric A. Peterson; Roseanne Price; and Jessica C.
Smith made key contributions to this report.
(290403)
GAO's Mission
The Government Accountability Office, the audit, evaluation and
investigative arm of Congress, exists to support Congress in meeting its
constitutional responsibilities and to help improve the performance and
accountability of the federal government for the American people. GAO
examines the use of public funds; evaluates federal programs and policies;
and provides analyses, recommendations, and other assistance to help
Congress make informed oversight, policy, and funding decisions. GAO's
commitment to good government is reflected in its core values of
accountability, integrity, and reliability.
Obtaining Copies of GAO Reports and Testimony
The fastest and easiest way to obtain copies of GAO documents at no cost
is through GAO's Web site (www.gao.gov). Each weekday, GAO posts newly
released reports, testimony, and correspondence on its Web site. To have
GAO e-mail you a list of newly posted products every afternoon, go to
www.gao.gov and select "Subscribe to Updates."
Order by Mail or Phone
The first copy of each printed report is free. Additional copies are $2
each. A check or money order should be made out to the Superintendent of
Documents. GAO also accepts VISA and Mastercard. Orders for 100 or more
copies mailed to a single address are discounted 25 percent. Orders should
be sent to:
U.S. Government Accountability Office 441 G Street NW, Room LM Washington,
D.C. 20548
To order by Phone: Voice: (202) 512-6000 TDD: (202) 512-2537 Fax: (202)
512-6061
To Report Fraud, Waste, and Abuse in Federal Programs
Contact:
Web site: www.gao.gov/fraudnet/fraudnet.htm E-mail: [email protected]
Automated answering system: (800) 424-5454 or (202) 512-7470
Congressional Relations
Gloria Jarmon, Managing Director, [email protected] (202) 512-4400 U.S.
Government Accountability Office, 441 G Street NW, Room 7125 Washington,
D.C. 20548
Public Affairs
Paul Anderson, Managing Director, [email protected] (202) 512-4800 U.S.
Government Accountability Office, 441 G Street NW, Room 7149 Washington,
D.C. 20548
www.gao.gov/cgi-bin/getrpt?GAO-06-54.
To view the full product, including the scope and methodology, click on
the link above.
For more information, contact Cynthia A. Bascetta, (202) 512-7101 or
[email protected].
Highlights of GAO-06-54, a report to the Committee on Finance, U.S.
Senate
January 2006
HOSPITAL QUALITY DATA
CMS Needs More Rigorous Methods to Ensure Reliability of Publicly Released
Data
The Medicare Modernization Act of 2003 directed that hospitals lose 0.4
percent of their Medicare payment update if they do not submit clinical
data for both Medicare and non-Medicare patients needed to calculate
hospital performance on 10 quality measures. The Centers for Medicare &
Medicaid Services (CMS) instituted the Annual Payment Update (APU) program
to collect these data from hospitals and report their rates on the
measures on its Hospital Compare Web site.
For hospital quality data to be useful to patients and other users, they
need to be reliable, that is, accurate and complete. GAO was asked to (1)
describe the processes CMS uses to ensure the accuracy and completeness of
data submitted for the APU program, (2) analyze the results of CMS's audit
of the accuracy of data from the program's first two calendar quarters,
and (3) describe processes used by seven other organizations that assess
the accuracy and completeness of clinical performance data.
What GAO Recommends
GAO recommends that CMS take steps to improve its processes for ensuring
the accuracy and completeness of hospital quality data. In commenting on a
draft of this report, CMS agreed to implement steps to improve the quality
and completeness of the data.
CMS has contracted with an independent medical auditing firm to assess the
accuracy of the APU program data submitted by hospitals, but has no
ongoing process in place to assess the completeness of those data. For
each hospital, CMS's independent audit checks accuracy by comparing the
quality data the hospital abstracted from the medical records of a
sample of five patients per calendar quarter with the quality data the
contractor reabstracted from the same records. The data are deemed accurate
if there is 80 percent or greater agreement between these two sets of
results. CMS has established no ongoing process to check data
completeness. For the payment updates for fiscal years 2005 and 2006, CMS
compared the number of cases submitted by a hospital to the number of
Medicare claims that hospital submitted. However, these analyses did not
address non-Medicare patient records, and the approach that CMS took in
these analyses was not capable of detecting incomplete data for all
hospitals.
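The count comparison CMS performed can be pictured with a short sketch.
The Python fragment below compares each hospital's submitted case count
with its Medicare claims count and flags large shortfalls; the provider
numbers, counts, and 20 percent tolerance are hypothetical, and this is
an illustration of the general approach rather than CMS's actual
procedure. As the report notes, a check of this kind cannot detect
missing non-Medicare records.

SHORTFALL_TOLERANCE = 0.20  # hypothetical: flag >20 percent shortfalls

# Hypothetical provider numbers and counts, not actual APU data.
hospitals = {
    "010001": (480, 500),  # (cases submitted, Medicare claims filed)
    "010002": (350, 350),
    "010003": (120, 400),
}

for provider, (submitted, claims) in sorted(hospitals.items()):
    ratio = submitted / claims if claims else 0.0
    flag = ""
    if ratio < 1.0 - SHORTFALL_TOLERANCE:
        flag = "  <- possible incomplete submission"
    print(f"{provider}: {submitted} cases submitted vs. {claims} "
          f"Medicare claims ({ratio:.0%}){flag}")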
Although GAO found a high overall baseline level of accuracy when it
examined CMS's assessment of the data submitted for the first two quarters
of the APU program, the results are statistically uncertain for up to
one-third of hospitals, and a baseline level of data completeness cannot
be determined. The median accuracy score of 90 to 94 percent, depending
on the calendar quarter and measures used, was well above the 80 percent
accuracy threshold set by CMS, and about 90 percent of hospitals met or
exceeded that threshold for both the first and the second calendar
quarters of 2004. However, for approximately one-fourth to one-third of
all the hospitals that CMS assessed for accuracy, the statistical margin
of error for their accuracy score included both passing and failing
accuracy levels. Consequently, for these hospitals, the small number of
cases that CMS examined was not sufficient to establish with statistical
certainty whether they met the accuracy threshold set by CMS. With respect
to completeness of data, CMS did not assess the extent to which all
hospitals submitted data on all eligible patients, or a representative
sample thereof, for the two baseline quarters. As a result, there were no
data from which to derive an assessment of the baseline level of
completeness of the quality data that hospitals submitted for the APU
program.
Other reporting systems that collect clinical performance data have
adopted a range of activities to ensure data accuracy and completeness,
which include some methods employed by all, such as checking the data
electronically to identify missing data. Officials from some of the other
reporting systems and an expert in the field stressed the importance of
including an independent audit in the methods used by organizations to
check data accuracy and completeness. Most of the other reporting systems
incorporate into their processes three methods that CMS does not use in
its independent audit. Specifically, most include an on-site visit in their
independent audit, focus their audits on a selected number of facilities,
and review a minimum of 50 patient medical records during the audit.
*** End of document. ***