Data

Hospital Cost Report (HCRIS) Data 1996-2021

See GitHub page for the latest information, the source data, and the processing code.

These files contain processed records from the CMS hospital cost report data, known as HCRIS (Healthcare Cost Report Information System). The output includes all cost reports from 1996 through 2021. For more information on this data, see the NBER site.

Report level data, 1996-2021 – In this file, each record is a hospital cost report. Hospitals can file multiple cost reports in the same year, covering different periods. The coverage periods will also depend on the hospital’s fiscal year, with some hospitals’ fiscal years beginning earlier in the year and others later in the year. Note that the 2021 data year is incomplete.

Synthetic calendar year by hospital level data, 1997-2020 – This file constructs synthetic calendar year data for the hospitals. For variables that are flows, it takes weighted sums over the cost reports, with the weights equal to the fraction of the cost report that fell into the calendar year (the weights are not normalized). For bed counts and the cost-to-charge ratio, it takes a weighted average using the same formula for the weights, but in this case the weights are normalized to sum to 1. Note that much of the 2020 data is incomplete.
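The weighting described above can be illustrated with a minimal Python sketch. It is not the processing code from the GitHub page. It assumes the weights are day-based fractions with both endpoints of the coverage period counted, and the field names (start, end, costs, beds) are hypothetical.

```python
from datetime import date

def overlap_fraction(start, end, year):
    """Fraction of a cost report's days (endpoints inclusive) that fall in `year`.
    Day-based counting is an assumption made for this illustration."""
    report_days = (end - start).days + 1
    lo = max(start, date(year, 1, 1))
    hi = min(end, date(year, 12, 31))
    overlap_days = max((hi - lo).days + 1, 0)
    return overlap_days / report_days

def synthetic_year(reports, year):
    """Combine one hospital's cost reports into a synthetic calendar year.
    Flows use a weighted sum (weights not normalized); stocks use a
    weighted average (weights normalized to sum to 1)."""
    weights = [overlap_fraction(r["start"], r["end"], year) for r in reports]
    total_weight = sum(weights)
    costs = sum(w * r["costs"] for w, r in zip(weights, reports))
    if total_weight > 0:
        beds = sum(w * r["beds"] for w, r in zip(weights, reports)) / total_weight
    else:
        beds = None  # no report overlaps this calendar year
    return {"costs": costs, "beds": beds, "total_weight": total_weight}

# Hypothetical hospital with a July-June fiscal year (made-up values).
reports = [
    {"start": date(2009, 7, 1), "end": date(2010, 6, 30), "costs": 365.0, "beds": 100},
    {"start": date(2010, 7, 1), "end": date(2011, 6, 30), "costs": 730.0, "beds": 200},
]
result = synthetic_year(reports, 2010)
```

In this example the first report contributes 181 of its 365 days to 2010 and the second contributes 184 of its 365 days, so the weights happen to sum to 1. When a hospital has a gap in reporting, the flow weights sum to less than 1, and the synthetic flow then understates the full-year value.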

There are a number of caveats to using HCRIS data and these files, which I review on the GitHub page.

Medicare Hospital Payment Rates for COVID-19 Inpatients

Download hospital payment rate data (Stata 16, Stata 12, and CSV formats)

This dataset shows the hospital payment rates for typical COVID-19 inpatients covered by Medicare. It was derived from Medicare pricer software. Each record shows the payment rate (and payment components) for a given hospital and diagnosis-related group (DRG).

The pricer was initialized with fake bills for patients discharged on May 5, 2020 with a COVID-19 ICD-10 diagnosis code (U07.1), which triggers the CARES Act operating payment bump of 20%. Direct graduate medical education (DGME) pass-through payments are not included. The frame consists of Acute Inpatient Prospective Payment System (AIPPS) facilities, defined as any hospital in the FY2020 IPPS Impact file that could be successfully priced by the pricer, excluding Maryland hospitals. Critical access hospitals, which are paid on a different basis, are not included. The DRGs priced are those expected for most inpatients with a COVID-19 principal diagnosis. For more information on the DRG definitions, see the MS-DRG definitions manual. For more details on the Medicare hospital payment methodology, including the formulas that combine the payment components into the total payment, see the Medicare Learning Network guide to the IPPS.

Hospital Data from CMS Provider of Services File 1993-2017

See GitHub page for the latest information, the source data, and the processing code.

These files are processed copies of the public CMS Provider of Services data. The source data is available from the NBER. This version includes records for all inpatient hospitals (including Maryland waiver hospitals and critical access facilities) from 1993-2017. It provides information on hospital name, address, location, teaching status (including residents and residency programs offered), subtype (short-term, long-term, critical access, etc.), type of control (non-profit, for-profit, government), and size (number of beds).

CMS Hospital Data (from Provider of Services), 1993-2017 – The main dataset, pos.dta, has a record for each hospital provider number in each year it appeared in the source data. Two additional files have one record per hospital provider number: one has only the record for the first year it appeared in the 1993-2017 window (pos_firstyear.dta), and the other has the record for the last year it appeared (pos_lastyear.dta). Includes data in Stata v15, Stata v12, and CSV formats, plus full variable descriptions for those not using Stata.
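As a rough illustration of how the two one-record-per-provider files relate to the main file, the following Python sketch picks the first-year and last-year record for each provider number. The field names (provider, year, beds) are hypothetical stand-ins rather than the actual variable names, and this is not the processing code from the GitHub page.

```python
def first_and_last(records):
    """Return two dicts keyed by provider number: the record from the
    earliest year and the record from the latest year for each provider."""
    first, last = {}, {}
    for rec in records:
        key = rec["provider"]
        if key not in first or rec["year"] < first[key]["year"]:
            first[key] = rec
        if key not in last or rec["year"] > last[key]["year"]:
            last[key] = rec
    return first, last

# Made-up provider-year records for illustration only.
records = [
    {"provider": "010001", "year": 1995, "beds": 120},
    {"provider": "010001", "year": 1993, "beds": 110},
    {"provider": "010001", "year": 2017, "beds": 150},
    {"provider": "010002", "year": 2000, "beds": 40},
]
first, last = first_and_last(records)
```

A provider that appears in only one year has the same record in both outputs.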

Notes: As in the source data from CMS, hospitals that close will persist in the data but will have a termination code and termination date. Hospitals that merge, change subtype, or change type of control will usually get a new provider number, which means their old provider number will terminate (and remain in the data). Keep this in mind when attempting to follow hospitals longitudinally. I have heard, but can’t confirm, that CMS rarely updates this data, so the hospital characteristics in it may be quite out of date.

For more information on the quirks of this data, please see the GitHub page.

Hospital Compare Data 2004-2016

See GitHub page for the latest information, the source data, and the processing code.

The files below include CMS Hospital Compare data for the years 2004-2016. I originally processed this data focusing on the 2006-2008 values for the HCAHPS and the AMI, CHF, and pneumonia mortality/readmission and process of care (now called timely and effective care) scores. The tables that were added more recently (e.g. hospital-associated infections, structural measures) are not included. If you would like to use this data in your own research, I urge you to test it carefully.

Process of Care Scores, 2004-2016 – Shares of patients receiving evidence-based treatments for AMI, CHF, pneumonia, surgical care, and outpatient care. All-payer. (Includes data in Stata v15, Stata v12, and CSV formats, plus full variable descriptions for those not using Stata. Also includes full names of process measures.)

HCAHPS, 2007-2016 – Average scores for the aggregated questions in the HCAHPS patient satisfaction survey. All-payer. (Includes data in Stata v15, Stata v12, and CSV formats, plus full variable descriptions for those not using Stata. Also includes a listing of the numeric values I assigned to question responses.)

Mortality (2007-2016) and Readmission (2008-2016) – Estimates of AMI, CHF, and pneumonia mortality and readmission rates. Medicare FFS patients only. (Includes data in Stata v15, Stata v12, and CSV formats, plus full variable descriptions for those not using Stata.)