

This JAMA Guide to Statistics and Methods describes the Medicare claims data available from the Centers for Medicare and Medicaid Services and its use in comparative effectiveness research and health policy analysis.

The Centers for Medicare and Medicaid Services (CMS) administers Medicare, the primary US health insurance program for people aged 65 years and older and people who qualify for Social Security Administration disability benefits (Box 5). Medicare includes Part A, which is hospital insurance; Part B, medical insurance; Part C, Medicare Advantage (private health insurance approved by the CMS and paid on a per-capita basis); and Part D, prescription drug coverage. The CMS maintains several data files and makes them available for purchase. Because Medicare Advantage plans are administered by private insurers, their claims are unavailable. However, national fee-for-service Medicare files are available and represent approximately 70% of beneficiaries.

BOX 5 Attributes of Medicare Claim Data

  1. Medicare data are an excellent national representation of a large proportion of the older adult population.

  2. Although the data are a cost-effective way of evaluating a large population, securing independent funding to purchase them is highly recommended.

  3. The data sets available from the Centers for Medicare and Medicaid Services are suitable for linkage to several existing data sets (eg, American Heart Association, US Census, and others).

  4. Data can be tracked longitudinally across episodes of care, making this a uniquely positioned data set to study long-term outcomes in surgical patients.

  5. Several advanced statistical methods can increase the robustness of inferences made using these data; inclusion of experienced methodologists in research is highly recommended.

Medicare data represent claims submitted to the CMS for reimbursement of services rendered. The Medicare data set has very little missing data because accurate claims are necessary for hospital and physician payments.


Several features make this data set a useful research tool. First, specific demographic data are included (eg, age, birthdate, sex, race/ethnicity, and place of residence). Second, these data can be linked to other CMS data sets on health care utilization, insurance enrollment, and clinician characteristics. Third, the data cover nearly 70% of adults aged 65 years and older, making fee-for-service Medicare data a rich source of utilization and outcomes data that is large enough to support subgroup analyses without a substantial loss of statistical power. Fourth, the data can be linked to non-CMS data, such as the US Census, cancer registries (eg, the Surveillance, Epidemiology, and End Results Program-Medicare linkage), other government insurance programs (eg, Medicaid), the Social Security Death Index, and clinician information (ie, American Hospital Association data). Fifth, patients can be tracked across episodes of care, which permits longitudinal evaluations of outcomes and health care utilization. Finally, Medicare data files are a cost-effective way to assess a large patient population across multiple health care settings.
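As a rough illustration of what tracking patients across episodes of care can look like in practice, the following minimal Python sketch groups synthetic claim records by beneficiary, orders them by admission date, and flags beneficiaries with an admission within 30 days of a prior discharge. The records, the field names (`bene_id`, `admit`, `discharge`), and the helper functions are hypothetical and chosen only for illustration; they are not actual CMS variable names or an established method from this guide.

```python
from collections import defaultdict
from datetime import date

# Synthetic, made-up claim records. Field names are illustrative
# assumptions, not actual CMS variable names.
claims = [
    {"bene_id": "A", "admit": date(2020, 1, 5), "discharge": date(2020, 1, 9)},
    {"bene_id": "B", "admit": date(2020, 2, 1), "discharge": date(2020, 2, 3)},
    {"bene_id": "A", "admit": date(2020, 1, 25), "discharge": date(2020, 1, 28)},
    {"bene_id": "A", "admit": date(2020, 6, 1), "discharge": date(2020, 6, 4)},
]


def episodes_by_beneficiary(records):
    """Group claims by beneficiary and sort each group by admission date."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["bene_id"]].append(rec)
    for recs in grouped.values():
        recs.sort(key=lambda r: r["admit"])
    return dict(grouped)


def readmitted_within(records, days=30):
    """Return the set of beneficiary IDs with any admission that begins
    no more than `days` days after the preceding discharge."""
    flagged = set()
    for bene_id, recs in episodes_by_beneficiary(records).items():
        # Compare each episode with the one that follows it in time.
        for prev, nxt in zip(recs, recs[1:]):
            gap = (nxt["admit"] - prev["discharge"]).days
            if 0 <= gap <= days:
                flagged.add(bene_id)
    return flagged


timelines = episodes_by_beneficiary(claims)
flagged = readmitted_within(claims)
```

In this toy data, beneficiary A has a second admission 16 days after the first discharge and is flagged, whereas beneficiary B has a single episode. A real analysis would also need to handle transfers, enrollment gaps, and deaths, which this sketch ignores.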

However, there are important limitations when using Medicare data for ...
