Task 1: CMS Medicare data content

The linked Medicare data consist of several types of files: Denominator, Summary, Claims, and Other files.  For more information regarding the structure of CMS data files, refer to Course 1, Module 1.  

Diagram: NHANES-Medicare Linked Data Structure



File descriptions and contents

The descriptions below explain what each file contains, the time period it covers, and what each record represents. Links to the file data dictionaries are located in the Resources section.

The Centers for Medicare & Medicaid Services (CMS) linked files are organized at the person level, the claim level, or the stay level (the stay level applies to the inpatient and skilled nursing facility [SNF] Medicare Provider Analysis and Review [MedPAR] files). A claim is a request for reimbursement that a provider submits to an insurer for services rendered; it includes a description of the services and diagnoses. A ‘stay’ represents all services provided to a beneficiary from the date of admission to a facility through the discharge date. Some researchers are interested in totaling charges or payments at various levels of aggregation, such as single stays or episodes of care that could continue over many months or an entire year.
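
As a rough illustration of moving between these levels, the sketch below totals payments from claim-level records up to the stay level and the person level. The records, the `stay_id` and `payment` field names, and the `total_by` helper are hypothetical and are not actual CMS variable names; only SEQN comes from the linked files.

```python
from collections import defaultdict

# Hypothetical claim-level records; field names other than SEQN are
# illustrative only and are not actual CMS variable names.
claims = [
    {"SEQN": 101, "stay_id": "A", "payment": 1200.0},
    {"SEQN": 101, "stay_id": "A", "payment": 300.0},
    {"SEQN": 101, "stay_id": "B", "payment": 450.0},
    {"SEQN": 202, "stay_id": "C", "payment": 80.0},
]


def total_by(records, *keys):
    """Sum payments over records grouped by the given key fields."""
    totals = defaultdict(float)
    for rec in records:
        group = tuple(rec[k] for k in keys)
        totals[group] += rec["payment"]
    return dict(totals)


# Stay-level totals: one entry per (person, stay).
stay_totals = total_by(claims, "SEQN", "stay_id")

# Person-level totals: one entry per person, across all stays.
person_totals = total_by(claims, "SEQN")
```

The same grouping idea extends to other levels of aggregation, such as an episode of care or a calendar year, by changing the key fields.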

Denominator Files

Denominator File

The Denominator File provides data on all Medicare beneficiaries enrolled and/or entitled to Medicare benefits in a given calendar year. CMS updates the Denominator File with information collected through the first three months of the following calendar year. This file is a census of the Medicare population. The geographical scope is national. 

Information contained in this file includes date of death, monthly Medicare entitlement indicators, reasons for entitlement, state buy-in indicators, and monthly information on the enrollment status of linked Medicare beneficiaries including third party payer information and Medicare Part C plan enrollment. Medicare Part C plans are also referred to as Medicare Advantage (MA) and include Health Maintenance Organizations (HMOs), Preferred Provider Organizations (PPOs), Private Fee-for-Service (PFFS) Plans, Special Needs Plans, and Medicare Medical Savings Account Plans.

Each Medicare beneficiary is represented on the file by the NHANES public identifier SEQN. The Denominator File can and should be combined with other Medicare data sources (e.g., Standard Analytic File/Claims data) using SEQN. This public identifier may also be used to combine files so that you can examine patterns of enrollment and service usage over multiple years. 
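
A minimal sketch of combining the Denominator File with claims data by SEQN is shown below. It assumes small in-memory records with hypothetical field names (`months_entitled`, `payment`), and the `link_by_seqn` helper is illustrative only. Every beneficiary on the Denominator File is kept, including those with no claims, so that the denominator population is not lost in the merge.

```python
# Hypothetical person-level Denominator records and claim-level records.
# Only SEQN is a real identifier; the other field names are illustrative.
denominator = [
    {"SEQN": 101, "months_entitled": 12},
    {"SEQN": 202, "months_entitled": 6},
    {"SEQN": 303, "months_entitled": 12},
]
claims = [
    {"SEQN": 101, "payment": 250.0},
    {"SEQN": 101, "payment": 75.0},
    {"SEQN": 202, "payment": 40.0},
]


def link_by_seqn(denominator_rows, claim_rows):
    """Attach claim counts and payment totals to each Denominator record."""
    # Index the claims by SEQN so each person is looked up once.
    claims_by_person = {}
    for claim in claim_rows:
        claims_by_person.setdefault(claim["SEQN"], []).append(claim)

    linked = []
    for person in denominator_rows:
        # Beneficiaries without claims get an empty list, not dropped.
        person_claims = claims_by_person.get(person["SEQN"], [])
        linked.append({
            **person,
            "n_claims": len(person_claims),
            "total_payment": sum(c["payment"] for c in person_claims),
        })
    return linked


linked = link_by_seqn(denominator, claims)
```

Here SEQN 303 remains in the result with zero claims, which is the behavior you would usually want when the Denominator File defines the study population.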

Information

You should always include a request for the Denominator File as part of the data request.

Part D Denominator File

The Medicare Part D Denominator File contains demographic and enrollment data for each beneficiary enrolled in Medicare during the calendar year, regardless of whether the beneficiary elected to obtain Part D coverage. In addition to the variables available on the standard Medicare Denominator File, the Part D Denominator File contains monthly indicators for Medicare Advantage Prescription Drug (MA-PD) plan and stand-alone prescription drug plan (PDP) enrollment, Low Income Subsidy (LIS) enrollment, Retiree Drug Subsidy, State Reported Dual Eligibility Status, and an indicator for Other Creditable Drug Coverage.

Information

If you intend to analyze Part D event data, you should always include a request for the Part D Denominator File as part of the data request. Many Medicare beneficiaries do not purchase the Part D benefit; therefore, if you use the Part D Event data files, you should carefully evaluate the population to which you want to make inferences.
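
One way to define the analytic population is to use the monthly enrollment indicators to select beneficiaries who actually had Part D coverage. The sketch below assumes a simplified, hypothetical layout in which each record carries a list of 12 monthly true/false flags (`part_d_months`); the real file uses its own variable names and codes, so consult the data dictionary before adapting it.

```python
# Hypothetical layout: one record per beneficiary with 12 monthly flags,
# True when the beneficiary was enrolled in a Part D plan that month.
part_d_denominator = [
    {"SEQN": 101, "part_d_months": [True] * 12},
    {"SEQN": 202, "part_d_months": [False] * 6 + [True] * 6},
    {"SEQN": 303, "part_d_months": [False] * 12},
]


def enrolled_months(record):
    """Count the months with Part D enrollment for one beneficiary."""
    return sum(record["part_d_months"])


def select_enrolled(records, min_months=12):
    """Return the SEQNs enrolled in Part D for at least min_months months."""
    return {r["SEQN"] for r in records if enrolled_months(r) >= min_months}


# Beneficiaries with Part D coverage for the full calendar year.
full_year = select_enrolled(part_d_denominator)

# Beneficiaries with Part D coverage in at least one month.
any_enrollment = select_enrolled(part_d_denominator, min_months=1)
```

Whether to require full-year or partial-year enrollment is an analytic decision; the threshold here is only a placeholder.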

Summary Files

Chronic Condition Summary File

The Chronic Condition Summary File contains summarized information regarding the presence of 21 chronic conditions for all beneficiaries.  This file includes three types of flags for each of 21 chronic conditions that capture the time frame of the condition:

  1. Yearly – which indicates whether each of the 21 chronic condition definitions was met during the time period ending December 31 of the year,
  2. Mid-year – which indicates whether each definition was met during the time period ending on June 30 of the year (this may be useful if you are using a July 1 time frame), and
  3. Ever – which indicates the date the beneficiary was first identified as having met the specifications for the condition (note: 1999 is the earliest year that will appear in this field).

The chronic condition fields are defined by looking at a pattern of medical care utilization, as determined by Medicare fee-for-service (FFS) claims. The conditions range from Alzheimer’s disease and cancer to chronic kidney disease, diabetes, and stroke. For a complete list of the 21 chronic conditions and more information about the creation of the Chronic Condition Summary File, see the Resources section.

Summary Medicare Enrollment and Claims File

The Summary Medicare Enrollment and Claims (SMEC) files were created by NCHS to assist you in analyzing Medicare cost and claims data from multiple Medicare service files. SMEC files are available for the NHANES-CMS Medicare linked data for each year of Medicare data. The SMEC files contain data on the beneficiary’s reason for Medicare entitlement, total number of months of Medicare entitlement, Medicare Part C enrollment, and summarized Medicare service charges and reimbursement amounts. These summarized (or summary) variables are modeled after the Medicare Current Beneficiary Survey (MCBS) cost and use files.

Home Health Agency Claim File

The Home Health Agency (HHA) file contains claims for Part A home health services; information includes the number of visits, type of visit (e.g., physical therapy), diagnosis (International Statistical Classification of Diseases and Related Health Problems [ICD]-9 diagnosis), date of visits, reimbursement amount, and beneficiary demographic information.

Hospice Claim File

The Hospice File contains claims data submitted by hospice providers. Data include the level of hospice care received (e.g., routine home care, inpatient respite care), terminal diagnosis (ICD-9 diagnosis), the dates of service, reimbursement amount, and beneficiary demographic information.  A hospice claim may contain a series of visits, all of which are part of an episode of care. Note that an episode of care may span multiple claims.

Outpatient Claim File

The Outpatient File contains Medicare Part B claims from institutional outpatient providers. Information contained in this file includes diagnosis and procedure (ICD-9 diagnosis, ICD-9 procedure code, and Healthcare Common Procedure Coding System [HCPCS] codes), dates of service, reimbursement amount, revenue center codes, and beneficiary demographic information.

Medicare Provider Analysis and Review Files

The Medicare Provider Analysis and Review (MedPAR) files were specifically developed by CMS for researchers interested in studying inpatient hospital or SNF care. CMS has combined the claims information relevant to a stay and created a single, fixed-length record for each hospital stay or, in a separate file, each SNF stay.

Each MedPAR record represents one stay. A record is generally built from a single claim, but it may combine multiple claims depending on the length of the beneficiary's stay and the volume of services (e.g., intensive care stays) used throughout the stay.

Information contained in this file includes the diagnosis (ICD-9 diagnosis), procedure (ICD-9 procedure code), dates of service, reimbursement amount, hospital provider or SNF provider, and beneficiary demographic information. The MedPAR file does not contain HCPCS procedure codes.

MedPAR Inpatient Hospital Stay

The MedPAR Inpatient Hospital Stay file contains data from claims for services provided to Medicare beneficiaries admitted to Medicare-certified hospitals. Each MedPAR record represents a beneficiary stay in an inpatient hospital that has ended in discharge, and may include one claim or multiple claims. Approximately 95% of inpatient MedPAR records involve a single claim.

Only inpatient records with discharge dates during the calendar year are included in MedPAR. The year to which a stay belongs is based on the discharge date.

MedPAR Skilled Nursing Facility Stay

The MedPAR Skilled Nursing Facility (SNF) File contains summarized data from claims for services provided to Medicare beneficiaries admitted to Medicare-certified SNFs. Each MedPAR record represents a beneficiary stay in a SNF and may include one claim or multiple claims. The beneficiary may be recorded as 'still a patient' because complete discharge data are not always received. Approximately 50% of SNF MedPAR records involve a single claim. The admission date is used for inclusion in the yearly file. Patient stays in SNFs tend to be longer than in other types of facilities.
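
The two MedPAR files assign a stay to a calendar year differently: inpatient stays by discharge date and SNF stays by admission date. The sketch below illustrates that rule for a stay that crosses a year boundary. The function name `medpar_file_year` and the `"inpatient"` and `"snf"` labels are hypothetical and chosen only for this example.

```python
from datetime import date


def medpar_file_year(file_type, admission_date, discharge_date=None):
    """Return the calendar year of the MedPAR file a stay belongs to.

    Inpatient stays are assigned by discharge date; SNF stays are
    assigned by admission date. The file_type labels are illustrative.
    """
    if file_type == "inpatient":
        if discharge_date is None:
            # Only inpatient records with a discharge date are included.
            raise ValueError("inpatient stays require a discharge date")
        return discharge_date.year
    if file_type == "snf":
        # A SNF stay may have no discharge date ('still a patient').
        return admission_date.year
    raise ValueError(f"unknown file type: {file_type!r}")


admitted = date(2005, 12, 28)
discharged = date(2006, 1, 3)

inpatient_year = medpar_file_year("inpatient", admitted, discharged)
snf_year = medpar_file_year("snf", admitted, discharged)
```

For the same admission and discharge dates, the inpatient stay falls in the 2006 file while the SNF stay falls in the 2005 file, which matters when you combine multiple years of data.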

Claim Files

Carrier Claim File

The Carrier file (formerly the Physician/Supplier Part B file) contains claims data submitted by non-institutional providers.

The data files contain claim information from Medicare-eligible physicians’ and surgeons’ services and procedures, certain home health services, ambulance services, laboratory services, tests (such as imaging), certain chiropractic services, and Part B.

The Carrier file has a high volume of claims, which are formatted as variable-length records. All records consist of a fixed-length portion for the main or “base” portion of the claim (e.g., patient and provider characteristics and diagnosis). There is also a variable-length portion of the record, whose length depends on the number of line items or procedures billed during the visit.
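
The base-plus-line-items layout can be pictured as one claim object holding a list of line items of varying length. The sketch below is an assumption-laden illustration, not the actual record layout: the class names, the field names (`diagnosis`, `procedure_code`, `payment`), and the placeholder code values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineItem:
    """One billed procedure or service (variable portion of the claim)."""
    procedure_code: str  # placeholder value, not a real code
    payment: float


@dataclass
class CarrierClaim:
    """Base (fixed) portion of a claim plus its list of line items."""
    seqn: int
    diagnosis: str  # placeholder value, not a real code
    line_items: List[LineItem] = field(default_factory=list)

    def total_payment(self) -> float:
        """Sum the payments over all line items on the claim."""
        return sum(item.payment for item in self.line_items)


def flatten(claim):
    """Return one row per line item, repeating the base fields."""
    return [
        {
            "SEQN": claim.seqn,
            "diagnosis": claim.diagnosis,
            "procedure_code": item.procedure_code,
            "payment": item.payment,
        }
        for item in claim.line_items
    ]


claim = CarrierClaim(101, "DX1", [LineItem("P1", 50.0), LineItem("P2", 25.0)])
rows = flatten(claim)
```

Flattening to one row per line item is convenient for procedure-level analysis, while the claim-level object is the natural unit when you need one record per visit.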

Durable Medical Equipment Claim File

The Durable Medical Equipment (DME) Claim File contains claims data submitted by DME regional carriers to CMS. Information includes diagnosis, service type codes, dates of service, reimbursement amount, and beneficiary demographic information.

Part D Drug Event File

Prescription drug event (PDE) data files contain claim-like information on prescription drugs submitted by pharmacies to the Part D health plans for beneficiaries enrolled in Medicare Part D. PDE data begin with the inception of the Medicare Part D benefit in 2006. The PDE data are created from point-of-service transactional data at the time a prescription is filled, so they reflect filled prescriptions only: they are not prescribing data, and prescriptions that were ordered but not filled do not appear in this database. The PDE data are considered “final action” because they represent the final status of a drug claim at the time of CMS’ payment reconciliation process (i.e., the records account for post-transaction adjustments).

Not all Medicare-enrolled beneficiaries elect to purchase Part D coverage. PDE data are not submitted by plans that receive the Retiree Drug Subsidy or by other types of plans that are considered to be Part D creditable coverage (e.g., VA, TRICARE, FEHBP). There can be multiple records per person. These files include information such as the specific drug, identified by its National Drug Code (NDC), as well as quantity, cost, and prescription drug benefit information.

Other Files

End Stage Renal Disease Files

The End Stage Renal Disease (ESRD) data files can be used by researchers interested in conducting analysis specifically related to patients with ESRD.

Variable lists

A full list of the variables contained in each file is available on the NCHS website and can be found in the Resources section.

Information

Although the Denominator File contains gender, race, and date of birth variables, NCHS recommends that you use gender, race-ethnicity and date of birth provided in NHANES.

Common variables

In addition to the NHANES participant public identifier (variable name: SEQN), there are several other common types of variables which appear in the different Medicare data files. Examples include diagnoses associated with the visit, dates of service, provider type, charges and costs, and reasons for claim nonpayment. There is also some demographic information on all files.

Warning

The Medicare data come from administrative records. Because the data are collected for administrative purposes rather than for research, there may be some inconsistencies.



