Providing the Public Use Data
Researchers are responsible for providing the NCHS public use dataset as well as any non-NCHS data. Compiling the public use dataset gives you the opportunity to become familiar with the data and expedites the data creation process.
- Exception: For NHDS, NAMCS, NHAMCS, and other DHHS data hosted by the RDC, you do not need to provide a public dataset. Your RDC Analyst will provide an extract from the restricted files that includes all of the variables specified in your proposal.
- Non-NCHS data include any data collected by the researcher, another government agency, or a private institution that the researcher wishes to merge with NCHS data, often using geographic codes. Examples from policy research have included air pollution data, proximity of fast food restaurants, and location of health care providers.
- Create a public data set that includes only the variables specified in your proposal.
- Original NCHS variables must retain the name they are given in the public data set. If you would like to rename these variables, include the original variable name in the variable description.
- If you choose to create derived variables prior to working with the data onsite or via remote access, make sure these variables are clearly defined. The description of each derived variable should include the original variable name(s) from which it was derived and explain any arithmetic manipulation. Please save the code you used to create these variables, as your RDC Analyst may request it.
- If you are also sending another source of data, for example Census data, that data set should include only the variables specified in your proposal.
- Discuss with your RDC Analyst the preferred format for any merge variables. This is especially important for complex merges that involve multiple data sets and multiple merge variables. Create the variables as your RDC Analyst requests. This helps expedite the merge and improves data quality.
- If your RDC Analyst is deriving any variables for you from the restricted data, discuss in advance how you would like them created. For example, if you have requested a season variable, you will need to provide a definition of season. You can also send the code you would like used to create the new variables.
- Email the data files along with a list of the variables to your RDC Analyst. If your data files are too large to be emailed, please discuss other options with your RDC Analyst.
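The preparation steps above can be sketched in code. This is only an illustration under stated assumptions, not an RDC-provided tool: it assumes the public use file can be read as CSV, and every variable name (`PUBLICID`, `AGE`, `WEIGHT_KG`, `HEIGHT_M`, `REGION`) and the derived `BMI` variable are hypothetical stand-ins for the variables in your own proposal. It keeps only the proposal variables under their original names, adds one derived variable, and builds a variable list whose description names the source variables and the arithmetic used.

```python
import csv
import io

# Hypothetical proposal variables; replace with the names in your approved proposal.
PROPOSAL_VARIABLES = ["PUBLICID", "AGE", "WEIGHT_KG", "HEIGHT_M"]


def build_extract(rows, proposal_variables):
    """Keep only proposal variables (original names unchanged) and add a derived BMI."""
    extract = []
    for row in rows:
        kept = {name: row[name] for name in proposal_variables}
        # Derived variable: BMI = WEIGHT_KG / HEIGHT_M ** 2, rounded to one decimal place.
        kept["BMI"] = round(float(row["WEIGHT_KG"]) / float(row["HEIGHT_M"]) ** 2, 1)
        extract.append(kept)
    return extract


def variable_list(proposal_variables):
    """Return (name, description) pairs to send alongside the data files."""
    listing = [(name, "Original public use variable") for name in proposal_variables]
    listing.append(
        ("BMI", "Derived from WEIGHT_KG and HEIGHT_M: WEIGHT_KG / HEIGHT_M ** 2, rounded to 1 decimal")
    )
    return listing


# Toy stand-in for a public use file; a real file would be read from disk.
public_file = io.StringIO(
    "PUBLICID,AGE,WEIGHT_KG,HEIGHT_M,REGION\n"
    "1,34,70.0,1.75,2\n"
    "2,51,82.5,1.60,4\n"
)
rows = list(csv.DictReader(public_file))
extract = build_extract(rows, PROPOSAL_VARIABLES)
descriptions = variable_list(PROPOSAL_VARIABLES)
```

Here `REGION` is dropped because it is not in the hypothetical proposal list. The file format and the layout of the variable list that your RDC Analyst prefers may differ, so confirm both before sending anything.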
Important Notes about Submitting Public Data
- Any attempt to include variables that may lead to re-identification of subjects or establishments is considered a disclosure violation and will result in the cessation of your project and possible legal action.
- If you are requesting access to the restricted Mortality files, you cannot include any public use mortality variables, or variables derived from the public use mortality data.
- Do not include variables that are not listed in your approved proposal without first updating your proposal and discussing the matter with your RDC Analyst. Additional variables will require additional review.
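Before submitting, it may help to compare the variable names in each file against your approved proposal. The sketch below is a hypothetical helper (the function name and the example variable names are assumptions, not part of any RDC tooling). It only catches name mismatches; it does not assess disclosure risk, which remains subject to RDC review.

```python
def check_against_proposal(file_variables, approved_variables):
    """Compare a file's variable names with the approved proposal list.

    Returns (unlisted, missing): names present in the file but not approved,
    and approved names absent from the file. Both lists are sorted.
    """
    file_set = set(file_variables)
    approved_set = set(approved_variables)
    unlisted = sorted(file_set - approved_set)
    missing = sorted(approved_set - file_set)
    return unlisted, missing


# Hypothetical example: REGION is in the file but not in the proposal.
unlisted, missing = check_against_proposal(
    ["PUBLICID", "AGE", "REGION"],
    ["PUBLICID", "AGE", "BMI"],
)
if unlisted:
    print("Not in approved proposal:", ", ".join(unlisted))
if missing:
    print("Approved but not in file:", ", ".join(missing))
```

Any name reported as unlisted should either be removed from the file or added to the proposal and discussed with your RDC Analyst before the file is sent.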
Page last reviewed: March 15, 2012
Content source: CDC/National Center for Health Statistics