Providing Public-Use and Non-NCHS Data

Researchers are responsible for providing NCHS public-use data and any other non-NCHS data to their RDC Analyst. Compiling the public-use dataset provides you the opportunity to become familiar with the data and helps your RDC Analyst create your dataset. Please follow these steps when providing your RDC Analyst with the public-use data and/or non-NCHS data.

  1. Create a dataset that only includes the variables specified in your proposal. Do not include variables that are not listed in your approved proposal without first updating your proposal and discussing the matter with your RDC Analyst. Additional variables in your dataset that were not listed in your approved proposal will require additional review and may delay the start of your project.
  2. Original NCHS variable names must be used in your dataset so that variable names match those given in the NCHS public-use dataset. If you would like to rename these variables, include the original variable name in the variable description.
  3. If you choose to create derived variables prior to working with the data onsite, make sure these variables are clearly defined. The variable description should include the original variable name(s) from which it was derived and any arithmetic code or algorithm used. Please save the code you used to create these variables as your RDC Analyst may request it.
  4. Discuss with your RDC Analyst the preferred file format for any merge files. This is especially important for complex merges that involve multiple datasets and multiple merge variables. Please create these merge files in the format your RDC Analyst requests. This helps expedite the merge process and improves data quality.
  5. Email your datasets along with a list of the variables to your RDC Analyst. If your datasets are too large to be emailed, please discuss other options (e.g., Dropbox, Google Drive or FTP) with your RDC Analyst.
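
As an illustration of steps 1 through 3 and the variable list in step 5, the following is a minimal Python sketch, not an RDC-provided tool. It assumes the public-use data can be read as CSV. The variable names (VAR_ID, VAR_AGE, VAR_SEX, VAR_EXTRA, AGE65PLUS) and the helper functions are hypothetical placeholders, not real NCHS variables or utilities; substitute the variables listed in your approved proposal.

```python
import csv
import io

# Hypothetical placeholders; substitute the variables in your approved proposal.
APPROVED_VARIABLES = ["VAR_ID", "VAR_AGE", "VAR_SEX"]

# Description for each derived variable: source variable(s) and the rule used.
DERIVED_DESCRIPTIONS = {
    "AGE65PLUS": "Derived from VAR_AGE: 1 if VAR_AGE >= 65, else 0",
}


def build_submission(rows, approved):
    """Return rows limited to approved variables plus one derived variable."""
    output = []
    for row in rows:
        missing = [name for name in approved if name not in row]
        if missing:
            raise KeyError(f"Missing approved variables: {missing}")
        # Keep the original variable names unchanged.
        kept = {name: row[name] for name in approved}
        kept["AGE65PLUS"] = 1 if int(row["VAR_AGE"]) >= 65 else 0
        output.append(kept)
    return output


def variable_list(approved, derived_descriptions):
    """Return (name, description) pairs to send alongside the dataset."""
    listing = [(name, "Original public-use variable") for name in approved]
    listing.extend(derived_descriptions.items())
    return listing


# Stand-in for a public-use file; VAR_EXTRA is not in the proposal.
raw_csv = "VAR_ID,VAR_AGE,VAR_SEX,VAR_EXTRA\n1,70,2,x\n2,34,1,y\n"
rows = list(csv.DictReader(io.StringIO(raw_csv)))
submission = build_submission(rows, APPROVED_VARIABLES)
listing = variable_list(APPROVED_VARIABLES, DERIVED_DESCRIPTIONS)
```

In this sketch, VAR_EXTRA is dropped because it is not in the approved list, and the description of the derived variable records both its source variable and the rule used to create it. Keeping such a script on hand also means the derivation code is available if your RDC Analyst requests it.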

Important Notes about Submitting Public Data

  • For NHDS, NAMCS, NHAMCS, NSDUH and other DHHS data hosted by the RDC, you do not need to provide a public-use dataset. Your RDC Analyst will provide an extract from the restricted-use files that includes all of the variables specified in your approved proposal.
  • Any attempt to include variables that may lead to re-identification of study participants or establishments is considered a disclosure violation and will result in the cancellation of your project and possible legal action.
  • If you are requesting access to the restricted-use Mortality files, you cannot include any public-use mortality variables, or variables derived from the public-use mortality data.
  • Non-NCHS data includes data collected by the researcher, another government agency, or a private institution that the researcher wishes to merge with NCHS data.

Page last reviewed: April 25, 2019, 02:05 PM