The Proposal Process
Creating the Data Dictionary
Updating your Proposal
Proposal Format - PDF Version [PDF - 105 KB]
Word Version [DOC - 36 KB]
Sample Proposal [PDF - 393 KB]
The RDC has worked with many students over the years. We can provide some help and expertise with the data you have chosen; however, we cannot replace the advisor in guiding the research process. The following agreement summarizes the RDC proposal and disclosure avoidance process. All students and their advisors should submit a signed agreement along with their proposals.
RDC-Student-Advisor Agreement [PDF - 71 KB]
Creating the Data Dictionary:
The data dictionary is an essential part of the RDC proposal. During the proposal process, it is used to assess the disclosure risk of the project. Once the proposal is approved, it is used to create your dataset.
There are three parts to the data dictionary:
- Public Data – Please select only the variables from the public data that are necessary to answer your research question. We will not merge the entire public file to restricted data.
- Restricted Data – Many of the restricted variables are listed on our Restricted Data page; however, these lists are not exhaustive. Reviewing the survey documentation (e.g., the questionnaire) will help you determine what else may be available.
- Non-NCHS Data – If you wish to have variables added from another data source, please provide a list of those variables. Do not exceed 100 variables.
- For each part, list or provide in table format the following information:
  - File the data is coming from (e.g., NHIS 2000 person file)
  - Variable names
  - Variable descriptions
- The original NCHS variables must retain the names they are given in the public dataset.
- Highlight the variables in each dictionary that will be used in the merge. It is important that these variables be formatted consistently between data sets for the merge to go smoothly and
Data Dictionary Examples
Because all data systems are slightly different, the data dictionary may come in a variety of styles. Please see the various examples for the data systems.
- For NHIS, the PUBLICID is actually a compound variable that can be used to link files of different levels (household, family, and person). Be sure to retain the component variables, and follow the variable names, formats, and lengths specified in the documentation for that data system when you are creating your public use subset files.
- NHDS users: Instead of providing separate public and restricted data dictionaries, please provide one data dictionary that lists the variables necessary for your research from the NHDS Restricted Variables Codebook.
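As an informal illustration of the requirements above, the following Python sketch shows one way a researcher might organize a data dictionary before submitting it and run two simple self-checks: that the non-NCHS list does not exceed 100 variables, and that variables flagged for the merge are formatted consistently between data sets. This is not an RDC tool or a required format. The `DictionaryEntry` fields, the `var_type` and `length` conventions, and every variable name other than the "NHIS 2000 person file" label are hypothetical placeholders.

```python
from dataclasses import dataclass

MAX_NON_NCHS_VARIABLES = 100  # limit stated in the guidance above


@dataclass(frozen=True)
class DictionaryEntry:
    source_file: str         # file the data is coming from
    name: str                # variable name (NCHS variables keep their public-file name)
    description: str         # variable description
    part: str                # "public", "restricted", or "non-nchs"
    merge_key: bool = False  # True if the variable will be used in the merge
    var_type: str = ""       # hypothetical convention: "char" or "num"
    length: int = 0          # hypothetical convention: storage length


def check_non_nchs_limit(entries):
    """Return the count of non-NCHS variables; raise if it exceeds the limit."""
    count = sum(1 for e in entries if e.part == "non-nchs")
    if count > MAX_NON_NCHS_VARIABLES:
        raise ValueError(
            f"{count} non-NCHS variables listed; the limit is {MAX_NON_NCHS_VARIABLES}"
        )
    return count


def inconsistent_merge_keys(entries):
    """Return names of merge variables whose type or length differs between files."""
    formats = {}
    for e in entries:
        if e.merge_key:
            formats.setdefault(e.name, set()).add((e.var_type, e.length))
    return sorted(name for name, seen in formats.items() if len(seen) > 1)


# All variable names below are made up for illustration.
entries = [
    DictionaryEntry("NHIS 2000 person file", "LINK_ID",
                    "Hypothetical linkage identifier", "public", True, "char", 14),
    DictionaryEntry("NHIS 2000 person file", "VAR_A",
                    "Hypothetical analysis variable", "public"),
    DictionaryEntry("Restricted file (hypothetical)", "GEO_X",
                    "Hypothetical geographic variable", "restricted"),
    DictionaryEntry("External file (hypothetical)", "LINK_ID",
                    "Hypothetical linkage identifier", "non-nchs", True, "num", 8),
]

print(check_non_nchs_limit(entries))
print(inconsistent_merge_keys(entries))
```

In this made-up example the merge variable is stored as a 14-character string in one file and as an 8-byte number in the other, so it is reported as inconsistent. Catching that kind of mismatch before submission is the point of highlighting the merge variables in each dictionary.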
Updating your Proposal:
We understand that research evolves and may change between the day you submit the proposal and the end of your analysis. However, it is important that your RDC Analyst be made aware of changes throughout the process, as these changes may affect the disclosure risk. It is also important that the research proposal be updated to reflect these changes.
Change in Research Team:
- The proposal must be updated and the new person must fulfill confidentiality requirements.
Adding variables:
- Adding variables may change the disclosure risk of the proposal.
- Adding variables may result in additional costs.
New methods or types of output:
- If the analytic plan changes significantly from the method stated in the approved proposal, please discuss the changes with your RDC Analyst before conducting the analysis. Changes will need to be documented in the proposal.
Proposal amendments require approval. Approval is typically given within two weeks; however, a full review of 6-8 weeks may be necessary for more complex amendments.
When submitting an amendment:
- Indicate the amendment date (00/00/0000) on the first page.
- Explain the changes and why they are necessary throughout the relevant sections of the proposal.
- Highlight or “track” all changes.
- Email the amended proposal directly to your RDC Analyst and include a summary of the changes.
- Page last reviewed: February 26, 2015
- Page last updated: February 26, 2015