Here are the steps for merging NHANES dietary data:
The first step in merging data is to sort each of the data files by a unique identifier. For example, if you wish to conduct an analysis with dietary recall data, sort the data by SEQN, the respondent sequence number. If you wish to conduct an analysis of supplement data, sort your data files by SEQN, supplement ID number, or ingredient ID, depending on which supplement files you wish to merge to construct your database.
Use PROC SORT in SAS to sort the data.
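As a minimal sketch of the sorting step, the program below builds a small toy data set and sorts it by SEQN. The data set name total_nutrients, the variable kcal, and the values are hypothetical stand-ins for illustration; they are not taken from an actual NHANES file.

```sas
/* Hypothetical toy data standing in for a person-level dietary file */
data total_nutrients;
  input SEQN kcal;
  datalines;
3 1800
1 2100
2 1950
;
run;

/* Sort the records in ascending order of the identifier SEQN */
proc sort data=total_nutrients;
  by SEQN;
run;
```

Each file that will be merged needs the same kind of sort, using the identifier that will appear in the BY statement of the merge.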
After sorting the data files, merge them using the MERGE statement in a DATA step. Remember that merging, like sorting, is done by a unique identifier (SEQN or another identifier).
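The sort-then-merge sequence can be sketched as follows for two files that each hold one record per person. The data set names demo and total_nutrients, the variables age and kcal, and all values are hypothetical and used only for illustration.

```sas
/* Hypothetical person-level file 1: one record per SEQN */
data demo;
  input SEQN age;
  datalines;
1 34
2 51
3 27
;
run;

/* Hypothetical person-level file 2: one record per SEQN, unsorted */
data total_nutrients;
  input SEQN kcal;
  datalines;
3 1800
1 2100
2 1950
;
run;

/* Both files must be sorted by the identifier before the merge */
proc sort data=demo;
  by SEQN;
run;

proc sort data=total_nutrients;
  by SEQN;
run;

/* Match-merge on SEQN: the result has one record per person */
data merged;
  merge demo total_nutrients;
  by SEQN;
run;
```

Because both toy files contain one record for each of the same three SEQN values, the merged data set also contains three records, each carrying both age and kcal.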
A second important thing to remember when merging is to consider the number of records per person in each of the data files you will be using. If you merge individual food or supplement files, remember that you will have multiple records per person or supplement because of the nature of the data files. If you merge total nutrient intake or other individual-level data files, you will have only one record per person. These different situations are demonstrated in the four examples below.
After you have merged the data files, it is advisable to check the contents again to make sure that the files merged correctly. Use PROC CONTENTS to list all variable names and labels; use PROC MEANS to check the number of observations for each variable as well as the missing, minimum, and maximum values.
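A minimal sketch of these checks is shown below. The data set merged and its variables are hypothetical; one kcal value is deliberately left missing so that the missing-value count in the output is not zero.

```sas
/* Hypothetical merged file with one missing kcal value (the period) */
data merged;
  input SEQN age kcal;
  datalines;
1 34 2100
2 51 .
3 27 1800
;
run;

/* List variable names, types, and labels */
proc contents data=merged;
run;

/* For each numeric variable: non-missing count, missing count, min, max */
proc means data=merged n nmiss min max;
run;
```

The N and NMISS keywords ask PROC MEANS for the number of non-missing and missing values, which makes it easy to spot a variable that did not merge for some records.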
When you merge a dataset that has one record per person with a dataset that has multiple records per person, the resulting number of records will be greater than the number of people in your sample. This is to be expected.
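The following sketch illustrates this situation by merging a one-record-per-person file with a multiple-records-per-person file and then comparing the number of records with the number of distinct people. The data set names demo and individual_foods, the variable food_code, and all values are hypothetical.

```sas
/* Hypothetical person-level file: one record per SEQN */
data demo;
  input SEQN age;
  datalines;
1 34
2 51
3 27
;
run;

/* Hypothetical food-level file: several records per SEQN */
data individual_foods;
  input SEQN food_code;
  datalines;
1 101
1 102
2 101
3 103
3 104
;
run;

proc sort data=demo;
  by SEQN;
run;

proc sort data=individual_foods;
  by SEQN;
run;

/* One-to-many merge: each person's age is repeated on every food record */
data merged_foods;
  merge demo individual_foods;
  by SEQN;
run;

/* Compare the record count with the number of distinct people */
proc sql;
  select count(*) as n_records,
         count(distinct SEQN) as n_people
  from merged_foods;
quit;
```

With this toy data the merged file has five records for three people, because the person-level values are carried onto each of that person's food records.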
The examples provided in the links below demonstrate different scenarios that you may encounter when merging dietary data files. These scenarios use two of the tutorial’s three sample programs – Milk and Supplement – and show how to merge data depending on the number of records per person in the data file, as follows: