Key Concepts About Merging Data in NHANES

For each two-year survey cycle, NHANES data files are organized into four types: Demographics, Examination, Laboratory, and Questionnaire. All environmental chemical data files fall under the Laboratory category. Analyzing environmental chemical data requires the appropriate survey design weights and the basic demographic variables in the Demographics data files, so you will need to combine variables from these different data files into one dataset. This procedure is called merging and is similar to adding columns to a table.

To merge data, variables from the different data files are linked by a unique identifier. SEQN stands for sequence number and is the unique identifier for each observation (participant) in NHANES. Whenever you analyze data at the individual level, SEQN is the variable you must use to merge data files.

Before merging data, sort each data file by the SEQN variable so that the records are in the same order in every file. Use PROC SORT in SAS to sort the data. Once the data files are sorted, you can merge them.
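
The sort-then-merge steps described above can be sketched in SAS as follows. This is a minimal illustration, not the tutorial's own program: the dataset names (demo, lab, demo_lab) and the flag variable has_lab are assumptions made for this example, and the data values are invented so that the code runs without downloading any NHANES files. The variable names RIAGENDR, RIDAGEYR, and LBXBPB follow NHANES naming conventions.

```sas
/* Toy stand-in for a Demographics file. Values are invented, not NHANES data. */
data demo;
    input SEQN RIAGENDR RIDAGEYR;
    datalines;
3 1 34
1 2 61
2 1 12
;
run;

/* Toy stand-in for a Laboratory file. Participant 2 has no lab record. */
data lab;
    input SEQN LBXBPB;
    datalines;
3 1.2
1 0.8
;
run;

/* Sort each file by the merge key, SEQN. */
proc sort data=demo;
    by SEQN;
run;

proc sort data=lab;
    by SEQN;
run;

/* Match-merge the two files on SEQN. The IN= flags record which  */
/* input file contributed to each merged row.                      */
data demo_lab;
    merge demo(in=in_demo) lab(in=in_lab);
    by SEQN;
    has_lab = in_lab;   /* 1 if the participant appears in the lab file, else 0 */
run;
```

In this sketch the merged dataset keeps all three participants. Participant 2 has a missing value for LBXBPB and has_lab = 0, because no matching SEQN exists in the lab file. To keep only participants present in both files, a subsetting statement such as `if in_demo and in_lab;` could be added after the BY statement.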

After you have merged the data files, check the contents of the merged dataset to make sure that the files merged correctly. Use the PROC CONTENTS procedure to list all variables and their attributes. Use the PROC MEANS procedure to check the number of observations, as well as the missing, minimum, and maximum values, for each numeric variable.
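
The checks described above might look like the following in SAS. This is a hedged sketch: the dataset name demo_lab is an assumption made for this example, and the values are invented (with one deliberately missing lab value) so that the block runs on its own. The statistic keywords N, NMISS, MIN, and MAX ask PROC MEANS for the counts of nonmissing and missing values and for the minimum and maximum.

```sas
/* Toy merged dataset. Values are invented, not NHANES data. */
/* A period (.) marks a missing numeric value.               */
data demo_lab;
    input SEQN RIAGENDR RIDAGEYR LBXBPB;
    datalines;
1 2 61 0.8
2 1 12 .
3 1 34 1.2
;
run;

/* List every variable with its type, length, and other attributes. */
/* VARNUM prints the variables in their order in the dataset.       */
proc contents data=demo_lab varnum;
run;

/* Count nonmissing (N) and missing (NMISS) values, and report the  */
/* minimum and maximum, for each listed numeric variable.           */
proc means data=demo_lab n nmiss min max;
    var RIAGENDR RIDAGEYR LBXBPB;
run;
```

Comparing N and NMISS for each variable against the number of participants expected is a quick way to spot a merge that dropped or failed to match records.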

