Task 2: How to Append NHANES Dietary Data
Here are the steps for appending NHANES data:
Step 1: Compare Variable Names and Labels
The first step before appending data is to examine the contents of the data
files. Using PROC CONTENTS, you can list the variable names and variable
labels for each data file selected. While reviewing the PROC CONTENTS output,
compare variable names and labels to see whether any changed from cycle to
cycle.
The example below
uses the sample "Food Sources" program. Notice that the variable labels for
“Calcium (mg)” are the same between 2001-2002 and 2003-2004, but the variable
names are different. Additionally, a comparison of the documentation
for vitamins A and E between 2001-2002 and 2003-2004 shows that although the
variable names remain the same, the units of measure are different, and only careful examination of the documentation
would allow you to detect this change. It is important to check whether the
variable names and labels are consistent between datasets before appending.
Program to Check Datasets' Contents and Compare Variable Names and Labels
|
*-------------------------------------------------------------------------;
* Use the LIBNAME statement to refer to the folder where the data files   ;
* are stored.                                                             ;
*                                                                         ;
* Use the PROC CONTENTS procedure to list the contents of each dataset    ;
* 2001-2002 Dietary Interview (Individual Foods File) Examination File    ;
* 2003-2004 Dietary Interview (Individual Foods File) Examination File    ;
* 2001-2002 Demographic File                                              ;
* 2003-2004 Demographic File                                              ;
*                                                                         ;
* Use the VARNUM option to list the variables according to their          ;
* position in the dataset.                                                ;
*-------------------------------------------------------------------------;
libname NH "C:\NHANES\DATA";

proc contents data=NH.DRXIFF_B varnum;
proc contents data=NH.DR1IFF_C varnum;
proc contents data=NH.DEMO_B varnum;
proc contents data=NH.DEMO_C varnum;
run; |
Output of Program
Contents of DRXIFF_B
|
The SAS System
The CONTENTS Procedure
Data Set Name NH.DRXIFF_B
Observations 143004
Member Type DATA
Variables 74
Engine V9
Indexes 0
Created Friday, March 28, 2008 08:39:36 AM
Observation Length 592
Last Modified Friday, March 28, 2008 08:39:36 AM
Deleted Observations 0
Protection
Compressed NO
Data Set Type
Sorted NO
Engine/Host Dependent Information
Data Set Page Size 16384
Number of Data Set Pages 5298
First Data Page 1
Max Obs per Page 27
Obs in First Data Page 11
Number of Data Set Repairs 0
Variables in Creation Order
# Variable Type Len Format Informat Label
1 SEQN Num 8 BEST12. F12. Respondent sequence number
2 DRXILINE Num 8 Food/individual component
number
3 WTDRD1 Num 8 Dietary day one sample weight
4 DRDDRSTZ Num 8 Dietary recall status
5 DRDDAY Num 8 Intake day of week
6 DRALANG Num 8 Language SP/Proxy used mostly
7 DRXCCMNM Num 8 Combination food number
8 DRDCCMTZ Num 8 Combination food type
9 DRD020 Num 8 HHMM5. Time of eating occasion (HH:MM)
10 DRD030Z Num 8 Name of eating occasion
11 DRD040Z Num 8 Was this food eaten at home?
12 DRDIFDCD Num 8 USDA food code
13 DRXIGRMS Num 8 Grams
14 DRXIKCAL Num 8 Energy (kcal)
15 DRXIPROT Num 8 Protein (gm)
16 DRXICARB Num 8 Carbohydrate (gm)
17 DRXISUGR Num 8 Total sugars (gm)
18 DRXIFIBE Num 8 Dietary fiber (gm)
19 DRXITFAT Num 8 Total fat (gm)
20 DRXISFAT Num 8 Total saturated fatty acids
(gm)
21 DRXIMFAT Num 8 Total monounsaturated fatty
acids (gm)
22 DRXIPFAT Num 8 Total polyunsaturated fatty
acids (gm)
23 DRXICHOL Num 8 Cholesterol (mg)
24 DRXIATOC Num 8 Vitamin E as alpha-tocopherol
(mg)
25 DRXIRET Num 8 Retinol (mcg)
26 DRXIVARA Num 8 Vitamin A, RAE (mcg)
27 DRXIACAR Num 8 Alpha-carotene (mcg)
28 DRXIBCAR Num 8 Beta-carotene (mcg)
29 DRXICRYP Num 8 Beta-cryptoxanthin (mcg)
30 DRXILYCO Num 8 Lycopene (mcg)
31 DRXILZ Num 8 Lutein + zeaxanthin (mcg)
32 DRXIVB1 Num 8 Thiamin (Vitamin B1) (mg)
33 DRXIVB2 Num 8 Riboflavin (Vitamin B2) (mg)
34 DRXINIAC Num 8 Niacin (mg)
35 DRXIVB6 Num 8 Vitamin B6 (mg)
36 DRXIFOLA Num 8 Total Folate (mcg)
37 DRXIFA Num 8 Folic acid (mcg)
38 DRXIFF Num 8 Food folate (mcg)
39 DRXIFDFE Num 8 Folate, DFE (mcg)
40 DRXIVB12 Num 8 Vitamin B12 (mcg)
41 DRXIVC Num 8 Vitamin C (mg)
42 DRXIVK Num 8 Vitamin K (mcg)
43 DRXICALC Num 8 Calcium (mg)
44 DRXIPHOS Num 8 Phosphorus (mg)
45 DRXIMAGN Num 8 Magnesium (mg)
46 DRXIIRON Num 8 Iron (mg)
47 DRXIZINC Num 8 Zinc (mg)
48 DRXICOPP Num 8 Copper (mg)
49 DRDISODI Num 8 Sodium (mg)
50 DRXIPOTA Num 8 Potassium (mg)
51 DRXISELE Num 8 Selenium (mcg)
52 DRXICAFF Num 8 Caffeine (mg)
53 DRXITHEO Num 8 Theobromine (mg)
54 DRXIALCO Num 8 Alcohol (gm)
55 DRXIMOIS Num 8 Moisture (gm)
56 DRXIS040 Num 8 SFA 4:0 (Butanoic) (gm)
57 DRXIS060 Num 8 SFA 6:0 (Hexanoic) (gm)
58 DRXIS080 Num 8 SFA 8:0 (Octanoic) (gm)
59 DRXIS100 Num 8 SFA 10:0 (Decanoic) (gm)
60 DRXIS120 Num 8 SFA 12:0 (Dodecanoic) (gm)
61 DRXIS140 Num 8 SFA 14:0 (Tetradecanoic) (gm)
62 DRXIS160 Num 8 SFA 16:0 (Hexadecanoic) (gm)
63 DRXIS180 Num 8 SFA 18:0 (Octadecanoic) (gm)
64 DRXIM161 Num 8 MFA 16:1 (Hexadecenoic) (gm)
65 DRXIM181 Num 8 MFA 18:1 (Octadecenoic) (gm)
66 DRXIM201 Num 8 MFA 20:1 (Eicosenoic) (gm)
67 DRXIM221 Num 8 MFA 22:1 (Docosenoic) (gm)
68 DRXIP182 Num 8 PFA 18:2 (Octadecadienoic)
(gm)
69 DRXIP183 Num 8 PFA 18:3 (Octadecatrienoic)
(gm)
70 DRXIP184 Num 8 PFA 18:4 (Octadecatetraenoic)
(gm)
71 DRXIP204 Num 8 PFA 20:4 (Eicosatetraenoic)
(gm)
72 DRXIP205 Num 8 PFA 20:5 (Eicsapentaenoic)
(gm)
73 DRXIP225 Num 8 PFA 22:5 (Docosapentaenoic)
(gm)
74 DRXIP226 Num 8 PFA 22:6 (Docosahexaenoic)
(gm)
|
Contents of DR1IFF_C
|
The SAS System
The CONTENTS Procedure
Data Set Name NH.DR1IFF_C
Observations 131164
Member Type DATA
Variables 82
Engine V9
Indexes 0
Created Friday, March 28, 2008 08:39:58 AM
Observation Length 656
Last Modified Friday, March 28, 2008 08:39:58 AM
Deleted Observations 0
Protection
Compressed NO
Data Set Type
Sorted NO
Engine/Host Dependent Information
Data Set Page Size 16384
Number of Data Set Pages 5466
First Data Page 1
Max Obs per Page 24
Obs in First Data Page 8
Number of Data Set Repairs 0
Variables in Creation Order
# Variable Type Len Format Label
1 SEQN Num 8 Respondent sequence number
2 DR1ILINE Num 8 Food/individual component
number
3 WTDRD1 Num 8 Dietary day one sample weight
4 WTDR2D Num 8 Dietary two-day sample weight
5 DR1DRSTZ Num 8 Dietary recall status
6 DR1EXMER Num 8 Interviewer ID code
7 DRABF Num 8 Breast-fed infant (either day)
8 DRDINT Num 8 Number of days of intake
9 DR1DAY Num 8 Intake day of week
10 DR1LANG Num 8 Language SP/Proxy used mostly
11 DR1CCMNM Num 8 Combination food number
12 DR1CCMTX Num 8 Combination food type
13 DR1_020 Num 8 HHMM6. Time of eating occasion (HH:MM)
14 DR1_030Z Num 8 Name of eating occasion
15 DR1FS Num 8 Source of food
16 DR1_040Z Num 8 Was this food eaten at home?
17 DR1IFDCD Num 8 USDA food code
18 DR1MC Num 8 Modification code
19 DR1IGRMS Num 8 Grams
20 DR1IKCAL Num 8 Energy (kcal)
21 DR1IPROT Num 8 Protein (gm)
22 DR1ICARB Num 8 Carbohydrate (gm)
23 DR1ISUGR Num 8 Total sugars (gm)
24 DR1IFIBE Num 8 Dietary fiber (gm)
25 DR1ITFAT Num 8 Total fat (gm)
26 DR1ISFAT Num 8 Total saturated fatty acids
(gm)
27 DR1IMFAT Num 8 Total monounsaturated fatty
acids (gm)
28 DR1IPFAT Num 8 Total polyunsaturated fatty
acids (gm)
29 DR1ICHOL Num 8 Cholesterol (mg)
30 DR1IATOC Num 8 Vitamin E as alpha-tocopherol
(mg)
31 DR1IATOA Num 8 Added alpha-tocopherol
(Vitamin E) (mg)
32 DR1IRET Num 8 Retinol (mcg)
33 DR1IVARA Num 8 Vitamin A, RAE (mcg)
34 DR1IACAR Num 8 Alpha-carotene (mcg)
35 DR1IBCAR Num 8 Beta-carotene (mcg)
36 DR1ICRYP Num 8 Beta-cryptoxanthin (mcg)
37 DR1ILYCO Num 8 Lycopene (mcg)
38 DR1ILZ Num 8 Lutein + zeaxanthin (mcg)
39 DR1IVB1 Num 8 Thiamin (Vitamin B1) (mg)
40 DR1IVB2 Num 8 Riboflavin (Vitamin B2) (mg)
41 DR1INIAC Num 8 Niacin (mg)
42 DR1IVB6 Num 8 Vitamin B6 (mg)
43 DR1IFOLA Num 8 Total Folate (mcg)
44 DR1IFA Num 8 Folic acid (mcg)
45 DR1IFF Num 8 Food folate (mcg)
46 DR1IFDFE Num 8 Folate, DFE (mcg)
47 DR1IVB12 Num 8 Vitamin B12 (mcg)
48 DR1IB12A Num 8 Added vitamin B12 (mcg)
49 DR1IVC Num 8 Vitamin C (mg)
50 DR1IVK Num 8 Vitamin K (mcg)
51 DR1ICALC Num 8 Calcium (mg)
52 DR1IPHOS Num 8 Phosphorus (mg)
53 DR1IMAGN Num 8 Magnesium (mg)
54 DR1IIRON Num 8 Iron (mg)
55 DR1IZINC Num 8 Zinc (mg)
56 DR1ICOPP Num 8 Copper (mg)
57 DR1ISODI Num 8 Sodium (mg)
58 DR1IPOTA Num 8 Potassium (mg)
59 DR1ISELE Num 8 Selenium (mcg)
60 DR1ICAFF Num 8 Caffeine (mg)
61 DR1ITHEO Num 8 Theobromine (mg)
62 DR1IALCO Num 8 Alcohol (gm)
63 DR1IMOIS Num 8 Moisture (gm)
64 DR1IS040 Num 8 SFA 4:0 (Butanoic) (gm)
65 DR1IS060 Num 8 SFA 6:0 (Hexanoic) (gm)
66 DR1IS080 Num 8 SFA 8:0 (Octanoic) (gm)
67 DR1IS100 Num 8 SFA 10:0 (Decanoic) (gm)
68 DR1IS120 Num 8 SFA 12:0 (Dodecanoic) (gm)
69 DR1IS140 Num 8 SFA 14:0 (Tetradecanoic) (gm)
70 DR1IS160 Num 8 SFA 16:0 (Hexadecanoic) (gm)
71 DR1IS180 Num 8 SFA 18:0 (Octadecanoic) (gm)
72 DR1IM161 Num 8 MFA 16:1 (Hexadecenoic) (gm)
73 DR1IM181 Num 8 MFA 18:1 (Octadecenoic) (gm)
74 DR1IM201 Num 8 MFA 20:1 (Eicosenoic) (gm)
75 DR1IM221 Num 8 MFA 22:1 (Docosenoic) (gm)
76 DR1IP182 Num 8 PFA 18:2 (Octadecadienoic)
(gm)
77 DR1IP183 Num 8 PFA 18:3 (Octadecatrienoic)
(gm)
78 DR1IP184 Num 8 PFA 18:4 (Octadecatetraenoic)
(gm)
79 DR1IP204 Num 8 PFA 20:4 (Eicosatetraenoic)
(gm)
80 DR1IP205 Num 8 PFA 20:5 (Eicosapentaenoic)
(gm)
81 DR1IP225 Num 8 PFA 22:5 (Docosapentaenoic)
(gm)
82 DR1IP226 Num 8 PFA 22:6 (Docosahexaenoic)
(gm) |
Contents of DEMO_B
|
The SAS System
The CONTENTS Procedure
Data Set Name NH.DEMO_B
Observations 11039
Member Type DATA
Variables 24
Engine V9
Indexes 0
Created Friday, March 28, 2008 08:39:35 AM
Observation Length 192
Last Modified Friday, March 28, 2008 08:39:35 AM
Deleted Observations 0
Protection
Compressed NO
Data Set Type
Sorted NO
Engine/Host Dependent Information
Data Set Page Size 16384
Number of Data Set Pages 131
First Data Page 1
Max Obs per Page 85
Obs in First Data Page 64
Number of Data Set Repairs 0
Variables in Creation Order
# Variable Type Len Label
1 SEQN Num 8 Respondent sequence number
2 SDDSRVYR Num 8 Data Release Number
3 RIDSTATR Num 8 Interview/Examination Status
4 RIAGENDR Num 8 Gender - Adjudicated
5 RIDAGEYR Num 8 Age at Screening Adjudicated -
Recode
6 RIDAGEMN Num 8 Age in Months - Recode
7 RIDAGEEX Num 8 Exam Age in Months - Recode
8 RIDRETH1 Num 8 Race/Ethnicity - Recode
9 RIDRETH2 Num 8 Linked NH3 Race/Ethnicity -
Recode
10 DMQMILIT Num 8 Served in the US Armed Forces
11 DMDBORN Num 8 Country of Birth - Recode
12 DMDEDUC Num 8 Education - Recode
13 INDHHINC Num 8 Annual Household Income
14 INDFMINC Num 8 Annual CPS Family Income
15 INDFMPIR Num 8 CPS Family PIR
16 DMDMARTL Num 8 Marital Status
17 RIDEXPRG Num 8 Pregnancy Status at Exam -
Recode
18 RIDPREG Num 8 Pregnancy Status - Recode (old
version)
19 WTINT2YR Num 8 Full Sample 2 Year Interview
Weight
20 WTINT4YR Num 8 Full Sample 4 Year Interview
Weight
21 WTMEC2YR Num 8 Full Sample 2 Year MEC Exam
Weight
22 WTMEC4YR Num 8 Full Sample 4 Year MEC Exam
Weight
23 SDMVPSU Num 8 Masked Variance Pseudo-PSU
24 SDMVSTRA Num 8 Masked Variance Pseudo-Stratum |
Contents of DEMO_C
|
The SAS System
The CONTENTS Procedure
Data Set Name NH.DEMO_C
Observations 10122
Member Type DATA
Variables 31
Engine V9
Indexes 0
Created Friday, March 28, 2008 08:39:36 AM
Observation Length 248
Last Modified Friday, March 28, 2008 08:39:36 AM
Deleted Observations 0
Protection
Compressed NO
Data Set Type
Sorted NO
Engine/Host Dependent Information
Data Set Page Size 16384
Number of Data Set Pages 157
First Data Page 1
Max Obs per Page 65
Obs in First Data Page 46
Number of Data Set Repairs 0
Variables in Creation Order
# Variable Type Len Label
1 SEQN Num 8 Respondent sequence number
2 SDDSRVYR Num 8 Data Release Number
3 RIDSTATR Num 8 Interview/Examination Status
4 RIAGENDR Num 8 Gender - Adjudicated
5 RIDAGEYR Num 8 Age at Screening Adjudicated -
Recode
6 RIDAGEMN Num 8 Age in Months - Recode
7 RIDAGEEX Num 8 Exam Age in Months - Recode
8 RIDRETH1 Num 8 Race/Ethnicity - Recode
9 RIDRETH2 Num 8 Linked NH3 Race/Ethnicity -
Recode
10 DMQMILIT Num 8 Served in the US Armed Forces
11 DMDBORN Num 8 Country of Birth - Recode
12 DMDEDUC Num 8 Education - Recode
13 INDHHINC Num 8 Annual Household Income
14 INDFMINC Num 8 Annual CPS Family Income
15 INDFMPIR Num 8 CPS Family PIR
16 DMDMARTL Num 8 Marital status
17 RIDEXPRG Num 8 Pregnancy Status at Exam -
Recode
18 SIALANG Num 8 Language of SP Interview
19 SIAPROXY Num 8 Proxy used in SP Interview?
20 SIAINTRP Num 8 Interpreter used in SP
Interview?
21 FIALANG Num 8 Language of Family Interview
22 FIAPROXY Num 8 Proxy used in Family Interview?
23 FIAINTRP Num 8 Interpreter used in Family
Interview?
24 MIALANG Num 8 Language of MEC Interview
25 MIAPROXY Num 8 Proxy used in MEC Interview?
26 MIAINTRP Num 8 Interpreter used in MEC
Interview?
27 AIALANG Num 8 Language of ACASI Interview
28 WTINT2YR Num 8 Full Sample 2 Year Interview
Weight
29 WTMEC2YR Num 8 Full Sample 2 Year MEC Exam
Weight
30 SDMVPSU Num 8 Masked Variance Pseudo-PSU
31 SDMVSTRA Num 8 Masked Variance Pseudo-Stratum |
What you will see when you review the output from this demonstration:
- A comparison of the outputs from the 2001-2002 and 2003-2004 demographics
files shows identical variable names and labels for the variables of
interest. These files can be directly appended.
- A comparison of the 2001-2002 and 2003-2004 individual foods files shows, for
example, that the variable labels for “Calcium (mg)” are the same, but the
variable names are different (DRXICALC in 2001-2002 and DR1ICALC in
2003-2004). After consulting the codebook and frequency table, you will
find that their response categories are identical, so you can simply rename
one of them to make the names match. If two variables have different response
categories, definitions, or wordings, you may need to recode before
renaming the variable.
- Notice also that the variable labels for “Dietary Recall Status” are the same,
but the variable names are different (DRDDRSTZ in 2001-2002 and DR1DRSTZ in
2003-2004). One of these variables will have to be renamed in order to
append the data.
In the total nutrient intake files, most dietary variables from 2001-2002
begin with the prefix DRXT, and most dietary variables from 2003-2004 begin
with the prefix DR1T (for Day 1 data) or DR2T (for Day 2 data). The
individual foods files listed above use DRXI and DR1I instead. Because
these variables are continuous (as opposed to categorical), you can simply
rename them to make them identical. |
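When the variable lists are long, the comparison can also be done in code. The following is a minimal sketch and is not part of the "Food Sources" program. It assumes the NH libref from the program above is assigned; the dataset names vars_b, vars_c, and name_changes are hypothetical. It writes each variable list to a dataset with the OUT= option of PROC CONTENTS and then lists the variables whose labels match but whose names differ.

```sas
/* Assumes the NH libref from the program above is already assigned.     */
/* vars_b, vars_c, and name_changes are hypothetical WORK dataset names. */
proc contents data=NH.DRXIFF_B out=work.vars_b (keep=name label) noprint;
run;

proc contents data=NH.DR1IFF_C out=work.vars_c (keep=name label) noprint;
run;

proc sql;
    /* Pair variables by label and keep the pairs whose names differ. */
    create table work.name_changes as
    select b.label as label,
           b.name  as name_0102,
           c.name  as name_0304
    from work.vars_b as b
         inner join work.vars_c as c
         on upcase(strip(b.label)) = upcase(strip(c.label))
    where upcase(b.name) ne upcase(c.name)
    order by b.label;
quit;

proc print data=work.name_changes noobs;
run;
```

Matching on labels only finds renamed variables. It cannot reveal a change in units or definitions when the name and label stay the same, so the documentation still has to be read.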
Step 2: Append Directly, If Variables are Identical
After carefully reviewing the demographic files, you will
find that the variables of interest in the two cycles remain the same.
Therefore, you can directly append without any further changes.
Because you are interested only in a subset of the
variables, you can use the KEEP= data set option to select relevant variables.
When appending NHANES data you should always include the
sequence number (SEQN). Failing to do so will lead to
problems if you want to sort or merge your data files at a later
time. |
As a reminder, the sample code below is taken from the "Food Sources" program. No output is associated with this procedure, so you will
need to check the SAS log file to make sure that the procedure was completed
successfully. Additionally, you can use SAS Explorer to see that the new 4-year
dataset (DEMO_4YR) is in your WORK library, which is the default temporary
library created for each SAS session. This library is deleted when the SAS
session is complete. (To find out how to save the dataset to a SAS-accessible
library, see the
Save a Dataset module.)
Program to Directly Append Datasets
|
*-------------------------------------------------------------------------;
* The DATA step creates a dataset for your 4 years of demographic data    ;
* (DEMO_4YR).                                                             ;
*                                                                         ;
* The SET statement appends the 2003-2004 demographic data file           ;
* (NH.DEMO_C) to the 2001-2002 demographic data file (NH.DEMO_B).         ;
*                                                                         ;
* The KEEP= data set option selects the variables of interest. Notice     ;
* that the sequence number variable (SEQN) is included in the KEEP= list. ;
* This variable should be included when datasets are appended.            ;
*                                                                         ;
* The SDMVPSU and SDMVSTRA variables are included in the dataset in order ;
* to incorporate survey design information in later analyses.             ;
*                                                                         ;
* Note that WTMEC2YR is the weight variable for all persons examined in   ;
* the MEC and is appropriate for use with dietary recall data. Weights    ;
* must be used in order for your analysis to be generalizable to the      ;
* total population. For more information on weighting, see the Overview   ;
* of NHANES Survey Design and Weights module in the NHANES Dietary Data   ;
* Survey Orientation Course.                                              ;
*-------------------------------------------------------------------------;
data DEMO_4YR;
    set NH.DEMO_B (keep=SEQN RIDAGEYR SDMVPSU SDMVSTRA)
        NH.DEMO_C (keep=SEQN RIDAGEYR SDMVPSU SDMVSTRA);
run; |
Step 3: Rename Variables and/or Recode Variables Before Appending, If Variables are Different
If the variables in your datasets differ, you will need to rename
and/or recode them before you append them. For example, the
2001-2002 total nutrient intake files contain variables that were
renamed in 2003-2004. Therefore, if you append files from these
survey cycles, you will need to rename the variables first and then
append the data. If the response categories of the variables also
differ, you will need to recode them as well.
You will see in the sample code from the "Food Sources" program that the variables DRDDRSTZ,
DRXICALC, and DRDIFDCD in the 2001-2002 individual food file were renamed to
DR1DRSTZ, DR1ICALC, and DR1IFDCD, respectively, the same as the variable names
in the 2003-2004 data file. After renaming the 2001-2002 variables, you will be
ready to append the data files with selected
variables of interest.
Program to Rename Variables and Append
|
*-------------------------------------------------------------------------;
* The DATA step creates the dataset for your 4 years of dietary data      ;
* (IFF_4YR).                                                              ;
*                                                                         ;
* The KEEP statement includes only variables of interest in your dataset. ;
*                                                                         ;
* The SET statement appends the 2003-2004 dietary nutrient data file      ;
* (NH.DR1IFF_C) to the 2001-2002 dietary nutrient data file (NH.DRXIFF_B).;
*                                                                         ;
* The RENAME statement renames the variables DRDDRSTZ, DRXICALC, and      ;
* DRDIFDCD in the 2001-2002 dietary nutrient data file to DR1DRSTZ,       ;
* DR1ICALC, and DR1IFDCD, which are the names given to the same variables ;
* in the 2003-2004 dietary nutrient data file.                            ;
*-------------------------------------------------------------------------;
data IFF_4YR (keep=DR1IFDCD WTDRD1 DR1ICALC SEQN DR1DRSTZ);
    set NH.DRXIFF_B (rename=(DRDDRSTZ=DR1DRSTZ DRXICALC=DR1ICALC
                             DRDIFDCD=DR1IFDCD))
        NH.DR1IFF_C;
run; |
No output is associated with this procedure, so you will
need to check the SAS log file to make sure that the procedure completed
successfully. Additionally, you can use SAS Explorer to see that the new 4-year
dataset (IFF_4YR) is in your WORK library.
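If you need many nutrient variables rather than three, typing every old=new pair becomes tedious. The following is a sketch only and is not taken from the "Food Sources" program; the macro variable name rename_list and the output dataset name IFF_4YR_ALL are hypothetical. It builds the pairs for variables that start with DRXI from the DICTIONARY.COLUMNS table in PROC SQL. It assumes that each DRXI variable corresponds to the DR1I variable with the same suffix, which you should confirm against the PROC CONTENTS listings and the documentation.

```sas
/* Assumes the NH libref is assigned and NH.DRXIFF_B exists.              */
/* rename_list and IFF_4YR_ALL are hypothetical names for illustration.   */
proc sql noprint;
    /* Build pairs such as DRXICALC=DR1ICALC for every DRXI variable. */
    select catx('=', name, cats('DR1I', substr(name, 5)))
        into :rename_list separated by ' '
    from dictionary.columns
    where libname = 'NH'
      and memname = 'DRXIFF_B'
      and upcase(name) like 'DRXI%';
quit;

/* Write the generated pairs to the log for review before using them. */
%put NOTE: Generated rename pairs: &rename_list;

data IFF_4YR_ALL;
    /* Variables without the DRXI prefix still need explicit pairs. */
    set NH.DRXIFF_B (rename=(&rename_list
                             DRDDRSTZ=DR1DRSTZ
                             DRDIFDCD=DR1IFDCD))
        NH.DR1IFF_C;
run;
```

Variables that follow a different pattern, such as DRDISODI in 2001-2002 versus DR1ISODI in 2003-2004, are not caught by the prefix rule. Unless you add them to the RENAME= list by hand, they will appear as separate columns with missing values for one cycle.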
Step 4: Construct Weights for NHANES Analyses across Multiple Survey Cycles
In general, when combining multiple survey cycles, the basic sample weight
variable for each cycle should be divided by the number of cycles in the
combined data set. Each record's rescaled weight then serves as its weight
in the combined survey cycles. The following examples show how to
construct weights for multiple survey cycles for NHANES 2001-2002 and beyond.
Combining 2001-2002 and 2003-2004 to Produce a 4-Year Dataset
For 4 years of data from 2001-2004,
construct a weight variable as follows:
|
if SDDSRVYR=2 or SDDSRVYR=3 then MEC4YR = WTMEC2YR/2; |
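The IF statement above has to run inside a DATA step on a dataset that contains both SDDSRVYR and WTMEC2YR. As a sketch under that assumption (the dataset name DEMO_4YR_WT is hypothetical), the demographic files can be appended with these two variables kept and the 4-year weight computed in the same step. The PROC MEANS step is only a plausibility check.

```sas
/* Assumes the NH libref is assigned. DEMO_4YR_WT is a hypothetical name. */
data work.DEMO_4YR_WT;
    set NH.DEMO_B (keep=SEQN SDDSRVYR WTMEC2YR RIDAGEYR SDMVPSU SDMVSTRA)
        NH.DEMO_C (keep=SEQN SDDSRVYR WTMEC2YR RIDAGEYR SDMVPSU SDMVSTRA);

    /* Two cycles are combined, so each 2-year weight is divided by 2. */
    if SDDSRVYR in (2, 3) then MEC4YR = WTMEC2YR / 2;
run;

/* Within each cycle, the sum of MEC4YR should be half the sum of WTMEC2YR. */
proc means data=work.DEMO_4YR_WT n nmiss sum;
    class SDDSRVYR;
    var WTMEC2YR MEC4YR;
run;
```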
Combining 2001-2002, 2003-2004, and 2005-2006 to Obtain 6 Years of Data
For 6 years of
data from 2001-2006, construct a weight variable as follows:
|
if SDDSRVYR in (2,3,4) then MEC6YR = WTMEC2YR/3; |
Certain survey components were completed on subsamples, which have subsample sample weights. Subsample weights are not designed to be combined. In fact, many subsamples are mutually exclusive. If it is necessary to combine two or more subsamples for your analyses, then appropriate weights would need to be recalculated. However, details on how to recalculate weights when combining subsamples are beyond the scope of this tutorial. Therefore, it is strongly advised that you
do not attempt to combine subsamples in any analysis. |
Step 5: Check Results
After appending the data files, it is a good idea to check
the contents again to make sure that the files were appended correctly. Use
PROC CONTENTS, as demonstrated in Step 1, to check the combined files.
Consult the Program to Check Datasets' Contents and Compare Variable Names
and Labels, above, for further instruction, if necessary.
Double check variable names and labels, and make sure that
variables are renamed correctly. Pay special attention to the number of
observations in the combined dataset, which should be the sum of the
observations in the two data files.
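The observation counts can be read from the PROC CONTENTS listings, or gathered in one table. The sketch below assumes that DEMO_4YR and IFF_4YR were created in the WORK library by the programs above and that the NH libref is assigned. It queries the DICTIONARY.TABLES table in PROC SQL and then lists the contents of one combined file.

```sas
/* Assumes DEMO_4YR and IFF_4YR exist in WORK and the NH libref is assigned. */
proc sql;
    /* Show the observation count of each source file and combined file. */
    select libname, memname, nobs
    from dictionary.tables
    where (libname = 'NH'
           and memname in ('DEMO_B', 'DEMO_C', 'DRXIFF_B', 'DR1IFF_C'))
       or (libname = 'WORK'
           and memname in ('DEMO_4YR', 'IFF_4YR'))
    order by libname, memname;
quit;

/* Confirm the variable names and labels in the combined dietary file. */
proc contents data=IFF_4YR varnum;
run;
```

Based on the counts in the listings above, DEMO_4YR should contain 11039 + 10122 = 21161 observations and IFF_4YR should contain 143004 + 131164 = 274168 observations.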