Task 2: How to Identify and Recode Skip Patterns in NHANES Data
The second task is to check the data for skip patterns. To do this, you will:
Step 1: Check the Codebook for Skip Patterns
Check the codebook to determine whether a skip pattern affects the variables in your analysis. See Task 1 of the Locate Variables module for more information on how to use the documentation to obtain background information about your variables.
Skip Pattern in Osteoporosis Questionnaire Codebook

Step 2: Check the Data for Skip Patterns
After you have used the codebook to determine whether any of the variables in your analysis are part of a skip pattern, use PROC FREQ in SAS to cross-tabulate the variables in the skip pattern. The cross-tabulation confirms whether the skip pattern is present in the data. This example is from the "Supplement" program.
Program to Check Data for Skip Patterns
*--------------------------------------------------------------------;
* Use the PROC FREQ procedure to determine the frequency of each     ;
* value of the variables listed. Use the TABLES statement to list    ;
* the variables to be included in the output frequency table and the ;
* cross tabulation frequency table for the skip patterns. Note that  ;
* an asterisk indicates that a crosstab will be constructed.         ;
*--------------------------------------------------------------------;
proc freq data=DEMOOST;
    where WTINT2YR > 0 and RIAGENDR=2 and RIDAGEYR >= 20;
    tables OSQ060*OSQ070 / list missing;
    title 'Check skip pattern for osteoporosis questionnaire';
run;
Output of Program
Check skip pattern for osteoporosis questionnaire

The FREQ Procedure

                                         Cumulative  Cumulative
OSQ060    OSQ070    Frequency   Percent   Frequency     Percent
---------------------------------------------------------------
.         .                 1      0.04           1        0.04
Yes       Yes             232      8.84         233        8.88
Yes       No               85      3.24         318       12.12
Yes       9                 3      0.11         321       12.24
No        .              2292     87.38        2613       99.62
9         .                10      0.38        2623      100.00
Highlighted items from the PROC FREQ output for skip patterns:
- The output includes a cross tabulation of OSQ060 responses by OSQ070 responses. Note the large number of missing values in OSQ070 (n=2,292) for those who responded "No" to OSQ060. These participants were not asked OSQ070 because of a skip pattern. Therefore, these responses will need to be recoded, as demonstrated in Step 3, before the data are further analyzed.
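To look at the skipped group on its own, one option is to restrict the table to participants who answered "No" to OSQ060. The following is a minimal sketch and is not taken from the "Supplement" program. It assumes the same DEMOOST data set and subgroup used above, and that OSQ060=2 is the code for "No" (consistent with the recode in Step 3). The title text is illustrative.

```sas
* Sketch only: tabulate OSQ070 among participants who answered ;
* "No" (assumed code 2) to OSQ060. The MISSING option keeps    ;
* missing values of OSQ070 in the frequency table.             ;
proc freq data=DEMOOST;
    where WTINT2YR > 0 and RIAGENDR=2 and RIDAGEYR >= 20
          and OSQ060=2;
    tables OSQ070 / missing;
    title 'OSQ070 among participants who answered No to OSQ060';
run;
```

If the skip pattern holds in the data, OSQ070 should be missing for every participant in this table.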
Step 3: Create New Variables, As Necessary
Using the SAS IF, THEN, and ELSE statements in a data step, you can create new variables derived from the values of the variables in the skip pattern sequence. This example is from the "Supplement" program. For more information, see Task 4: Create New Variables.
Program to Create a New Variable
*--------------------------------------------------------------;
* Create a new variable called TREATOSTEO based on responses   ;
* to the variables OSQ060 and OSQ070.                          ;
*--------------------------------------------------------------;
data DEMOOST;
    set DEMOOST;
    if OSQ070=1 then treatOSTEO=1;
    else if OSQ070=2 or OSQ060=2 then treatOSTEO=2;
run;

*---------------------------------------------------------------;
* Use the PROC FREQ and TABLE statements to check the derived   ;
* variable (TREATOSTEO) against the original variables (OSQ060  ;
* and OSQ070).                                                  ;
*---------------------------------------------------------------;
proc freq data=DEMOOST;
    where WTINT2YR > 0 and RIAGENDR=2 and RIDAGEYR >= 20;
    table TREATOSTEO*OSQ060*OSQ070 / list missing;
    format TREATOSTEO OSQ060 OSQ070 YESNO.;
    title 'Check derived variable TREATOSTEO';
run;
Output of Program
Check derived variable TREATOSTEO

The FREQ Procedure

                                                       Cumulative  Cumulative
treatOSTEO    OSQ060    OSQ070    Frequency   Percent   Frequency     Percent
-----------------------------------------------------------------------------
.             .         .                11      0.42          11        0.42
.             Yes       .                 3      0.11          14        0.53
Yes           Yes       Yes             232      8.84         246        9.38
No            Yes       No               85      3.24         331       12.62
No            No        .              2292     87.38        2623      100.00
Highlighted items from the recode output for skip patterns:
- 232 participants are coded as "1" (or "Yes") for the new variable TREATOSTEO, which is correct based on OSQ060 and OSQ070.
- Notice that 2,377 participants are coded as "2" (or "No") for the new variable TREATOSTEO: 2,292 participants were not asked OSQ070 (i.e., OSQ070 is missing) because they answered "No" to OSQ060, and 85 participants answered "Yes" to OSQ060 but "No" to OSQ070.
- Based on their responses to OSQ060 and OSQ070, 14 participants still have missing values for the TREATOSTEO variable.
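To see which original responses leave TREATOSTEO missing, one option is to cross-tabulate OSQ060 by OSQ070 for only those participants. The following is a minimal sketch and is not taken from the "Supplement" program. It assumes the DEMOOST data set already contains treatOSTEO from the data step above, and the title text is illustrative. No FORMAT statement is used, so the raw codes are displayed.

```sas
* Sketch only: list the OSQ060 and OSQ070 combinations for      ;
* participants whose derived variable treatOSTEO is missing.    ;
* The MISSING function returns 1 when its argument is missing.  ;
proc freq data=DEMOOST;
    where WTINT2YR > 0 and RIAGENDR=2 and RIDAGEYR >= 20
          and missing(treatOSTEO);
    tables OSQ060*OSQ070 / list missing;
    title 'Original responses for participants with missing TREATOSTEO';
run;
```

Judging from the Step 2 output, these 14 participants should be the 1 with both variables missing, the 3 who answered "Yes" to OSQ060 with a code of 9 for OSQ070, and the 10 with a code of 9 for OSQ060 (1 + 3 + 10 = 14).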