This task reviews how to recode variables so they are appropriate for your analytic needs and how to check your derived variables.
| Statements | Explanation |
|---|---|
| `data demo4_nh3; set demo3_nh3;` | Use the data and set statements to refer to your analytic dataset. |
| `if hfa8r>=0 then do; if hfa8r<12 then HIGHSCHL=1; else if hfa8r=12 then HIGHSCHL=2; else if hfa8r>12 then HIGHSCHL=3; else HIGHSCHL=.; end;` | Use the if, then, and else statements to create a simple high school education categorical variable (HIGHSCHL) from a complex categorical education variable. |
| `if (20 <= hsageir <= 39) and hsageu=2 then age3cat=1; else if (40 <= hsageir <= 59) and hsageu=2 then age3cat=2; else if hsageir >= 60 and hsageu=2 then age3cat=3;` | Use the if, then, and else statements to create an age categorical variable (age3cat) from a continuous variable. |
| `n_sbp = n(of pep6g1 pep6h1 pep6i1);` | Use these function statements to count the number of systolic and diastolic blood pressure readings. Then use the array statement (where _DBP is the name of the array) to set any diastolic blood pressure readings of "0" to missing, so that a reading of "0" does not affect the blood pressure means. |
| `mean_sbp = mean(of pep6g1 pep6h1 pep6i1); mean_dbp = mean(of pep6g3 pep6h3 pep6i3);` | Use these function statements to calculate mean systolic and diastolic blood pressures. |
| `if n_sbp>0 and n_dbp>0 then do;` | Use the if, then, and else statements to define a new variable, hbp (high blood pressure = 1 or 0), based on a series of conditions that indicate hypertension from the questionnaire and examination variables. |
| `if tcp>=240 then HLP_lab=1; else if tcp>=0 then HLP_lab=0; if HLP_lab>=0 and CHOLMED>=0 then do; if HLP_lab=1 or CHOLMED=1 then HLP=1; else HLP=0; end;` | Use the if, then, and else statements to define a new variable, hlp (hyperlipidemia = 1 or 0), based on a series of conditions that indicate high lipid levels from the questionnaire and examination variables. |
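
The blood pressure rows in the table above show only part of the code they describe. As a minimal sketch of how those steps might fit together, the data step below counts the readings, sets diastolic readings of 0 to missing, computes the means, and derives hbp. Several pieces are assumptions rather than the tutorial's own code: the output dataset name `bp_sketch`, the use of `pep6g3`, `pep6h3`, and `pep6i3` as the diastolic readings (taken from the `mean_dbp` statement), the 140 and 90 cut-offs (inferred from the names `SBP140` and `DBP90`), the 0/1 coding of `HTNMED`, and the exact rule that combines them into `HBP`.

```sas
/* Sketch only: bp_sketch, the cut-offs, and the HBP rule are assumptions */
data bp_sketch;
  set demo3_nh3;

  /* Count the non-missing systolic and diastolic readings */
  n_sbp = n(of pep6g1 pep6h1 pep6i1);
  n_dbp = n(of pep6g3 pep6h3 pep6i3);

  /* Set diastolic readings of 0 to missing so they do not pull down the mean */
  array _DBP{3} pep6g3 pep6h3 pep6i3;
  do i = 1 to 3;
    if _DBP{i} = 0 then _DBP{i} = .;
  end;
  drop i;

  /* The mean function ignores missing values */
  mean_sbp = mean(of pep6g1 pep6h1 pep6i1);
  mean_dbp = mean(of pep6g3 pep6h3 pep6i3);

  /* Assumed rule: medication use or an elevated mean reading indicates hbp */
  if n_sbp > 0 and n_dbp > 0 then do;
    if mean_sbp >= 140 then SBP140 = 1;
    else if mean_sbp >= 0 then SBP140 = 0;

    if mean_dbp >= 90 then DBP90 = 1;
    else if mean_dbp >= 0 then DBP90 = 0;

    if HTNMED = 1 or SBP140 = 1 or DBP90 = 1 then HBP = 1;
    else if HTNMED = 0 and SBP140 = 0 and DBP90 = 0 then HBP = 0;
  end;
run;
```

The `else if ... >= 0` branches follow the same pattern as the `HLP_lab` recode: SAS treats a missing value as smaller than any number, so a plain `else` would assign 0 to participants whose mean is missing.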
In this step, you will confirm that the derived and recoded variables correspond correctly to the original variables.
| Statements | Explanation |
|---|---|
| `proc freq data=demo4_nh3; tables HBP*HTNMED*SBP140*DBP90/list missing;` | Use proc freq to create a cross-tabulation of the original categorical variables for high blood pressure and hyperlipidemia by their respective recoded variables. Use the where statement to select the participants who were age 20 years and older and who had both the home interview and the MEC exam. |
| `proc means data=demo4_nh3 N min max; where hsageu=2 and hsageir>=20 and dmpstat=2; var mean_SBP; class SBP140; title 'Check if SBP >=140 is defined correctly'; proc means data=demo4_nh3 N min max; proc means data=demo4_nh3 N min max;` | Use proc means to calculate the number of observations (N), minimum, and maximum values for the original continuous variables. Use the where statement to select the participants who were age 20 years and older and who had both the home interview and the MEC exam. The class statement separates the original continuous variable into the categories of the derived variable. This checks that the coding of the derived variable, which is based on cut-off points of the continuous variable, is correct. |
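
The explanations above mention hyperlipidemia, but no check for it is shown. A minimal sketch of such a check, mirroring the blood pressure checks, might look like the following. It assumes the same where subset as the `mean_SBP` check and the variables `HLP`, `HLP_lab`, `CHOLMED`, and `tcp` from the recode step; the titles are illustrative and not taken from the tutorial.

```sas
/* Sketch only: applies the same style of check to the hyperlipidemia variables */
proc freq data=demo4_nh3;
  where hsageu=2 and hsageir>=20 and dmpstat=2;
  /* List every observed combination, including missing values */
  tables HLP*HLP_lab*CHOLMED / list missing;
  title 'Check if HLP is defined correctly';
run;

proc means data=demo4_nh3 N min max;
  where hsageu=2 and hsageir>=20 and dmpstat=2;
  /* One output row per level of the derived variable */
  class HLP_lab;
  var tcp;
  title 'Check if tcp >=240 is defined correctly';
run;
```

If the recode is correct, the proc means output should show a minimum `tcp` of at least 240 where `HLP_lab` is 1 and a maximum below 240 where `HLP_lab` is 0.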
Highlighted items comparing recoded or derived variables to original variables: