In this module, you will use NHANES data to assess the association between several risk factors and the likelihood of having hypertension for participants 20 years and older. The dependent variable Y is hypertension, and the independent variables Xj, or covariates, are age, gender, high cholesterol, body mass index, and fasting triglycerides. In this task, you will review only the multivariate logistic regression procedure.
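For orientation, the model being fitted can be written in the standard logistic regression form. This is a sketch of the usual textbook notation rather than a formula taken from this module: the coefficients β0 and βj and the count p of covariate terms are symbols introduced here for illustration.

```latex
\[
\operatorname{logit} P(Y = 1 \mid X_1, \dots, X_p)
  = \ln\!\left( \frac{P(Y = 1 \mid X_1, \dots, X_p)}{1 - P(Y = 1 \mid X_1, \dots, X_p)} \right)
  = \beta_0 + \sum_{j=1}^{p} \beta_j X_j
\]
```

Under this form, exp(βj) is the odds ratio for hypertension associated with a one-unit increase in Xj (or, for a categorical covariate, with membership in a level relative to its reference group), holding the other covariates fixed.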
To analyze a subpopulation with the SAS Survey Procedures, you need to create a variable that identifies the population of interest. Do not use a where clause or by-group processing to subset the data, because dropping observations removes design information that the procedures need to estimate variances correctly.
In this example, the sel variable is set to 1 if the sample person is 20 years or older, and 2 if the sample person is younger than 20 years. Then this variable is used in the domain statement to specify the population of interest (those 20 years and older).
if ridageyr GE 20 then sel = 1;
else sel = 2;
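The two assignment lines above belong inside a DATA step. The following is a minimal sketch, assuming the analysis dataset is named Analysis_Data (the name used in the procedure call in this module) and that it already contains ridageyr. The proc freq step is an optional, unweighted check that is not part of the original program.

```sas
/* Sketch: add the subpopulation indicator to the analysis dataset. */
/* Assumes Analysis_Data already exists and contains ridageyr.      */
data Analysis_Data;
    set Analysis_Data;
    if ridageyr GE 20 then sel = 1;   /* 20 years and older */
    else sel = 2;                     /* younger than 20    */
run;

/* Optional check: unweighted counts in each domain level. */
proc freq data=Analysis_Data;
    tables sel;
run;
```

Note that SAS treats a missing numeric value as smaller than any number, so a record with a missing ridageyr would be assigned sel = 2 by this logic.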
This step introduces the SAS survey procedure for multivariate logistic regression, proc surveylogistic. The table below summarizes the SAS program.
| Statements | Explanation |
|---|---|
| PROC SURVEYLOGISTIC DATA = Analysis_Data nomcar; | Use the proc surveylogistic procedure to perform multiple logistic regression to assess the association between hypertension and multiple risk factors: age, gender, high cholesterol, body mass index, and fasting triglycerides. Use the nomcar option to treat missing values as not missing completely at random, so that observations with missing values are still accounted for in the variance computation. |
| STRATUM sdmvstra; | Use the stratum statement to specify strata to account for design effects of stratification. |
| CLUSTER sdmvpsu; | Use the cluster statement to specify primary sampling unit (PSU) to account for design effects of clustering. |
| WEIGHT wtsafyr; | Use the weight statement to account for the unequal probability of sampling and non-response. In this example, the 4-year fasting weight variable is used. |
| DOMAIN sel; | Use the domain statement to specify the subpopulation of interest. |
| CLASS age (PARAM=REF REF='40-59 yrs') riagendr (PARAM=REF REF='Female') hichol (PARAM=REF REF='high cholesterol') bmigrp (PARAM=REF REF='25<=BMI<30'); | Use the class statement to specify all categorical variables in the model. Use the param and ref options to choose the reference group for each categorical variable. |
| MODEL hyper (desc) = age riagendr hichol bmigrp logtrig / vadjust=none; | Use the model statement to specify the dependent variable and all independent variables in your logistic regression model. The desc option reverses the default response ordering, so the model estimates the probability of the higher-sorted value of hyper. The vadjust option specifies whether or not to use variance adjustment; vadjust=none requests no adjustment. |
| FORMAT age agefmt. riagendr sexfmt. hichol chfmt. bmigrp bmifmt.; RUN; | Use the format statement to apply the SAS formats to all formatted variables. |
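Assembled into a single step, the statements in the table can be sketched as follows. Every variable, format, and dataset name is copied from the table as written. The sketch assumes that the formats (agefmt., sexfmt., chfmt., bmifmt.) have already been defined with proc format and that sel has already been created in Analysis_Data.

```sas
/* Sketch: the table's statements combined into one procedure call. */
/* Assumes the formats below were defined earlier with PROC FORMAT  */
/* and that sel has been added to Analysis_Data.                    */
proc surveylogistic data=Analysis_Data nomcar;
    stratum sdmvstra;                 /* design strata                  */
    cluster sdmvpsu;                  /* primary sampling units         */
    weight  wtsafyr;                  /* fasting subsample weight       */
    domain  sel;                      /* subpopulation indicator        */
    class age      (param=ref ref='40-59 yrs')
          riagendr (param=ref ref='Female')
          hichol   (param=ref ref='high cholesterol')
          bmigrp   (param=ref ref='25<=BMI<30');
    model hyper (desc) = age riagendr hichol bmigrp logtrig / vadjust=none;
    format age agefmt. riagendr sexfmt. hichol chfmt. bmigrp bmifmt.;
run;
```

Because the ref= values are matched against formatted values, the strings in the class statement need to agree exactly with the labels defined in the corresponding formats. The domain statement produces results for each level of sel; the results for sel = 1 are the ones that apply to participants 20 years and older.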
In this step, the SAS output is reviewed. You can compare your results with the sample output, which you can download from the Sample Code and Datasets page. Or, you can view an animated version of the results with narration by clicking the link below. In the narration, the highlighted elements show that: