In this example, you will calculate mean dietary calcium intake. The mean and its standard error are obtained directly from the SUDAAN DESCRIPT procedure and then output to a SAS dataset, where the confidence intervals are constructed.
Before running any SUDAAN procedure, sort the data by strata and PSUs using PROC SORT.
Use PROC DESCRIPT to generate means. Specify the ATLEVEL1=1 and ATLEVEL2=2 options on the PROC DESCRIPT statement to identify the sampling stages (in NHANES, strata are level 1 and PSUs are level 2) for which you want counts per table cell. ATLEV1 is the number of strata with at least one valid observation, and ATLEV2 is the number of PSUs with at least one valid observation. These counts are used to calculate the degrees of freedom.
Use the NEST statement to identify the strata and PSU variables so that the variance estimates account for the survey design, and the WEIGHT statement to account for unequal probabilities of selection and non-response. Use the SUBPOPN statement to select the subpopulation of interest. Use a CLASS statement to list the discrete variables that define the subgroups and a VAR statement to list the analysis variables. Use the TABLE statement to obtain results for each gender.
The PRINT statement allows you to print the number of observations (NSUM), means (MEAN), and standard error of the mean (SEMEAN). The OUTPUT statement outputs the number of observations (NSUM), means (MEAN), standard error of the mean (SEMEAN), number of strata (ATLEV1), and number of PSUs (ATLEV2) to a SAS file named CALC0304.
*-------------------------------------------------------------------------;
proc sort data=CALCMILK;
proc descript data=CALCMILK atlevel1=1 atlevel2=2;
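The statements described above can be sketched as a fuller program. This is a minimal sketch under stated assumptions, not the tutorial's own code: the variable names SDMVSTRA, SDMVPSU, WTDRD1, RIDAGEYR, RIAGENDR, and DR1TCALC are assumed NHANES names that do not appear in the excerpt, and the DESIGN=WR, MISSUNIT, NAME=, FILENAME=, and REPLACE options are likewise assumed choices (DESIGN=WR is at least consistent with the "Taylor Series (WR)" line in the output).

```sas
/* Sketch only: SDMVSTRA, SDMVPSU, WTDRD1, RIDAGEYR, RIAGENDR and DR1TCALC
   are assumed NHANES variable names; they are not shown in the excerpt.  */

/* SUDAAN expects the input sorted by the NEST variables (stratum, PSU). */
proc sort data=CALCMILK;
  by sdmvstra sdmvpsu;
run;

proc descript data=CALCMILK design=wr atlevel1=1 atlevel2=2;
  nest sdmvstra sdmvpsu / missunit;      /* strata, then PSUs              */
  weight wtdrd1;                         /* assumed dietary sample weight  */
  subpopn ridageyr >= 20 / name="Adults 20 years of age and older";
  class riagendr;                        /* subgroup variable (gender)     */
  var dr1tcalc;                          /* assumed calcium intake (mg)    */
  table riagendr;                        /* one row per gender, plus total */
  print nsum mean semean;
  /* Write the estimates and the stratum/PSU counts to CALC0304. */
  output nsum mean semean atlev1 atlev2 / filename=CALC0304 replace;
run;
```

Selecting adults with SUBPOPN, rather than deleting the other records beforehand, keeps the full design information available for variance estimation.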
Use a DATA statement to create a new dataset called NEWCALC0304. Calculate the degrees of freedom (DF) as the number of PSUs (ATLEV2) minus the number of strata (ATLEV1). Use a DROP statement to drop selected variables from the dataset. Use a series of assignment statements to calculate the lower limit of the confidence interval (LL), the upper limit (UL), the mean (MEAN), and the width of the confidence interval (CIWIDTH). Use PROC PRINT to output these data.
*-------------------------------------------------------------------------;
data NEWCALC0304;
*-------------------------------------------------------------------------;
proc print split='/' noobs;
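The confidence-interval step can be sketched as follows. This is a hedged illustration, not the tutorial's own code: it assumes a 95% confidence level (which matches the printed limits), assumes the limits and mean are rounded to whole milligrams, and assumes the gender variable in CALC0304 is named RIAGENDR. Only NSUM, MEAN, SEMEAN, ATLEV1, and ATLEV2 are named in the text.

```sas
/* Sketch only: the 95% level, the rounding, and the RIAGENDR name are assumptions. */
data NEWCALC0304;
  set CALC0304;
  df = atlev2 - atlev1;                      /* PSUs minus strata           */
  ll = round(mean + tinv(0.025, df)*semean); /* lower limit (t quantile < 0) */
  ul = round(mean + tinv(0.975, df)*semean); /* upper limit                 */
  mean = round(mean);                        /* rounded after limits are computed */
  ciwidth = ul - ll;                         /* width of the rounded interval */
  drop atlev1 atlev2;                        /* no longer needed once DF is set */
run;

/* split='/' breaks the column labels at each slash. */
proc print data=NEWCALC0304 split='/' noobs;
  var riagendr nsum mean semean df ll ul ciwidth;
  label nsum    = 'Sample/Size'
        mean    = 'Mean'
        semean  = 'SE/Mean'
        df      = 'Degrees/of/freedom'
        ll      = 'Lower/Limit'
        ul      = 'Upper/limit'
        ciwidth = 'Confidence/interval/width';
run;
```

Computing the limits before rounding the mean matters: rounding MEAN first would shift both limits by the rounding error.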
Number of observations read    :   9034    Weighted count : 286222757
Number of observations skipped :   1088
(WEIGHT variable nonpositive)
Observations in subpopulation  :   4448    Weighted count : 205284669
Denominator degrees of freedom :     15

Variance Estimation Method: Taylor Series (WR)
For Subpopulation: Adults 20 years of age and older
by: Variable, Gender - Adjudicated.

-------------------------------------------------
Variable
  Gender -         Sample                 SE
  Adjudicated        Size       Mean    Mean
-------------------------------------------------
Calcium (mg)
  Total              4448        880    16.7
  Male               2135        998    21.8
  Female             2313        771    15.3
-------------------------------------------------

                                     Degrees                    Confidence
Gender -       Sample          SE         of    Lower    Upper    interval
Adjudicated      Size   Mean  Mean   freedom    Limit    limit       width
0                4448    880  16.7        15      844      916          72
Male             2135    998  21.8        15      952     1045          93
Female           2313    771  15.3        15      738      803          65
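As a check on the printed limits, the Total row can be reproduced by hand. This is a worked sketch that assumes a 95% confidence level and uses the rounded mean and standard error shown in the output, so the last digit can differ slightly from limits computed on unrounded values.

```latex
\begin{align*}
\text{CI} &= \bar{x} \pm t_{0.975,\,df} \times SE(\bar{x}), \qquad df = 15,\quad t_{0.975,\,15} \approx 2.131 \\
          &= 880 \pm 2.131 \times 16.7 \\
          &\approx 880 \pm 35.6 \\
          &\approx (844.4,\ 915.6)
\end{align*}
```

Rounded to whole milligrams, this gives limits of 844 and 916, matching the Total row (labeled 0 in the printed dataset). The reported width of 72 is the difference between the rounded limits.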
Highlights from the output include: