Task 2: How to Generate Confidence Intervals Using SUDAAN

In this example, you will calculate means of dietary calcium intake. The mean and its standard error are obtained directly from the DESCRIPT procedure in SUDAAN and then output to a SAS dataset, where the confidence intervals can be constructed.


Step 1: Sort Data

Before running any SUDAAN procedure, use the SORT procedure to sort the data by strata (SDMVSTRA) and PSUs (SDMVPSU), the same variables that are listed in the NEST statement.


Step 2: Generate Means

Use the DESCRIPT procedure to generate means. Use the ATLEVEL1=1 and ATLEVEL2=2 options in the PROC DESCRIPT statement to specify the sampling stages for which you want counts per table cell (in NHANES, strata are level 1 and PSUs are level 2). ATLEV1 is the number of strata with at least one valid observation and ATLEV2 is the number of PSUs with at least one valid observation. These counts are used to calculate degrees of freedom.

Use the NEST statement to account for the design effects of the survey and the WEIGHT statement to account for the unequal probability of sampling and non-response.  Use the SUBPOPN statement to select the subpopulation of interest.  Use a CLASS statement to list the discrete variables upon which subgroups are based and a VAR statement to list variables in the analysis.  Use the TABLE statement to obtain results for each gender.

The PRINT statement allows you to print the number of observations (NSUM), means (MEAN), and standard error of the mean (SEMEAN).  The OUTPUT statement outputs the number of observations (NSUM), means (MEAN), standard error of the mean (SEMEAN), number of strata (ATLEV1), and number of PSUs (ATLEV2) to a SAS file named CALC0304.


Calculate Mean Calcium Intake, in Milligrams, among Males and Females Ages 20 Years and Older Using SUDAAN

Sample Code

* Use the PROC SORT procedure to sort the data files by strata and PSU.   ;
* Data must always be sorted before running a SUDAAN procedure.           ;
*                                                                         ;
* Use the PROC DESCRIPT procedure to estimate the mean dietary calcium    ;
* intake (DR1TCALC) by gender (RIAGENDR) in males and females ages 20     ;
* and older.   These statistics will be output into a new dataset called   ;
* CALC0304 where the confidence intervals can be constructed directly.    ;

proc sort data=CALCMILK;
      by SDMVSTRA SDMVPSU;
run;

proc descript data=CALCMILK atlevel1=1 atlevel2=2;
      nest SDMVSTRA SDMVPSU;
      weight WTDRD1;
      subpopn RIDAGEYR >= 20 / name="Adults 20 years of age and older";
      class RIAGENDR / nofreq;
      var DR1TCALC;
      table RIAGENDR;
      rformat RIAGENDR GENDER.;
      print nsum mean semean / style=nchs meanfmt=f6.0 semeanfmt=f6.1;
      output nsum mean semean atlev1 atlev2 / filename=CALC0304 replace;
run;
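
Before moving on, it can help to confirm what the OUTPUT statement actually wrote to CALC0304. The following is a minimal sketch using standard SAS procedures; it assumes that CALC0304 was created as a temporary SAS dataset by the step above, and it is not part of the original sample program.

```sas
* Sketch only: inspect the dataset written by the SUDAAN OUTPUT statement. ;
proc contents data=CALC0304;      /* list variable names and types */
run;

proc print data=CALC0304 noobs;   /* show one row per table cell */
run;
```

The listing should make it easy to see which variables (for example NSUM, MEAN, SEMEAN, ATLEV1, and ATLEV2) are available for the confidence interval calculations in the next step.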


Step 3: Create New Dataset

Use a DATA step to create a new dataset called NEWCALC0304 from CALC0304. Calculate the degrees of freedom (DF) as the number of PSUs (ATLEV2) minus the number of strata (ATLEV1). Then calculate the lower limit (LL) and upper limit (UL) of the 95% confidence interval, round the mean (MEAN) and its standard error (SEMEAN), and calculate the width of the confidence interval (CIWIDTH). Use a DROP statement to drop variables that are no longer needed. Use the PRINT procedure to output the results.
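
The role of the degrees of freedom in the confidence limits can be sketched as follows. The stratum and PSU counts below are assumed values, chosen only so that the difference is 15; the variable names T_LO and T_HI are hypothetical. TINV is the SAS function that returns a quantile of the t distribution.

```sas
* Sketch only: the counts below are assumed values chosen to give 15 df. ;
data _null_;
      atlev1 = 15;                 /* assumed number of strata            */
      atlev2 = 30;                 /* assumed number of PSUs              */
      df     = atlev2 - atlev1;    /* degrees of freedom = PSUs - strata  */
      t_lo   = tinv(.025, df);     /* 2.5th percentile of t (negative)    */
      t_hi   = tinv(.975, df);     /* 97.5th percentile of t (positive)   */
      put df= t_lo= 8.4 t_hi= 8.4; /* write the values to the SAS log     */
run;
```

With 15 degrees of freedom the two quantiles are roughly -2.13 and +2.13. Because the lower quantile is negative, adding TINV(.025, DF) times the standard error to the mean gives the lower limit, and adding TINV(.975, DF) times the standard error gives the upper limit.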

Calculate Confidence Intervals of Mean Calcium Intakes from SAS Output Dataset

Sample Code

* Create a new dataset called NEWCALC0304 which is based on the dataset   ;
* created in the last SUDAAN procedure.  Confidence intervals around the  ;
* means are calculated from the means and standard errors it contains.    ;

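data NEWCALC0304;
      set CALC0304;
      df=atlev2-atlev1;                      /* PSUs minus strata          */
      ll=round(mean+tinv(.025,df)*semean);   /* lower 95% limit            */
      ul=round(mean+tinv(.975,df)*semean);   /* upper 95% limit            */
      ciwidth=ul-ll;                         /* width of the interval      */
      mean=round(mean);
      semean=round(semean,.1);
      drop atlev1 atlev2;                    /* no longer needed after df  */
run;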

* Use the PROC PRINT procedure to output the confidence intervals.        ;

proc print data=NEWCALC0304 split='/' noobs;
      format riagendr sex. nsum 7.0 mean 6.0 semean 6.1 df 2.0;
      label ll='Lower/Limit' ul='Upper/limit' df='Degrees/of/freedom'
            ciwidth='Confidence/interval/width';
      title1 'Mean of dietary calcium intake and 95% confidence interval';
      title2 'of males and females ages 20 years and older';
run;

Output of Program

Number of observations read    :   9034    Weighted count :286222757           
Number of observations skipped :   1088                                        
(WEIGHT variable nonpositive)                                                  
Observations in subpopulation  :   4448    Weighted count:205284669            
Denominator degrees of freedom :     15                                        
Variance Estimation Method: Taylor Series (WR)                                 
For Subpopulation: Adults 20 years of age and older                            
by: Variable, Gender - Adjudicated.                                            
   Gender -            Sample              SE                                  
     Adjudicated       Size         Mean   Mean                                
Calcium (mg)                                                                   
   Total                   4448      880     16.7                              
   Male                    2135      998     21.8                              
   Female                  2313      771     15.3                              

                                          Degrees                   Confidence 
 Gender -      Sample                SE     of      Lower   Upper    interval  
Adjudicated      Size     Mean     Mean   freedom   Limit   limit      width   
       0         4448      880     16.7     15       844      916       72     
  Male           2135      998     21.8     15       952     1045       93     
  Female         2313      771     15.3     15       738      803       65      
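
As a rough check on the table above, the limits can be recomputed from the printed values. This is a sketch only: the dataset name CHECKCI and the variables GROUP and HALFWIDTH are hypothetical, and the means and standard errors are typed in from the printed output, where they have already been rounded. For that reason the recomputed limits may differ from the program output by one unit.

```sas
* Sketch only: values typed in from the printed output above. ;
data CHECKCI;
      input group $ mean semean df;
      halfwidth = tinv(.975, df) * semean;   /* t quantile times the SE */
      ll = round(mean - halfwidth);          /* lower 95% limit         */
      ul = round(mean + halfwidth);          /* upper 95% limit         */
      datalines;
Total  880 16.7 15
Male   998 21.8 15
Female 771 15.3 15
;
run;

proc print data=CHECKCI noobs;
run;
```

The main program avoids this rounding issue because it computes the limits from the unrounded MEAN and SEMEAN in CALC0304 and rounds only afterwards.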

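Highlights from the output include:

Among adults ages 20 years and older, the estimated mean dietary calcium intake is 880 mg, with a 95% confidence interval of 844 to 916 mg.
For males, the estimated mean is 998 mg (95% confidence interval 952 to 1045 mg); for females, it is 771 mg (95% confidence interval 738 to 803 mg).
Each estimate is based on 15 degrees of freedom.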
