Task 1: How to Set Up a T-Test in NHANES Using SUDAAN

In this task, you will use SUDAAN to calculate a t-statistic and assess whether mean calcium intake differs significantly between males and females ages 20 and older.

Step 1: Sort Data

Before running any SUDAAN procedure, sort the data by strata and PSUs, using the PROC SORT procedure.

Step 2: Compute Properly Weighted Estimated Means

Use the PROC DESCRIPT procedure to generate means, and specify the sample design with the design option WR (with replacement). Use the NEST statement with the strata and PSU variables to account for the design effects, and the WEIGHT statement to account for unequal probabilities of selection and non-response. Use the SUBPOPN statement to select the population of interest. For accurate estimates of the standard error, select the subgroup with the SUBPOPN statement in SUDAAN rather than by subsetting the data file in SAS beforehand. Use a CLASS statement to define the categorical variables in the analysis, with the NOFREQ option to suppress frequencies. Use the VAR statement to name the continuous variable to be averaged, here calcium intake.
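One way to keep the full design information while still restricting the analysis is to flag the subgroup in a DATA step and then reference the flag in SUBPOPN. The sketch below assumes this approach; the data set name CALCMILK2 and the indicator ADULT are hypothetical names introduced only for illustration and are not part of the NHANES files.

```sas
*-------------------------------------------------------------------------;
* Sketch only: CALCMILK2 and ADULT are hypothetical names.                ;
* All records are kept so that every stratum and PSU stays in the file.   ;
*-------------------------------------------------------------------------;

data CALCMILK2;
  set CALCMILK;
  ADULT = (RIDAGEYR >= 20);   /* 1 = age 20 or older, 0 = otherwise */
run;

proc sort data=CALCMILK2;
  by SDMVSTRA SDMVPSU;
run;

proc descript data=CALCMILK2 design=wr;
  nest SDMVSTRA SDMVPSU;
  weight WTDRD1;
  subpopn ADULT = 1;          /* subgroup chosen in SUDAAN, not by deleting rows */
  class RIAGENDR/nofreq;
  var DR1TCALC;
  print nsum mean semean/style=nchs;
run;
```

The point of the design is that no observations are dropped before SUDAAN sees the file; only the SUBPOPN statement narrows the analysis.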

Sample Code

*-------------------------------------------------------------------------;
* Use the PROC SORT procedure to sort the data files by strata and PSU.   ;
* Data must always be sorted before running a SUDAAN procedure.           ;
*                                                                         ;
* Use the PROC DESCRIPT procedure to estimate the mean dietary calcium    ;
* intake (DR1TCALC) by gender (RIAGENDR) in males and females ages 20     ;
* and older.                                                              ;
*-------------------------------------------------------------------------;

proc sort data =CALCMILK;
by SDMVSTRA SDMVPSU;
run ;

proc descript data=CALCMILK design=wr;
nest SDMVSTRA SDMVPSU;
weight WTDRD1;
subpopn RIDAGEYR >= 20 ;
class RIAGENDR/nofreq;
var DR1TCALC;
print nsum mean semean/style=nchs;
rformat RIAGENDR GENDER. ;
rtitle "Mean dietary calcium intake in males and females >= 20 years"
"of age"
;
run ;

Output of Program

```
Mean dietary calcium intake in males and females >= 20 years of age

Number of observations read    :   9034    Weighted count :286222757
Number of observations skipped :   1088
(WEIGHT variable nonpositive)
Observations in subpopulation  :   4448    Weighted count:205284669
Denominator degrees of freedom :     15

Variance Estimation Method: Taylor Series (WR)
For Subpopulation: RIDAGEYR >= 20
Mean dietary calcium intake in males and females >= 20 years of age

---------------------------------------------------------
Variable              Sample
  Gender                Size         Mean      SE Mean
---------------------------------------------------------
Calcium (mg)
Total                   4448       880.13        16.72
Male                    2135       998.36        21.81
Female                  2313       770.73        15.29
---------------------------------------------------------
```

Highlights from the output include:

• 4,448 respondents ages 20 and older were included in this analysis; 2,135 respondents were male and 2,313 were female.
• The mean calcium intake for males was 998.36 mg and the mean calcium intake for females was 770.73 mg.
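As a rough check on how precise these means are, the reported standard errors and the 15 denominator degrees of freedom can be turned into approximate 95% confidence intervals. The DATA step below is a hand calculation under that assumption, not SUDAAN output, and its variable names are made up for this sketch.

```sas
data _null_;
  df     = 15;                 /* denominator degrees of freedom from the output */
  t_crit = tinv(0.975, df);    /* two-sided 95% critical value of the t distribution */

  mean_male   = 998.36;  se_male   = 21.81;   /* from the output above */
  mean_female = 770.73;  se_female = 15.29;   /* from the output above */

  lower_male   = mean_male   - t_crit * se_male;
  upper_male   = mean_male   + t_crit * se_male;
  lower_female = mean_female - t_crit * se_female;
  upper_female = mean_female + t_crit * se_female;

  /* Results are written to the SAS log */
  put lower_male= 8.2 upper_male= 8.2;
  put lower_female= 8.2 upper_female= 8.2;
run;
```

Comparing two separate intervals is only an informal guide; the formal comparison of the two means is the t-test in the next step.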

Step 3: Use a t-test to Test for Significance

In this case, a t-test is used to test whether the mean calcium intake by males is statistically different from the mean calcium intake by females. The program below is identical to the program presented in Step 2 except for the added CONTRAST statement and the statistics requested in the PRINT statement (t_mean and p_mean in place of mean and semean). The CONTRAST statement tests the hypothesis that the difference in the means is equal to 0, that is, that the mean calcium intake by males is equal to that by females.
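To illustrate what a contrast with coefficients 1 and -1 estimates, the DATA step below applies those weights to the two means reported in Step 2. It is a hand calculation rather than SUDAAN output, and the variable names are hypothetical.

```sas
data _null_;
  mean_male   = 998.36;   /* Step 2 output, RIAGENDR = 1 */
  mean_female = 770.73;   /* Step 2 output, RIAGENDR = 2 */

  c_male   =  1;          /* contrast coefficient for males   */
  c_female = -1;          /* contrast coefficient for females */

  /* The contrast is the weighted sum of the group means */
  contrast_est = c_male * mean_male + c_female * mean_female;

  put contrast_est= 8.2;  /* written to the SAS log */
run;
```

With these coefficients the contrast is simply the male mean minus the female mean, a difference of 227.63 mg; the t-test asks whether that difference is distinguishable from 0.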

Sample Code

*-------------------------------------------------------------------------;
* Use the PROC SORT procedure to sort the data files by strata and PSU.   ;
* Data must always be sorted before running a SUDAAN procedure.           ;
*                                                                         ;
* Use the PROC DESCRIPT procedure and the CONTRAST statement to perform a ;
* t-test.  This will test whether the mean dietary calcium intake         ;
* (DR1TCALC) in males and females is significantly different.             ;
*-------------------------------------------------------------------------;

proc sort data =CALCMILK;
by SDMVSTRA SDMVPSU;
run ;

proc descript data=CALCMILK design=wr;
nest SDMVSTRA SDMVPSU;
weight WTDRD1;
subpopn RIDAGEYR >= 20 ;
class RIAGENDR/nofreq;
var DR1TCALC;
contrast RIAGENDR = ( 1 -1 )/name = "Males vs. Females" ;
print nsum t_mean p_mean/style=nchs;
rformat RIAGENDR GENDER. ;
rtitle "Mean dietary calcium intake in males and females >= 20 years"
"of age"
;
run ;

Output of Program

```
Number of observations read    :   9034    Weighted count :286222757
Number of observations skipped :   1088
(WEIGHT variable nonpositive)
Observations in subpopulation  :   4448    Weighted count:205284669
Denominator degrees of freedom :     15

Variance Estimation Method: Taylor Series (WR)
For Subpopulation: RIDAGEYR >= 20
Mean dietary calcium intake in males and females >= 20 years
of age
by: Variable, One, Contrast.

for: Variable = Calcium (mg).

-------------------------------------------------------
One                                T-Test     P-value
  Contrast            Sample    Cont.Mean      T-Test
                        Size           =0  Cont.Mean=0
-------------------------------------------------------
Total
Males vs. Females       4448        13.03     0.0000
1
Males vs. Females       4448        13.03     0.0000
-------------------------------------------------------
```

Highlights from the output include:

• 4,448 respondents were included in this analysis.
• The null hypothesis is that the mean calcium intake for males equals the mean calcium intake for females, that is, that calcium intake is unrelated to gender. The t-statistic for this test is 13.03. The p-value is reported as 0.0000, meaning it is less than 0.00005 when rounded to four decimal places.
• Therefore, the null hypothesis is rejected at the 0.05 level and it is concluded that the mean calcium intake by males does not equal the mean calcium intake by females.
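The decision above can be reproduced by hand from the reported t-statistic and the 15 denominator degrees of freedom. The DATA step below is a minimal sketch of that arithmetic using the SAS functions TINV and PROBT; it is not part of the SUDAAN program, its variable names are hypothetical, and the implied standard error is only approximate because the t-statistic in the output is rounded.

```sas
data _null_;
  t_stat = 13.03;                    /* t-statistic from the output above */
  df     = 15;                       /* denominator degrees of freedom    */
  diff   = 998.36 - 770.73;          /* male mean minus female mean       */

  se_diff = diff / t_stat;           /* implied SE of the difference (approximate) */
  t_crit  = tinv(0.975, df);         /* two-sided critical value at the 0.05 level */
  p_two_sided = 2 * (1 - probt(abs(t_stat), df));

  reject_h0 = (abs(t_stat) > t_crit);   /* 1 = reject the null hypothesis */

  /* Results are written to the SAS log */
  put se_diff= 8.2 t_crit= 8.3 reject_h0=;
  put p_two_sided= e10.;
run;
```

Printing the p-value in scientific notation shows why the output displays 0.0000: the value is far smaller than four decimal places can represent.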