In this task, you will use SUDAAN to calculate a t-statistic and assess whether mean age (ridageyr) differs significantly between participants who are obese (obese=1) and those who are not obese (obese=0), among people aged 65 and older who are on the 2005 Carrier File.

Follow the steps in the summary table below to produce the mean age using the SUDAAN procedure *proc descript*.

These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.

Statements | Explanation |
---|---|
proc sort data=DS1; by sdmvstra sdmvpsu; run; | Use the SAS procedure proc sort to sort the data by strata (sdmvstra) and PSU (sdmvpsu). |
proc descript data=DS1 design=wr; | Use the SUDAAN procedure proc descript to generate means, and specify the sample design with the design option WR (with replacement). |
nest sdmvstra sdmvpsu; | Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the design effects. |
weight wt_linkage_adj; | Use the weight statement to account for the unequal probability of sampling and nonresponse. In this example, the weight adjusted for linkage non-response is used for six years of data. |
subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1; | Use the subpopn statement to restrict the analysis to the subgroup of interest. In this example, it selects people aged 65 or older (ridageyr>=65) who linked to the Medicare files at some point during 1999-2007 (cms_medicare_match=1) and were on the 2005 Carrier File (on_carrier_2005=1). For accurate estimates of the standard error, it is preferable to select the subgroup with subpopn in SUDAAN rather than to subset the data in SAS when preparing the data file. (See Section 5.4 of Korn and Graubard, Analysis of Health Surveys, pp. 207-211.) |
class obese/nofreq; | Use a class statement for categorical variables in version 9.0. In earlier versions, you need subgroup and levels statements. Use the nofreq option to suppress frequencies. |
var ridageyr; | Use the var statement to choose the continuous variable, age (ridageyr). |
print nsum mean semean/style=nchs; | Use the print statement to obtain the sample size (nsum), mean (mean), and standard error of the mean (semean). |
rformat obese obese_.; | Use the rformat statement to read the SAS formats into SUDAAN. |
rtitle "Significance test for difference between mean age for those who were obese vs. not obese and on the 2005 Carrier File: NHANES 1999-2004 linked to Medicare 1999-2007"; run; | Use the rtitle statement to title the output. |
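
As a rough sketch, the statements in the table above could be assembled into a single SAS-callable SUDAAN program like the one below. It assumes that the dataset DS1 and the obese_. format from the Tutorial Formats page already exist in your SAS session. The by statement in proc sort simply follows the sort order the table describes (strata, then PSU).

```sas
/* Sort by the design variables before calling SUDAAN. */
proc sort data=DS1;
  by sdmvstra sdmvpsu;
run;

/* Step 1: mean age by obesity status within the subpopulation. */
proc descript data=DS1 design=wr;
  nest sdmvstra sdmvpsu;               /* strata and PSU */
  weight wt_linkage_adj;               /* linkage-adjusted weight */
  subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1;
  class obese/nofreq;                  /* categorical variable */
  var ridageyr;                        /* analysis variable: age */
  print nsum mean semean/style=nchs;   /* N, mean, SE of the mean */
  rformat obese obese_.;
  rtitle "Significance test for difference between mean age for those who were obese vs. not obese and on the 2005 Carrier File: NHANES 1999-2004 linked to Medicare 1999-2007";
run;
```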

- The results indicate the mean age was 74.5 for those who were not obese and on the 2005 Carrier File and 72.0 for those who were obese and on the file.

A t-test is used to test whether the difference in mean age obtained in the previous step, between those who were obese and those who were not obese among people on the 2005 Carrier File, is statistically significant.

Request the t-test from the SUDAAN procedure proc descript and follow the steps in the summary table below.

Note that this program and the previous program that produced the means in Step 1 are identical up to the *var* statement.

Statements | Explanation |
---|---|
proc sort data=DS1; by sdmvstra sdmvpsu; run; | Use the SAS procedure proc sort to sort the data by strata (sdmvstra) and PSU (sdmvpsu). |
proc descript data=DS1 design=wr; | Use the SUDAAN procedure proc descript to generate means, and specify the sample design with the design option WR (with replacement). |
nest sdmvstra sdmvpsu; | Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the design effects. |
weight wt_linkage_adj; | Use the weight statement to account for the unequal probability of sampling and nonresponse. In this example, the weight adjusted for linkage non-response is used for six years of data. |
subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1; | Use the subpopn statement to restrict the analysis to the subgroup of interest. In this example, it selects people aged 65 or older (ridageyr>=65) who linked to the Medicare files at some point during 1999-2007 (cms_medicare_match=1) and were on the 2005 Carrier File (on_carrier_2005=1). For accurate estimates of the standard error, it is preferable to select the subgroup with subpopn in SUDAAN rather than to subset the data in SAS when preparing the data file. (See Section 5.4 of Korn and Graubard, Analysis of Health Surveys, pp. 207-211.) |
class obese/nofreq; | Use a class statement for categorical variables in version 9.0. In earlier versions, you need subgroup and levels statements. Use the nofreq option to suppress frequencies. |
var ridageyr; | Use the var statement to choose the continuous variable, age (ridageyr). |
contrast obese = (1 -1)/name = "obese vs. not"; | Use the contrast statement to test the hypothesis that the difference between the two means equals 0, that is, that the mean age of those who were obese equals the mean age of those who were not obese. |
print nsum t_mean p_mean/style=nchs; | Use the print statement to obtain the sample size (nsum), t-statistic (t_mean), and p-value (p_mean) for the t-test. |
rformat obese obese_.; | Use the rformat statement to read the SAS formats into SUDAAN. |
rtitle "Significance test for difference between mean age for those who were obese vs. not obese and on the 2005 Carrier File"; rtitle2 "NHANES 1999-2004 linked to Medicare 1999-2007"; run; | Use the rtitle and rtitle2 statements to title the output. |
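
Put together, the t-test statements in this table might look like the sketch below. It rests on these assumptions: DS1 is already in the SAS session, the obese_. format is defined, and the data are sorted by sdmvstra and sdmvpsu as the table describes. Compared with a program that prints the means, only the contrast, print, and title statements change.

```sas
/* Sort by the design variables before calling SUDAAN. */
proc sort data=DS1;
  by sdmvstra sdmvpsu;
run;

/* Step 2: t-test for the difference in mean age, obese vs. not obese. */
proc descript data=DS1 design=wr;
  nest sdmvstra sdmvpsu;
  weight wt_linkage_adj;
  subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1;
  class obese/nofreq;
  var ridageyr;
  /* Coefficients (1 -1) form the difference between the two obese levels. */
  contrast obese = (1 -1)/name = "obese vs. not";
  print nsum t_mean p_mean/style=nchs;  /* N, t-statistic, p-value */
  rformat obese obese_.;
  rtitle "Significance test for difference between mean age for those who were obese vs. not obese and on the 2005 Carrier File";
  rtitle2 "NHANES 1999-2004 linked to Medicare 1999-2007";
run;
```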

- 2,095 observations were used in the analysis, and the degrees of freedom were 44.
- To test the hypothesis that the difference between the two means is zero, the t-statistic with 44 degrees of freedom is computed as 8.82. The p-value is <0.01, which indicates that, if the null hypothesis were true, the probability of obtaining a t-statistic whose absolute value is greater than or equal to 8.82 would be <0.01.
- Therefore, we reject the null hypothesis at the 0.05 level and conclude that mean age differs between those who were obese and those who were not.
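
The reported p-value can be sanity-checked in a short SAS data step using the t-statistic and degrees of freedom given above. This is only an illustrative check and not part of the tutorial's program. The variable names are made up for the example; probt and tinv are the standard SAS functions for the Student's t distribution.

```sas
data _null_;
  t  = 8.82;  /* t-statistic reported in the SUDAAN output */
  df = 44;    /* degrees of freedom reported in the SUDAAN output */

  /* Two-sided p-value: probability that |T| >= |t| under the null hypothesis. */
  p_two_sided = 2 * (1 - probt(abs(t), df));

  /* Two-sided critical value at the 0.05 level for comparison. */
  t_crit = tinv(0.975, df);

  put p_two_sided= t_crit=;  /* writes both values to the SAS log */
run;
```

Because 8.82 lies far beyond the two-sided 0.05 critical value for 44 degrees of freedom (about 2.02), the computed p-value falls well below 0.01, which is consistent with the output described above.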

Korn, E.L. and B.I. Graubard. 1999. Analysis of Health Surveys. New York: Wiley.
