NHANES-CMS Linked Data Tutorial: Hypothesis Testing When Using the NHANES-CMS Linked Data: Task 1

In this task, you will use SUDAAN to calculate a t-statistic and assess whether mean age (ridageyr) among participants aged 65 and older who are on the 2005 Carrier File differs significantly between those who are obese (obese=1) and those who are not obese (obese=0).

Step 1: Set Up SUDAAN to Produce Means

Follow the steps in the summary table below to produce the mean age using the SUDAAN procedure proc descript.

Information

These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.
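As a minimal sketch, a value format for the obesity indicator could be defined as shown here. The format name obese_ matches the rformat statement used later in this task; the label text is an assumption made for illustration, so check the Tutorial Formats page for the actual definitions.

```sas
/* Hypothetical labels for illustration only;                  */
/* the Tutorial Formats page holds the actual definitions.     */
proc format;
  value obese_
    0 = 'Not obese'
    1 = 'Obese';
run;
```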

SUDAAN proc descript Procedure for Means
Statements	Explanation
proc sort data=DS1; by sdmvstra sdmvpsu; run;	Use the SAS procedure proc sort to sort the data by strata (sdmvstra) and PSU (sdmvpsu), the variables named in the nest statement.
proc descript data=DS1 design=wr;	Use the proc descript procedure to generate means and specify the sample design using the design option WR (with replacement).
nest sdmvstra sdmvpsu;	Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the design effects.
Weight wt_linkage_adj;	Use the weight statement to account for the unequal probability of sampling and nonresponse. In this example, the adjusted weight for “linkage non-response” is used for six years of data.
subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1;	Use the subpopn statement to restrict the analysis to the subgroup of interest. In this example, it selects participants aged 65 or older (ridageyr>=65) who linked to the Medicare files at some point during 1999-2007 (cms_medicare_match=1) and were on the 2005 Carrier File (on_carrier_2005=1). For accurate estimates of the standard error, select the subgroup with subpopn in SUDAAN rather than subsetting the data file in SAS beforehand. (See Section 5.4 of Korn and Graubard, Analysis of Health Surveys, pp. 207-211.)
class obese/NoFREQ;	Use a class statement for categorical variables in version 9.0. In earlier versions, you need a subgroup and levels statement. Use the nofreq option to suppress frequencies.
var ridageyr;	Use the var statement to choose the continuous variable, age (ridageyr).
print nsum mean semean/style=nchs;	Use the print statement to obtain the N (nsum), mean (mean) and standard error of the mean (semean) for the t-test.
rformat obese obese_.;	Use the rformat statement to read the SAS formats into SUDAAN.
rtitle "Significance test for difference between mean age for those who were obese vs. not obese and on the 2005 Carrier File: NHANES 1999-2004 linked to Medicare 1999-2007"; run;	Use the rtitle statement to title the output.
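Assembled in order, the statements in the table above form a single program. This is a sketch that assumes the analysis dataset is named DS1, that SAS-callable SUDAAN is available, and that the obese_ format has already been defined in the session.

```sas
/* Sort by the design variables named in the nest statement. */
proc sort data=DS1;
  by sdmvstra sdmvpsu;
run;

/* Weighted mean age by obesity status with design-based standard errors. */
proc descript data=DS1 design=wr;
  nest sdmvstra sdmvpsu;
  weight wt_linkage_adj;
  /* subpopn selects the subgroup inside SUDAAN instead of */
  /* subsetting the data file beforehand.                  */
  subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1;
  class obese / nofreq;
  var ridageyr;
  print nsum mean semean / style=nchs;
  rformat obese obese_.;
  rtitle "Significance test for difference between mean age for those who were obese vs. not obese and on the 2005 Carrier File: NHANES 1999-2004 linked to Medicare 1999-2007";
run;
```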

Step 2: Review SUDAAN Means Output

Step 3: Perform t-test to Test for Significance

A t-test is used to assess whether the difference in mean age, obtained in the previous step, between participants on the 2005 Carrier File who were obese and those who were not obese is statistically significant.
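In general terms, the test statistic has the form sketched below. The notation is introduced here for illustration and does not appear in the tutorial: the two y-bar terms are the weighted mean ages for the obese (obese=1) and not obese (obese=0) groups, the mu terms are the corresponding population means, and SE-hat is the design-based standard error of the difference.

```latex
H_0:\ \mu_{1} - \mu_{0} = 0,
\qquad
t = \frac{\bar{y}_{1} - \bar{y}_{0}}
         {\widehat{\mathrm{SE}}\!\left(\bar{y}_{1} - \bar{y}_{0}\right)}
```

Because the standard error of the difference is estimated under the complex sample design, it generally cannot be recovered from the two group standard errors alone, which is why the test is requested from SUDAAN directly.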

Request the t-test from the SUDAAN procedure proc descript and follow the steps in the summary table below.

Information

Note that this program and the previous program to produce means in Step 1 are identical up to the var statement.

SUDAAN Procedure for Significance Test
Statements	Explanation
proc sort data=DS1; by sdmvstra sdmvpsu; run;	Use the SAS procedure proc sort to sort the data by strata (sdmvstra) and PSU (sdmvpsu), the variables named in the nest statement.
proc descript data=DS1 design=wr;	Use the proc descript procedure to generate means and specify the sample design using the design option WR (with replacement).
nest sdmvstra sdmvpsu;	Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the design effects.
Weight wt_linkage_adj;	Use the weight statement to account for the unequal probability of sampling and nonresponse. In this example, the adjusted weight for “linkage non-response” is used for six years of data.
subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1;	Use the subpopn statement to restrict the analysis to the subgroup of interest. In this example, it selects participants aged 65 or older (ridageyr>=65) who linked to the Medicare files at some point during 1999-2007 (cms_medicare_match=1) and were on the 2005 Carrier File (on_carrier_2005=1). For accurate estimates of the standard error, select the subgroup with subpopn in SUDAAN rather than subsetting the data file in SAS beforehand. (See Section 5.4 of Korn and Graubard, Analysis of Health Surveys, pp. 207-211.)
class obese/NoFREQ;	Use a class statement for categorical variables in version 9.0. In earlier versions, you need a subgroup and levels statement. Use the nofreq option to suppress frequencies.
var ridageyr;	Use the var statement to choose the continuous variable, age (ridageyr).
contrast obese = (1 -1)/name = "obese vs. not";	Use the contrast statement to test the hypothesis that the difference equals 0, that is, that the mean age of participants who are obese equals the mean age of participants who are not obese.
print nsum t_mean p_mean/style=nchs;	Use the print statement to obtain the N (nsum), t-test, and p-value for the t-test.
rformat obese obese_.;	Use the rformat statement to read the SAS formats into SUDAAN.
rtitle "Significance test for difference between mean age for those who were obese vs. not obese and on the 2005 Carrier File"; rtitle2 "NHANES 1999-2004 linked to Medicare 1999-2007"; run;	Use the rtitle and rtitle2 statements to title the output.
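Put together, the significance-test program might look like the sketch below. It repeats the Step 1 program through the var statement and then adds the contrast; as before, it assumes the dataset is named DS1, that SAS-callable SUDAAN is available, and that the obese_ format has already been defined.

```sas
/* Sort by the design variables named in the nest statement. */
proc sort data=DS1;
  by sdmvstra sdmvpsu;
run;

/* Same setup as Step 1 through the var statement. */
proc descript data=DS1 design=wr;
  nest sdmvstra sdmvpsu;
  weight wt_linkage_adj;
  subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1;
  class obese / nofreq;
  var ridageyr;
  /* Contrast the two levels of obese; the null hypothesis is a difference of 0. */
  contrast obese = (1 -1) / name = "obese vs. not";
  /* Request the N, the t-statistic, and its p-value. */
  print nsum t_mean p_mean / style=nchs;
  rformat obese obese_.;
  rtitle "Significance test for difference between mean age for those who were obese vs. not obese and on the 2005 Carrier File";
  rtitle2 "NHANES 1999-2004 linked to Medicare 1999-2007";
run;
```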

Step 4: Review SUDAAN t-test Output

Resources

Korn, E.L. and B.I. Graubard. 1999. Analysis of Health Surveys. New York: Wiley.
