In this task, you will use Stata commands to calculate a t-statistic and assess whether mean systolic blood pressure (SBP) differs significantly between males and females age 20 years and older.
Follow the steps below to estimate mean SBP with the Stata command svy: mean and to run the t-test of whether the difference in mean SBP between males and females is statistically significant.
There are several things you should be aware of while analyzing NHANES data with Stata. Please see the Stata Tips page to review them before continuing.
Remember that you need to define the SVYSET before using the SVY series of commands. The general format of this command is below:
svyset [w=weightvar], psu(psuvar) strata(stratavar) vce(linearized)
To define the survey design variables for your SBP analysis, use the weight variable for 4 years of MEC data (wtmec4yr), the PSU variable (sdmvpsu), and the strata variable (sdmvstra). The vce option specifies the method for calculating the variance; the default is "linearized", which is Taylor linearization. Here is the svyset command for four years of MEC data:
svyset [w= wtmec4yr], psu(sdmvpsu) strata(sdmvstra) vce(linearized)
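Stata's current documentation writes the PSU variable directly after svyset rather than in a psu() option. A minimal sketch of that equivalent form, assuming Stata 9 or later, followed by svydescribe to check the strata and PSU counts before estimating:

```stata
* Assumption: Stata 9 or later; same NHANES design variables as above.
svyset sdmvpsu [pweight = wtmec4yr], strata(sdmvstra) vce(linearized)

* List each stratum with its number of PSUs and observations.
svydescribe
```

Typing svyset with no arguments afterwards redisplays the current design settings.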
Now that svyset has been defined, you can use the Stata command svy: mean to generate means and standard errors. The general command for obtaining weighted means and standard errors of a subpopulation is below.
svy: mean varname, subpop(if condition)
Use the svy: mean command with the systolic blood pressure variable (bpxsar) to estimate the mean systolic blood pressure for people age 20 years and older. Use the subpop( ) option to select a subpopulation for analysis, rather than restricting the data file to the study population while preparing it, so that the full survey design is still used when the variance is estimated. This example uses an if statement to define the subpopulation based on the value of the age variable (ridageyr). Another option is to create a dichotomous variable where the subpopulation of interest is assigned a value of 1, and everyone else is assigned a value of 0.
svy: mean bpxsar, subpop(if ridageyr>=20 & ridageyr<.)
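The dichotomous-variable alternative mentioned above might look like the following sketch. The variable name adult20 is hypothetical (it is not an NHANES variable), and the subpop() option is written on the svy prefix, which is the form shown in Stata's documentation:

```stata
* Assumption: adult20 is an illustrative name, not part of the NHANES file.
* The expression is 1 for respondents age 20 or older and 0 otherwise;
* respondents with missing age also get 0 because of the ridageyr < . check.
generate byte adult20 = (ridageyr >= 20 & ridageyr < .)

* Estimate mean SBP in the subpopulation flagged by adult20.
svy, subpop(adult20): mean bpxsar
```

This should give the same estimate as the if-based command, and the indicator can be reused in later commands.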
You can also add the over() option to the svy:mean command to generate the means for different subgroups. When you do this, you can type a second command, estat size, to have the output display the subgroup observation numbers. Here is the general format of these commands for this example:
svy: mean varname, subpop(if condition) over(var1 var2)
estat size
Use the svy: mean command with the systolic blood pressure variable (bpxsar) and the same subpop( ) option as above to estimate the mean systolic blood pressure for people age 20 years and older. Add the over option to get stratified results; this example produces estimates by gender (riagendr). Then use the estat size postestimation command to display the number of subpopulation observations and the weighted subpopulation sizes.
svy: mean bpxsar, subpop(if ridageyr>=20 & ridageyr<.) over(riagendr)
estat size, obs size
If you have already run an estimation command, you can use the lincom command to test the hypothesis that the difference between the subpopulation means equals 0. Put the variable you are estimating in square brackets. After the variable in square brackets, put the value of the stratifier that you want to test (i.e., the variable in the over option). If the variable has value labels, you can use the labels instead of the coded values. Here is the general format of these commands for this example:
lincom [varname]stratval1 - [varname]stratval2
Because you have already run an estimation command, you can use the lincom postestimation command to test the hypothesis that the difference between mean SBP (bpxsar) for males and females equals 0. This example uses labeled values (male, female) instead of the coded values (1, 2) for the gender variable (riagendr).
lincom [bpxsar]male - [bpxsar]female
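The names inside the square brackets must match the names Stata stored for the subgroup means. These depend on the value labels and can differ across Stata releases. A cautious sketch, assuming the svy: mean command with over(riagendr) has just been run, is to list the stored estimates first and copy the names from the column headings:

```stata
* Run right after the svy: mean ... over(riagendr) command above.
* e(b) holds the estimated subgroup means; its column names show
* how each mean must be referenced in lincom.
matrix list e(b)
```

If the headings differ from the male and female names used above, substitute the names shown there into the lincom command.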
The svy: reg command can also be used to calculate the t-statistic. Unlike lincom, svy: reg does not require a prior estimation command. The xi prefix goes before the command so that Stata expands categorical variables into indicator variables, and the i. prefix marks each categorical variable. Here is the general format of these commands for this example:
xi: svy, subpop(if condition): reg dependentvar i.varname
Use the svy: reg command with the xi prefix to calculate the t-statistic and assess whether mean SBP (bpxsar) differs significantly between males and females age 20 years and older. The i. prefix denotes the categorical variable, which in this example is riagendr. Use the char command to choose the reference group for the categorical variable; here category 2 (female) is omitted, so females are the reference group.
char riagendr[omit]2
xi: svy, subpop(if ridageyr>=20 & ridageyr<.): reg bpxsar i.riagendr
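In Stata 11 and later, factor-variable notation can replace the xi prefix and the char command. The following is a sketch under that version assumption, not part of the original example; ib2. marks riagendr as categorical and sets category 2 (female) as the base level:

```stata
* Assumption: Stata 11 or later (factor-variable notation).
* ib2.riagendr treats riagendr as categorical with level 2 (female) as the base,
* so the coefficient on 1.riagendr is the male minus female difference in mean SBP.
svy, subpop(if ridageyr >= 20 & ridageyr < .): regress bpxsar ib2.riagendr
```

The t-statistic and p-value reported for 1.riagendr test whether that difference is 0.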
Here is a table summarizing the results of the previous analyses:
Variable                 Subpopulation analyzed    Number of respondents with data   Mean (mm Hg)   p value
Systolic blood pressure  Adults age 20 and older   9,056                             123            n/a
                         Men age 20 and older      4,301                             124            0.0132
                         Women age 20 and older    4,755                             122
According to the stratified analysis, men's mean systolic blood pressure (124 mm Hg) is 2 mm Hg higher than women's (122 mm Hg). This difference is statistically significant (p = 0.0132): if there were no true difference, a difference this large or larger would arise by chance in a sample of this size only about 1.3% of the time. In total, 9,056 respondents had information on systolic blood pressure (SBP).