To calculate the means and standard errors, you will use SAS-callable SUDAAN because this software takes into account the complex survey design of NHANES data when determining variance estimates. Note that if standard errors are not needed, you can simply use a SAS procedure, such as proc means with the weight statement, to calculate means. The data from analysis_Data must be sorted by strata first and then by PSU (unless the data have already been sorted by PSU within strata). The SAS proc sort statement must precede the SUDAAN statements.
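As a minimal sketch of the proc means shortcut mentioned above, the step below requests weighted means only. It assumes that analysis_Data already contains the recoded age-group variable (age) along with the other variables used in this tutorial. The where and class statements are illustrative choices rather than part of the original program.

```sas
/* Weighted means only. proc means does not account for the complex
   survey design, so any standard errors it reports should not be used.
   Assumption: analysis_Data holds ridageyr, riagendr, age, lbxtc, wtmec4yr. */
proc means data=analysis_Data mean;
  where ridageyr >= 20;     /* adults 20 years and older */
  class riagendr age;       /* gender by age group */
  var lbxtc;                /* total cholesterol */
  weight wtmec4yr;          /* four-year MEC weight */
run;
```

Because no design information is supplied, this step is suitable only when point estimates are all that is needed.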
The design variables, sdmvstra and sdmvpsu, are provided in the demographic data files and are used to calculate variance estimates. Before you call SUDAAN into SAS, the data must be sorted by these variables.
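The sorting step described above can be sketched as follows, assuming the analysis dataset is named analysis_Data as in the rest of this tutorial.

```sas
/* Sort by stratum, then by PSU within stratum, before calling SUDAAN. */
proc sort data=analysis_Data;
  by sdmvstra sdmvpsu;
run;
```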
The SUDAAN procedure, proc descript, is used to generate means and standard errors. The print statement is used to output those estimates along with the sample size (nsum), i.e., the number of survey participants with known values for the variable of interest. The general program for obtaining weighted means and standard errors is below.
These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.
Statements  Explanation

proc sort (data, by)
Use the proc sort procedure to sort the dataset by strata (sdmvstra) and PSU (sdmvpsu). The data option refers to the dataset, analysis_Data.

proc descript
Use the proc descript procedure to generate means and specify the sample design using the design option WR (with replacement).

subpopn (>= 20)
Use the subpopn statement to select the sample persons 20 years and older (ridageyr >= 20) because only those individuals are of interest in this example. Please note that for accurate estimates, it is preferable to use subpopn in SUDAAN to select a subpopulation for analysis, rather than select the study population in the SAS program while preparing the data file.

nest
Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the design effects.

weight
Use the weight statement to account for the unequal probability of sampling and nonresponse. In this example, the MEC weight for four years of data (wtmec4yr) is used.

subgroup
Use the subgroup statement to list the categorical variables for which statistics are requested. This example uses gender (riagendr) and age (age). These variables also appear in the table statement.

levels (2 3)
Use the levels statement to define the number of categories in each of the subgroup variables. Each level must be an integer greater than 0. This example uses two genders and three age groups.

var
Use the var statement to name the variable(s) to be analyzed. In this example, the total cholesterol variable (lbxtc) is used.

table
Use the table statement to specify cross-tabulations for which estimates are requested. If a table statement is not present, a one-dimensional distribution is generated for each variable in the subgroup statement. In this example, the estimates are for gender (riagendr) by age (age).

print ("Sample Size" "Mean" "Standard Error"; F7.0 F9.2 F9.3)
Use the print statement to assign names, format the statistics desired, and view the output. If the print statement is used alone, all of the default statistics are printed with default labels and formats. In this example, the sample size (nsum), mean (mean), and standard error of the mean (semean) are requested. Note: For a complete list of statistics that can be requested on the print statement, see the SUDAAN Users Manual. Use the style option equal to NCHS to produce output that parallels a table style used at NCHS.

rtitle ("Means of total cholesterol and standard errors by sex and age: NHANES 1999-2002")
Use the rtitle statement to assign a heading for each page of output.

run
The run statement signifies the end of the program.
The output will list the sample sizes, means, and their standard errors.
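Assembled from the statements described in the table above, the program might look like the following sketch. It assumes the data have already been sorted by sdmvstra and sdmvpsu and that the recoded age-group variable (age) has three levels. The format keywords on the print statement (nsumfmt=, meanfmt=, semeanfmt=) follow common SUDAAN usage and are an assumption here, so confirm them against the SUDAAN manual for your release.

```sas
/* Sketch of the arithmetic-mean program; run after proc sort.
   Assumptions: analysis_Data is sorted by sdmvstra sdmvpsu, and the
   print format keywords match your SUDAAN release. */
proc descript data=analysis_Data design=wr;
  subpopn ridageyr >= 20;
  nest sdmvstra sdmvpsu;
  weight wtmec4yr;
  subgroup riagendr age;
  levels 2 3;
  var lbxtc;
  table riagendr*age;
  print nsum="Sample Size" mean="Mean" semean="Standard Error" /
        nsumfmt=F7.0 meanfmt=F9.2 semeanfmt=F9.3 style=nchs;
  rtitle "Means of total cholesterol and standard errors by sex and age: NHANES 1999-2002";
run;
```

Keeping the age restriction on the subpopn statement, rather than dropping records in a data step, leaves the full design structure available to SUDAAN when it computes variances.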
If you need to generate geometric means instead of arithmetic means, you indicate this using an option in the proc descript procedure, as shown below.
The example below is for illustrative purposes only. Geometric means are not recommended for use with normally distributed data, such as the total cholesterol values in the analysis_Data dataset.
Statements  Explanation

proc sort (data, by)
Use the proc sort procedure to sort the dataset by strata (sdmvstra) and PSU (sdmvpsu). The data option refers to the dataset, analysis_Data.

proc descript (geometric)
Use the proc descript procedure to generate means and specify geometric as an option to compute geometric means. Specify the sample design using the design option WR (with replacement).

subpopn (>= 20)
Use the subpopn statement to select sample persons 20 years and older (ridageyr >= 20) because only those individuals are of interest in this example. Please note that for accurate estimates, it is preferable to use subpopn in SUDAAN to select a subpopulation for analysis, rather than select the study population in the SAS program while preparing the data file.

nest
Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the design effects.

weight
Use the weight statement to account for the unequal probability of sampling and nonresponse. In this example, the MEC weight for four years of data (wtmec4yr) is used.

subgroup
Use the subgroup statement to list the categorical variables for which statistics are requested. This example uses gender (riagendr) and age (age). These variables also appear in the table statement.

levels (2 3)
Use the levels statement to define the number of categories in each of the subgroup variables. Each level must be an integer greater than 0. This example uses two genders and three age groups.

var
Use the var statement to name the variable(s) to be analyzed. In this example, the total cholesterol variable (lbxtc) is used.

table
Use the table statement to specify cross-tabulations for which estimates are requested. If a table statement is not present, a one-dimensional distribution is generated for each variable in the subgroup statement. In this example, the estimates are for gender (riagendr) by age (age).

print ("Sample Size" "Geometric Mean" "Standard Error"; F7.0 F9.2 F9.3); output nsum geomean segeomean;
Use the print statement to assign names, format the statistics desired, and view the output. If the print statement is used alone, all of the default statistics are printed with default labels and formats. In this example, the sample size (nsum), geometric mean (geomean), and standard error of the geometric mean (segeomean) are requested. Note: For a complete list of statistics that can be requested on the print statement, see the SUDAAN Users Manual. Use the style option equal to NCHS to produce output that parallels a table style used at NCHS.

rtitle ("Geometric means of total cholesterol and standard errors by sex and age: NHANES 1999-2002")
Use the rtitle statement to assign a title (heading) to each page of output.
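The geometric-mean variant described above might be assembled as in the sketch below. It differs from the arithmetic-mean program only in the geometric option on the proc descript statement and in the statistics named on the print statement. The format keywords (nsumfmt=, geomeanfmt=, segeomeanfmt=) are assumed from common SUDAAN usage and should be checked against the manual. The output statement that appears in the original program is not reproduced, because its options are not described here.

```sas
/* Sketch of the geometric-mean program; run after proc sort.
   Assumptions: analysis_Data is sorted by sdmvstra sdmvpsu, and the
   print format keywords match your SUDAAN release. */
proc descript data=analysis_Data design=wr geometric;
  subpopn ridageyr >= 20;
  nest sdmvstra sdmvpsu;
  weight wtmec4yr;
  subgroup riagendr age;
  levels 2 3;
  var lbxtc;
  table riagendr*age;
  print nsum="Sample Size" geomean="Geometric Mean" segeomean="Standard Error" /
        nsumfmt=F7.0 geomeanfmt=F9.2 segeomeanfmt=F9.3 style=nchs;
  rtitle "Geometric means of total cholesterol and standard errors by sex and age: NHANES 1999-2002";
run;
```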