Task 3b: How to Generate Means Using SAS Survey Procedures

In this example, you will use SAS Survey Procedures to generate tables of means and standard errors for average cholesterol levels of persons 20 years and older, by gender and age.


Step 1: Create Variable to Subset Population

In order to subset the data in SAS Survey Procedures, you will need to create a variable for the population of interest. In this example, the sel variable is set to 1 if the sample person is 20 years or older, and 2 if the sample person is younger than 20 years. Then this variable is used in the domain statement to specify the population of interest (those 20 years and older).

if ridageyr GE 20 then sel = 1;

else sel = 2;
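As a minimal sketch, the two statements above could sit inside a data step like the one below. The input dataset name DEMO_LAB is hypothetical and stands in for whatever merged dataset you are working from; ANALYSIS_DATA matches the name used in Step 2. The proc means step is only a quick check that the recode behaves as intended.

```sas
/* Sketch only: DEMO_LAB is a hypothetical name for your merged input dataset. */
data ANALYSIS_DATA;
  set DEMO_LAB;
  /* sel = 1 for sample persons aged 20 years and older, 2 otherwise. */
  /* Note: a missing ridageyr would also fall into sel = 2,           */
  /* because SAS treats a missing value as smaller than any number.   */
  if ridageyr GE 20 then sel = 1;
  else sel = 2;
run;

/* Check the recode: the age range within each value of sel */
proc means data=ANALYSIS_DATA n min max;
  class sel;
  var ridageyr;
run;
```

If the recode is correct, the minimum age shown for sel = 1 should be 20 and the maximum age shown for sel = 2 should be 19.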


Step 2: Use proc surveymeans to generate means in SAS Survey Procedures

The SAS procedure, proc surveymeans, is used to generate means and standard errors. The general program for obtaining weighted means and standard errors is below.



These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.
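The subsetting step later in this task compares sel to the text 'Age ge 20', which suggests that sel carries a format with that label. The sketch below shows one way such a format might be defined and attached. The format name selfmt and the label for sel = 2 are assumptions for illustration only; use the definitions from the Tutorial Formats page where they differ.

```sas
/* Sketch only: the format name SELFMT and the label for sel = 2 are assumptions. */
proc format;
  value selfmt
    1 = 'Age ge 20'
    2 = 'Age lt 20';
run;

/* Attach the format to sel in the analysis dataset */
data ANALYSIS_DATA;
  set ANALYSIS_DATA;
  format sel selfmt.;
run;
```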

Generate Means in SAS Survey Procedures
Statements Explanation
proc surveymeans data=ANALYSIS_DATA nobs mean stderr;

Use proc surveymeans to obtain the number of observations (nobs), mean, and standard error (stderr).

stratum sdmvstra;

Use the stratum statement to define the strata variable (sdmvstra).

cluster sdmvpsu;

Use the cluster statement to define the PSU variable (sdmvpsu).
class riagendr age;

Use the class statement to specify the discrete variables used to define the subpopulations of interest. In this example, the subpopulations of interest are defined by gender (riagendr) and age (age).

var lbxtc; 


Use the var statement to name the variable(s) to be analyzed. In this example, the total cholesterol variable (lbxtc) is used.

weight wtmec4yr;

Use the weight statement to account for the unequal probability of sampling and non-response. In this example, the MEC weight for four years of data (wtmec4yr) is used.

domain sel sel*riagendr*age;

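Use the domain statement to specify the subpopulations of interest. Here, sel requests estimates for each value of the selection variable, and sel*riagendr*age requests estimates for each combination of gender and age within those values.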

ods output domain(match_all)=domain;
run ;

Use the ods statement to output the datasets of estimates for the subdomains listed on the domain statement above. This set of commands outputs two datasets, one for each subdomain specified in the domain statement (domain for sel; domain1 for sel*riagendr*age).

data all;

set domain domain1;
if sel= 'Age ge 20' ;

run ;

Use the data statement to name the temporary SAS dataset (all), append the two datasets created in the previous step, and keep only the records for persons aged 20 years and older (sel).

proc print noobs data =all split = '/';

var riagendr age N mean stderr;

format n 5.0 mean 4.2 stderr 4.2 ;

label N = 'Sample/Size'

stderr = 'Standard/error/of the/mean' ;


title1 'Mean serum total cholesterol of adults 20 years and older, 1999-2002' ;

run ;

Use proc print to print the number of observations, the mean, and the standard error of the mean in a printer-friendly format. The split option tells SAS to break column labels at each slash (/).



Step 3: Review output

The output lists the sample sizes, means and their standard errors.
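When reviewing the means, it may help to recall what a weighted domain mean represents. The expression below is a sketch in notation introduced here for illustration (it does not appear in the tutorial): w_i is the sample weight (wtmec4yr), y_i is the total cholesterol value (lbxtc), and \delta_i equals 1 if sample person i belongs to the subdomain and 0 otherwise, with the sums taken over sample persons who have a cholesterol value.

```latex
\bar{y}_{D} \;=\; \frac{\sum_{i} w_i \,\delta_i \, y_i}{\sum_{i} w_i \,\delta_i}
```

The standard errors printed next to each mean come from the variance estimation that proc surveymeans performs using the stratum and cluster variables; by default this is Taylor series linearization.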


