Task 1b: How to Estimate the Prevalence of Supplement Use as Proportions with SAS Survey Procedures

In this example, to determine the prevalence rate of calcium supplement use among older adults in the U.S., you will identify women and men age 50 years and older who report calcium supplement use on the household interview.


Step 1: Determine variables of interest

This example uses the demoadv dataset (download at Sample Code and Datasets).  This dataset contains a created variable called anycalsup that has a value of 1 for those who report calcium supplement use, and a value of 2 for those who do not. A participant was considered not to have any calcium supplement use if the daily average amount of calcium supplement use was zero; otherwise, a participant was considered a supplement user (see Supplement Code under Sample Code and Module 9, Task 4 for more information). You will need to define and create a categorical variable calcium indicating whether persons report supplement use (100 = calcium supplement use; 0 = no calcium supplement use).
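A minimal sketch of this recoding in a SAS data step follows. It assumes that demoadv is available in the WORK library and overwrites it in place; adjust the libref to wherever you saved the downloaded dataset.

```sas
/* Sketch: recode anycalsup (1 = user, 2 = non-user) into calcium (100/0). */
/* Assumes demoadv is in the WORK library; adjust the libref as needed.    */
data demoadv;
  set demoadv;
  if anycalsup = 1 then calcium = 100;      /* reports calcium supplement use */
  else if anycalsup = 2 then calcium = 0;   /* no calcium supplement use      */
  /* any other value of anycalsup leaves calcium missing */
run;
```

Coding the variable as 100 and 0 rather than 1 and 0 means the weighted mean can be read directly as a percent.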


Step 2: Create Variable to Subset Population

In order to subset the data in SAS Survey Procedures, you will need to create a variable for the population of interest. In this example, the sel variable is set to 1 if the sample person is age 50 years or older, and 2 if the sample person is younger than age 50 years. Then this variable is used in the domain statement to specify the population of interest (those ages 50 years and older).
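The selection variable could be created along these lines. This is a sketch only: ridageyr is an assumed name for the age-in-years variable in demoadv, so substitute the age variable your dataset actually contains.

```sas
/* Sketch: flag the population of interest (age 50 years and older).     */
/* ridageyr is an ASSUMED name for the age-in-years variable in demoadv. */
data demoadv;
  set demoadv;
  if ridageyr >= 50 then sel = 1;               /* age 50 years or older */
  else if not missing(ridageyr) then sel = 2;   /* younger than 50 years */
run;
```

All records stay in the dataset. Selecting the population through the domain statement, rather than deleting the younger records, keeps the full sample design available for variance estimation.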


Step 3:  Use proc surveymeans to generate proportions and their standard errors in SAS

In the SAS surveymeans procedure, persons who report calcium supplement use, as defined above, are assigned a value of 100, and persons who do not report supplement use are assigned a value of 0. Because the variable takes only the values 100 and 0, its weighted mean is the weighted percent of persons who report calcium supplement use, which is an estimate of the prevalence of calcium supplement use in the U.S.
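Written out, the weighted mean of a 100/0 variable reduces to a weighted percent. The notation here is introduced only for illustration: w_i is the interview weight (wtint2yr) and y_i is the value of calcium (100 or 0) for sample person i, and the sums run over the persons in domain D who have a nonmissing value of calcium.

```latex
\hat{p}_D
  = \frac{\sum_{i \in D} w_i \, y_i}{\sum_{i \in D} w_i}
  = 100 \times \frac{\sum_{i \in D,\; y_i = 100} w_i}{\sum_{i \in D} w_i}
```

The second form shows that the estimate is the share of the total weight in the domain that belongs to persons who report calcium supplement use, expressed as a percent.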

Generate Proportions in SAS Survey Procedures

Statements (each followed by its explanation)

proc surveymeans data =demoadv nobs mean stderr;

Use the surveymeans procedure to obtain the number of observations, mean, and standard error.

stratum sdmvstra;

Use the stratum statement to define the strata variable (sdmvstra).

cluster sdmvpsu;     

Use the cluster statement to define the PSU variable (sdmvpsu).

class riagendr;

Use the class statement to specify the discrete variables used to form the subpopulations of interest. In this example, the subpopulations of interest are specified by gender (riagendr).

domain sel sel*riagendr;

Use the domain statement to specify the subpopulations for which estimates are produced. This example requests estimates by age group (sel) and by age group crossed with gender (sel*riagendr).

var calcium;

Use the var statement to name the variable(s) to be analyzed. In this example, the calcium supplement use variable (calcium) is used. If the sample person reports calcium supplement use, then the value equals 100.  Otherwise, the variable equals 0.


The SAS Survey procedure, proc surveymeans, uses the variable coded as 100 and 0 to obtain weighted means expressed as percentages.

weight wtint2yr;

Use the weight statement to account for the unequal probability of sampling and non-response. In this example, the interview weight for 2 years of data (wtint2yr) is used.

ods output domain(match_all)=domain;

run ;  

Use the ods output statement to save the estimates for the domains listed in the domain statement above. With the match_all option, one dataset is written for each domain request, two in total: domain for sel and domain1 for sel*riagendr.

data all;

  set domain domain1;

  if sel= 1 ;

run ;

Use the data statement to name the temporary SAS dataset (all), the set statement to append the two datasets created in the previous step, and the subsetting if statement to keep only participants age 50 years and older (sel = 1).


proc print noobs data =all split = '/' ;

  var riagendr N mean stderr;

  format n 5.0 mean 7.4 stderr 6.4 ;

  label N= 'Sample/size'   mean= 'Percent'

   stderr= 'Standard/error/of the/percent' ;

  title1 'Percent of adults 50 years and older who report calcium supplement use' ;

run ;

Use the print procedure to print the number of observations, the mean, and the standard error of the mean in a printer-friendly format.


Step 4: Review output

The percentages in the output are the estimated proportions of persons ages 50 years and older in the target population who consume calcium supplements.

