Task 4b: How to Generate Proportions using SAS Survey Procedures

In this example, you will be looking at the proportion of examined persons 20 years and older with measured high blood pressure, by sex, age, and race-ethnicity.

Step 1: Determine variables of interest

According to the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure, a person with hypertension is defined as either having elevated blood pressure (systolic pressure of at least 140 mmHg or diastolic of at least 90 mmHg) or taking antihypertensive medication. You will need to define a categorical variable (hbpx) indicating persons with high blood pressure (100= high blood pressure; 0= no high blood pressure).
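
The coding rule above can be sketched as a short data step. This is only an illustration on made-up rows: the names mean_sys, mean_dia, and on_bp_meds (1 = taking antihypertensive medication) are hypothetical stand-ins for whatever blood pressure and medication variables your analysis dataset contains, and the handling of missing readings is an assumption rather than the tutorial's rule.

```sas
/* Sketch only: mean_sys, mean_dia, and on_bp_meds are hypothetical names. */
data hbp_demo;
  input mean_sys mean_dia on_bp_meds;
  /* A missing value compares as lower than any number in SAS, */
  /* so missing readings never satisfy the >= tests below.     */
  if mean_sys >= 140 or mean_dia >= 90 or on_bp_meds = 1 then hbpx = 100;
  else if not missing(mean_sys) and not missing(mean_dia) then hbpx = 0;
  else hbpx = .;  /* assumption: leave hbpx missing when readings are missing */
  datalines;
150 85 0
118 76 0
122 80 1
. . 0
;
run;

proc print data=hbp_demo noobs;
run;
```

In this toy input, the first and third rows are coded 100, the second row is coded 0, and the last row is left missing.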

Step 2: Create Variable to Subset Population

In order to subset the data in SAS Survey Procedures, you will need to create a variable for the population of interest. In this example, the sel variable is set to 1 if the sample person is 20 years or older, and 2 if the sample person is younger than 20 years. Then this variable is used in the domain statement to specify the population of interest (those 20 years and older).

if ridageyr GE 20 then sel = 1;

else sel = 2;
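
For context, these two statements belong inside a data step. The sketch below runs them on a few made-up ages and attaches a format to sel. The format name selfmt and the label 'Age lt 20' are assumptions for illustration; the label 'Age ge 20' matches the value tested later in this task, but the actual definitions are on the Tutorial Formats page.

```sas
/* Sketch only: selfmt and the 'Age lt 20' label are assumed, not from the tutorial. */
proc format;
  value selfmt 1 = 'Age ge 20'
               2 = 'Age lt 20';
run;

data sel_demo;
  input ridageyr;
  if ridageyr GE 20 then sel = 1;
  else sel = 2;          /* a missing age would also fall here */
  format sel selfmt.;
  datalines;
19
20
45
;
run;

proc print data=sel_demo noobs;
run;
```

Keeping everyone in the dataset and selecting the group through the domain statement, rather than deleting the younger records, lets the procedure use the full sample design when it estimates variances.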

Step 3: Use proc surveymeans to generate proportions and their standard errors in SAS Survey Procedures

In SAS Survey Procedures, persons with high blood pressure, as defined above, are assigned a value of 100, and persons without high blood pressure are assigned a value of 0. The weighted mean of this 0/100 variable is the weighted percent of persons with a value of 100, so it is an estimate of the prevalence of high blood pressure in the U.S.
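
To see why a weighted mean of a 0/100 variable is a percent, consider a toy example with four made-up records. The dataset pct_demo and the weight variable wt are hypothetical, and proc means is used here only to show the arithmetic; it does not account for strata or clusters, so it is not a substitute for proc surveymeans when standard errors are needed.

```sas
/* Sketch only: four made-up records with hypothetical weights. */
data pct_demo;
  input hbpx wt;
  datalines;
100 1000
0 3000
100 2000
0 4000
;
run;

/* Weighted mean = (100*1000 + 0*3000 + 100*2000 + 0*4000) / 10000 = 30 */
proc means data=pct_demo mean;
  var hbpx;
  weight wt;
run;
```

The records coded 100 carry 3,000 of the 10,000 total weight, so the weighted mean is 30, which reads directly as 30 percent.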

IMPORTANT NOTE

These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.

Generate Proportions in SAS Survey Procedures
Statements Explanation
ods trace on ;

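Use the ods trace statement to write the name of each output table to the SAS log. These names (here, domain) are what the ods output statement below refers to.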

proc surveymeans data=analysis_Data nobs mean stderr;

Use the surveymeans procedure to obtain the number of observations (nobs), the mean, and the standard error of the mean (stderr).

stratum sdmvstra;

Use the stratum statement to define the strata variable (sdmvstra).

cluster sdmvpsu;

Use the cluster statement to define the PSU variable (sdmvpsu).

class riagendr age race;

Use the class statement to specify the discrete variables used to form the subpopulations of interest. In this example, the subpopulations of interest are defined by gender (riagendr), age (age), and race/ethnicity (race).

domain sel sel*riagendr*age*race;

Use the domain statement to specify the table layout to form the subpopulations of interest. This example requests estimates for age greater than or equal to 20 (sel) overall, and for sel by gender (riagendr) by age (age) by race/ethnicity (race).

var hbpx;

Use the var statement to name the variable(s) to be analyzed. In this example, the high blood pressure variable (hbpx) is used. If the sample person has high blood pressure, then the value equals 100.  If the sample person does not have high blood pressure, then the value equals 0.

IMPORTANT NOTE

The SAS Survey procedure, proc surveymeans, reports a mean, so it produces a percent only when the variable is coded as 100 and 0.

weight wtmec4yr;

Use the weight statement to account for the unequal probability of sampling and non-response. In this example, the MEC weight for 4 years of data (wtmec4yr) is used.

ods output domain(match_all)=domain;
run;

Use the ods output statement to save the estimates for the domains listed on the domain statement above. This statement writes one dataset for each request in the domain statement, two in total (domain for sel; domain1 for sel*riagendr*age*race).

data all;

set domain domain1;
if sel='Age ge 20';

run;

Use the data step to create a temporary SAS dataset (all) that appends the two datasets created in the previous step, keeping only the rows where age is greater than or equal to 20 (sel).

proc print noobs data =all split = '/' ;

var   riagendr age race N mean stderr ;

format n 5.0 mean 4.1 stderr 4.2 ;

label N = 'Sample/size'

mean='Percent'

stderr='Standard/error/of the/percent';

title1 'Percent of adults 20 years and older with high blood pressure, 1999-2002' ;

run ;

Use the print procedure to print the number of observations, the mean, and the standard error of the mean in a printer-friendly format.

Step 4: Review output

The percents in the output are the proportions of sample persons with high blood pressure:

• Reviewing the output, you will see that the tables for both genders, males only, and females only are sorted by age group and then by race/ethnicity.
• The "Other" race/ethnicity category is only included to complete the totals. It is not reported.
• In the table for females, notice that the proportion of black females with high blood pressure is twice that of the other races in the 20-39 years age group, and nearly twice that of the other races in the 40-59 years age group.
• Given the low proportion of high blood pressure in the 20-39 years age group, you will also want to consider using an arcsine or Clopper-Pearson transformation for standard error estimation.