In this task, you will generate age-adjusted prevalence rates and standard errors for high blood pressure (HBP) by sex and race in persons 20 years and older. An optional second example is available demonstrating how to generate age-adjusted means and standard errors for Body Mass Index (BMI) by sex and race/ethnicity for persons 20 years and older.
To calculate age-adjusted prevalence rates, you need to choose the age standardizing proportions and then apply them to the populations under comparison. This is called the direct method of age standardization. Typically, Census data are used as the standard population structure. For age standardization in NHANES, NCHS recommends using the 2000 Census population. A spreadsheet with the year 2000 U.S. population structure by age is attached below. The standard age proportions are calculated by dividing the age-specific Census population (P) by the total Census population (T). The standardizing proportions (P/T) should sum to 1 (see the table below for the standard age proportions used in this module).
For your convenience, standard proportions for different NHANES population age groupings are provided in the Excel spreadsheet attached below. This file uses the 2000 Census as the standard population and gives adjustment factors for four age groupings.
For other age groupings, you can combine the smaller age groups provided so that they match the age range and subpopulation used in your analysis.
Standard Proportions for NHANES Population Groupings link: ageadjtwt.xls
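As an illustration of combining groups, the sketch below uses the 2000 Census counts (in thousands) from the example table later in this section. Collapsing the 40-59 and 60+ groups into a single 40+ group, for an analysis of persons 20 and older, adds the two counts and keeps the same total. Restricting the analysis to persons 40 and older instead changes the denominator, so that the proportions again sum to 1. The choice of groups is only an example.

```latex
\begin{align*}
\text{20+ analysis, group 40+:}\quad
  & \frac{72{,}816 + 45{,}364}{195{,}850} = \frac{118{,}180}{195{,}850} \approx .603421 \\
\text{40+ analysis, group 40--59:}\quad
  & \frac{72{,}816}{118{,}180} \approx .616145 \\
\text{40+ analysis, group 60+:}\quad
  & \frac{45{,}364}{118{,}180} \approx .383855
\end{align*}
```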
Here is an example of how to calculate the standard age proportions by dividing the age-specific Census population (P) by the total Census population number (T). The standardizing proportions should sum to 1.
Age Group | Age-Specific Census Population, P (in thousands) | Total Census Population, T (in thousands) | Standard Age Proportion (P/T) |
---|---|---|---|
20-39 | 77,670 | 195,850 | .396579 |
40-59 | 72,816 | 195,850 | .371795 |
60+ | 45,364 | 195,850 | .231626 |
Total: | 195,850 | Sum: | 1 |
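The arithmetic in the table can be reproduced with a short SAS program. This is a minimal sketch: the dataset names `census2000` and `std_props` and the variable names `agegrp`, `pop`, and `std_prop` are made up for illustration, and the counts are typed in from the table rather than read from the spreadsheet.

```sas
/* 2000 Census counts (in thousands) for the three age groups in the table */
data census2000;
  length agegrp $5;
  input agegrp $ pop;
  datalines;
20-39 77670
40-59 72816
60+   45364
;
run;

/* Divide each age-specific count (P) by the total (T).
   PROC SQL remerges SUM(pop) onto every row, so P/T is computed per row. */
proc sql;
  create table std_props as
  select agegrp,
         pop,
         pop / sum(pop) as std_prop format=8.6
  from census2000;
quit;

/* Print the proportions with a column total, which should be 1 */
proc print data=std_props noobs;
  sum std_prop;
run;
```

Rounded to four decimals, these are the values .3966, .3718, and .2316 used in the ESTIMATE statements later in this task.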
As you can see, each "standard age proportion", also referred to as an "age adjustment weight", is simply the proportion of people in the 2000 Census (the standard population) in a specific age category. For example, the standard age proportion for people 20-39 years old is 77,670 / 195,850 = .396579.
Klein RJ, Schoenborn CA. Age adjustment using the 2000 projected U.S. population. Healthy People Statistical Notes, no. 20. Hyattsville, Maryland: National Center for Health Statistics. January 2001.
Link to %sregsub macro on SAS website: http://support.sas.com/ctx/samples/index.jsp?sid=483
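Recode the discrete variable hbp as a new variable, hbpx, coded 0 for absence and 100 for presence of the health condition of interest. With this coding, the means produced by the %sregsub macro can be read directly as percentages.
data analysis_data;
set analysis_data;
/* hbpx: 100 = has high blood pressure, 0 = does not */
if hbp=1 then hbpx=100;
else if hbp=2 then hbpx=0;
run;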
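The reason for the 0/100 coding can be seen on a toy example: the mean of a variable coded 0/100 is the percentage of records with the condition. The five hbp values below are made up for illustration, the dataset name `toy` is hypothetical, and the estimate is unweighted, unlike the survey-weighted estimates produced in this task.

```sas
/* Hypothetical toy data: 5 respondents, 2 of them with HBP (hbp=1) */
data toy;
  input hbp @@;
  if hbp = 1 then hbpx = 100;
  else if hbp = 2 then hbpx = 0;
  datalines;
1 2 2 1 2
;
run;

/* Mean of hbpx = (100 + 0 + 0 + 100 + 0) / 5 = 40, i.e. 40 percent */
proc means data=toy mean;
  var hbpx;
run;
```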
The SAS Survey macro, %sregsub, is used to generate age-adjusted percentages (prevalence rates) and standard errors. The program below obtains weighted age-adjusted prevalence rates and standard errors for high blood pressure by race among persons 20 years and older.
These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.
Statements | Explanation |
---|---|
%include 'C:\NHANES\sample00483_1_sregsub.sas.txt'; | Use the %include statement to include the macro text file downloaded in Step 1. In this example, the file is named sample00483_1_sregsub.sas.txt and is saved in the C:\NHANES\ directory. |
%SREGSUB( | Use the %sregsub statement to name and open the macro. |
DATA= analysis_data, | Use the data statement to identify the dataset (analysis_data). |
STRATA= sdmvstra, | Use the strata statement to specify the strata (sdmvstra) and account for design effects of stratification. |
CLUSTER= sdmvpsu, | Use the cluster statement to specify PSU (sdmvpsu) to account for design effects of clustering. |
CLASS= race, age, | Use the class statement to specify the discrete variables used to select the subpopulations of interest (i.e., race [race] and age [age]). |
WEIGHT= wtmec4yr, | Use the weight statement to account for the unequal probability of sampling and non-response. In this example, the MEC weight for 4 years of data (wtmec4yr) is used. |
MODEL= hbpx=race age race*age /noint solution, | Use the model statement with the noint option to produce HBP means for the 12 possible race and age combinations (race has four groups and age has three groups, so there are 4 × 3 = 12 groups in total). The solution option produces a printed version of the age-adjusted prevalences. |
ESTIMATE = 'NH White' race 1 0 0 0 age .3966 .3718 .2316 race*age .3966 .3718 .2316 0 0 0 0 0 0 0 0 0, | Use the estimate statement to produce the age-adjusted prevalence of HBP for non-Hispanic whites. Please refer to the estimate statement in the SAS manual for more information about using vectors. The vector 1 0 0 0 (vectors are location indicators) points to non-Hispanic whites; the values .3966, .3718, and .2316 are the proportions of adults aged 20-39, 40-59, and 60+ years in the U.S. standard population (Klein and Schoenborn, 2001). |
ESTIMATE2 ='NH Black' race 0 1 0 0 age .3966 .3718 .2316 race*age 0 0 0 .3966 .3718 .2316 0 0 0 0 0 0, | Use the estimate statement to produce the age-adjusted prevalence of HBP for non-Hispanic blacks. The vector 0 1 0 0 points to non-Hispanic blacks; the values .3966, .3718, and .2316 are the proportions of adults aged 20-39, 40-59, and 60+ years in the U.S. standard population (Klein and Schoenborn, 2001). |
ESTIMATE3 = 'Mex Amer' race 0 0 1 0 age .3966 .3718 .2316 race*age 0 0 0 0 0 0 .3966 .3718 .2316 0 0 0, | Use the estimate statement to produce the age-adjusted prevalence of HBP for Mexican-Americans. The vector 0 0 1 0 points to Mexican-Americans; the values .3966, .3718, and .2316 are the proportions of adults aged 20-39, 40-59, and 60+ years in the U.S. standard population (Klein and Schoenborn, 2001). |
SUBPOP= ridageyr >= 20, | Use the subpop statement to select those 20 years and older. |
TITLE= 'Age-standardized prevalence of persons 20 years and older with high blood pressure: NHANES 1999-2002', | Use the title statement to label the output. |
OUTPUT= estimates=ageadj_prev1 ); | Use the output statement to output a SAS dataset that contains the estimates. |
proc print data=ageadj_prev1; var estimatelabel estimate stderr; title 'Age-standardized prevalence of persons 20 years and older with high blood pressure: NHANES 1999-2002'; run; | Use the proc print procedure to print the estimate and standard error. |
Note: Program code to produce age-adjusted estimates by race-ethnicity is provided above. To see program code to produce age-adjusted estimates by race-ethnicity and gender and for gender only, please go to the Sample Code and Datasets page to download the programs. |
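The ESTIMATE vectors can be read as the direct-standardization formula written in terms of the model parameters. As a sketch, write the fitted mean for race group r and age group a as the sum of a race term, an age term, and a race*age term, and let w_a be the standard age proportions. The symbols below are notation introduced here for illustration, not taken from the SAS documentation. Because the weights sum to 1, a coefficient of 1 on the race term together with w_a on the age and race*age terms gives the weighted average of the three age-specific prevalences for that race group:

```latex
\begin{align*}
\hat{p}^{\,\text{adj}}_{r}
  &= \beta_r + \sum_{a=1}^{3} w_a \gamma_a + \sum_{a=1}^{3} w_a \delta_{ra} \\
  &= \sum_{a=1}^{3} w_a \left( \beta_r + \gamma_a + \delta_{ra} \right)
   = \sum_{a=1}^{3} w_a \, \hat{p}_{ra},
\qquad (w_1, w_2, w_3) = (.3966,\ .3718,\ .2316), \quad \sum_{a} w_a = 1 .
\end{align*}
```

The position of the three weights within the 12-element race*age vector selects the race group: positions 1-3 for the first race level, 4-6 for the second, and 7-9 for the third, which matches the three ESTIMATE statements in the program.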
The code for estimating the crude (unadjusted) prevalence for HBP by race/ethnicity and gender follows:
These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.
Statements | Explanation |
---|---|
proc surveymeans data=analysis_data mean nobs stderr; | Use the proc surveymeans procedure to obtain the number of observations, the mean, and the standard error. |
strata sdmvstra; | Use the strata statement to define the strata variable (sdmvstra). |
cluster sdmvpsu; | Use the cluster statement to define the PSU variable (sdmvpsu). |
class riagendr race; | Use the class statement to specify the discrete variables used to select the subpopulations of interest (i.e., gender [riagendr] and race [race]). |
weight wtmec4yr; | Use the weight statement to account for the unequal probability of sampling and non-response. In this example, the MEC weight for 4 years of data (wtmec4yr) is used. |
var hbpx; | Use the var statement to specify which variable(s) will be analyzed. In this example, the high blood pressure variable (hbpx) is used. |
domain sel sel*riagendr sel*race sel*riagendr*race; | Use the domain statement to specify the subpopulations of interest. |
ods output domain(match_all)=unadj; run; | Use the ods statement to output SAS datasets of estimates for the subdomains listed on the domain statement. This statement creates four datasets, one for each domain specified in the domain statement above (unadj for sel, unadj1 for sel*riagendr, unadj2 for sel*race, and unadj3 for sel*riagendr*race). |
data stats; set unadj unadj1 unadj2 unadj3; if sel=1; | Use the data statement to name the temporary SAS dataset (stats), append the four datasets created in the previous step, and keep only the estimates for persons 20 years and older (sel=1). |
proc print; var race riagendr n mean stderr; run; | Use the proc print procedure to print the number of observations, the mean, and the standard error of the mean in a printer-friendly format. |
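Both programs rely on two derived variables that are not defined in the statements shown: the three-level age group (age) and the 20-and-older indicator (sel). A minimal sketch of how they might be created from ridageyr follows. The numeric codes 1-3 for age and the 0/1 coding of sel are assumptions, chosen to match the order of the standard proportions and the `if sel=1` filter; they are not definitions taken from the tutorial's downloadable programs.

```sas
data analysis_data;
  set analysis_data;

  /* Assumed coding: sel = 1 for persons 20 years and older, 0 otherwise.
     A missing ridageyr compares as less than 20, so sel = 0. */
  sel = (ridageyr >= 20);

  /* Assumed coding: age groups in the same order as the standard
     proportions .3966 (20-39), .3718 (40-59), .2316 (60+). */
  if 20 <= ridageyr <= 39 then age = 1;
  else if 40 <= ridageyr <= 59 then age = 2;
  else if ridageyr >= 60 then age = 3;
run;
```

Keeping everyone in the dataset and selecting the subpopulation through sel (or SUBPOP=), rather than deleting younger respondents, is what lets the survey procedures use the full sample design when estimating standard errors.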
Highlights from the output include: