In this module, you will generate age-adjusted prevalence rates and standard errors for high blood pressure (HBP) in persons 20 years and older in the United States by sex and race/ethnicity. An optional second example demonstrates how to generate age-adjusted means and standard errors for Body Mass Index (BMI) in persons 20 years and older in the United States by sex and race/ethnicity.
To calculate age-adjusted prevalence rates, you need to know the age standardizing proportions that you want to use, and then apply them to the populations under comparison. This is called the direct method of age standardization. Typically, Census data are used as the standard population structure. For age standardization in NHANES, the National Center for Health Statistics (NCHS) recommends using the 2000 Census population. A spreadsheet with the year 2000 U.S. population structure by age is attached below. Calculate the standard age proportions by dividing the age-specific Census population (P) by the total Census population number (T). The standardizing proportions (P/T) should sum to 1 (see the table in Step 2 for the standard age proportions used in this module).
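In symbols, the direct method described above can be written as follows. The subscript i indexes the age groups; the symbols w_i (the standard age proportion) and \hat{p}_i (the prevalence estimated in age group i of the study population) are notation introduced here for illustration and do not appear in the NCHS materials.

```latex
w_i = \frac{P_i}{T}, \qquad \sum_i w_i = 1, \qquad \hat{p}_{\text{adj}} = \sum_i w_i \, \hat{p}_i
```

The age-adjusted prevalence is therefore a weighted average of the age-specific prevalences, with weights taken from the standard population rather than from the study population.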
Remember that you must declare the survey design with svyset before using the svy series of commands. The general format of this command is below:
svyset [w=weightvar], psu(psuvar) strata(stratavar) vce(variance method)
To define the survey design variables for your high blood pressure analysis, use the weight variable for 4 years of MEC data (wtmec4yr), the PSU variable (sdmvpsu), and the strata variable (sdmvstra). The vce option specifies the method for calculating the variance; the default is "linearized", which is Taylor linearization. Here is the svyset command for 4 years of MEC data:
svyset [w= wtmec4yr], psu(sdmvpsu) strata(sdmvstra) vce(linearized)
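As a quick check that the design was declared as intended, svyset typed with no arguments redisplays the current settings, and svydescribe tabulates the strata and PSUs. The sketch below is only an illustration: the eight input rows are made-up values rather than NHANES records, and it uses the svyset syntax documented in current Stata releases, in which the PSU variable is listed before the weight.

```stata
* Made-up miniature dataset (NOT NHANES data): 2 strata, 2 PSUs per stratum
clear
input sdmvstra sdmvpsu wtmec4yr
1 1 1200
1 1 1500
1 2 1100
1 2 1800
2 1 900
2 1 1300
2 2 1600
2 2 1400
end

* Declare the design: PSU first, then sampling weight, strata, and variance method
svyset sdmvpsu [pweight = wtmec4yr], strata(sdmvstra) vce(linearized)

* Redisplay the stored settings, then describe the strata and PSU counts
svyset
svydescribe
```

With the real NHANES file loaded, only the svyset, svyset, and svydescribe lines are needed.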
For age standardization in NHANES, NCHS recommends using the 2000 Census population. To get the correct Census age distribution, you need to know two things: the age group of interest (e.g., all ages, ages 6 and older, adults 20 and older) and how wide the age strata are for adjustment (e.g., 5-year, 10-year, or 20-year age intervals). In general, the more tightly you want to control for age, the narrower the age strata should be.
For your convenience, standard proportions for different NHANES population age groupings are provided in the Excel spreadsheet attached below. This file uses the 2000 Census as the standard population. The adjustment factors were calculated for four age groupings:
For other age groupings, you can combine the smaller age groups provided in order to reflect the age and subpopulation you are using in your analysis.
Standard Proportions for NHANES Population Groupings link: ageadjtwt.xls
Here is an example of how to calculate the standard age proportions by dividing the age-specific Census population (P) by the total Census population number (T). The standardizing proportions should sum to 1.
Age Group  Age-Specific Census Population in thousands (P)  Total Census Population in thousands (T)  Standard Age Proportion (P/T)
20-39  77,670  195,850  .396579
40-59  72,816  195,850  .371795
60+  45,364  195,850  .231626
Total:  195,850  Sum:  1
As you can see, each "standard age proportion", also referred to as an "age adjustment weight", is simply the proportion of people in the 2000 Census (the standard population) who fall in a specific age category. For example, the standard age proportion for people 20-39 years old is 77,670 / 195,850 = .396579.
Klein RJ, Schoenborn CA. Age adjustment using the 2000 projected U.S. population. Healthy People Statistical Notes, no. 20. Hyattsville, Maryland: National Center for Health Statistics. January 2001.
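The P/T calculation can be reproduced directly in Stata. The following is a minimal sketch that types in the three Census counts from the example table; the variable names agegrp, P, T, and std_prop are chosen here for illustration.

```stata
* 2000 Census standard population for adults 20+, in thousands (from the example table)
clear
input str5 agegrp P
"20-39" 77670
"40-59" 72816
"60+"   45364
end

* T = total standard population; P/T = standard age proportion
egen double T = total(P)
gen double std_prop = P / T
format std_prop %8.6f
list agegrp P T std_prop, noobs

* The proportions should sum to 1
quietly summarize std_prop
display "Sum of standard proportions = " r(sum)
```

The same pattern applies to other age groupings: enter the Census counts for the strata you need and let Stata compute the proportions.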
You will need to create variables for age, race/ethnicity, standard weight, and high blood pressure. First, create the age and race/ethnicity group variables used in your analyses. Next, assign the Census proportion for each age stratum using the std_wgt variable. This variable is usually referred to as the standard weight in statistical manuals. Finally, code the outcome as a dichotomous variable, where the absence of the outcome is coded as 0 and the presence of the outcome is coded as 100. Using 100 expresses the proportion as a percentage (e.g., 0.23 is reported as 23). The existing dichotomous variable, hbp, is coded as 2 for the absence of the outcome and 1 for its presence, so it must be recoded into a new variable, hbpx. Here is the code for creating the variables:
Variable  Code to generate variables
Age  gen age=1 if ridageyr >=20 & ridageyr <40
Race  gen race=1 if ridreth1 == 3
Standard Weight  gen std_wgt=.3966 if age==1
High Blood Pressure  gen hbpx=100 if hbp==1
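The table shows only the first category of each variable. A fuller sketch of the recodes might look like the following. The four input rows are made-up test values, not NHANES records. The race/ethnicity mapping assumes the usual NHANES ridreth1 codes (1 = Mexican American, 2 = other Hispanic, 3 = non-Hispanic white, 4 = non-Hispanic black, 5 = other race) and folds codes 2 and 5 into "other"; the numbering of race categories 2 through 4 is an assumption made here, so check it against the codebook for your survey cycle. The standard weights are the standard age proportions rounded to four decimals.

```stata
* Made-up test rows (NOT NHANES data) so the recodes can be run and checked
clear
input ridageyr ridreth1 hbp
25 3 2
45 4 1
70 1 1
52 5 2
end

* Age strata matching the standard population groups: 20-39, 40-59, 60+
gen age = 1 if ridageyr >= 20 & ridageyr < 40
replace age = 2 if ridageyr >= 40 & ridageyr < 60
replace age = 3 if ridageyr >= 60 & ridageyr < .

* Race/ethnicity (assumed mapping): 1 NH white, 2 NH black, 3 Mexican American, 4 other
gen race = 1 if ridreth1 == 3
replace race = 2 if ridreth1 == 4
replace race = 3 if ridreth1 == 1
replace race = 4 if ridreth1 == 2 | ridreth1 == 5

* Standard weight: the 2000 Census proportion for each age stratum
gen std_wgt = .3966 if age == 1
replace std_wgt = .3718 if age == 2
replace std_wgt = .2316 if age == 3

* Outcome as 0/100 so that the mean is a percentage
gen hbpx = 100 if hbp == 1
replace hbpx = 0 if hbp == 2

list, noobs
```

Respondents younger than 20 or with missing values are left missing on the new variables, which keeps them out of the standardized estimates.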
The Stata command svy: mean is used to generate age-adjusted proportions and standard errors. Using svy: mean is not a mistake: a proportion is the mean of a dichotomous variable, and because hbpx is coded 0/100, its mean is a percentage.
The general form of the command is just like the mean command from descriptive statistics but uses the stdize and stdweight options.
svy, subpop(condition): mean depvar, stdize(agevar) stdweight(ageweightvar)
Here is the Stata command for the age-adjusted prevalence of high blood pressure and standard errors for men and women age 20 years and older:
svy, subpop(if ridageyr >=20 & ridageyr <.): mean hbpx, stdize(age) stdweight(std_wgt) over(riagendr)
And here is the Stata command for the age-adjusted prevalence of high blood pressure and standard errors by race/ethnicity (non-Hispanic white, non-Hispanic black, Mexican American, and other) for persons age 20 years and older:
svy, subpop(if ridageyr >=20 & ridageyr <.): mean hbpx, stdize(age) stdweight(std_wgt) over(race)
To calculate the unadjusted (crude) prevalence, use the same commands as above but omit the stdize and stdweight options.
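To see the crude and age-adjusted commands side by side, the following sketch runs both on a small simulated dataset. Everything in the data-generation step (the seed, the sample size, and the rule that makes the outcome more common at older ages) is an assumption made only so the example can be run without the NHANES files; the estimates it prints say nothing about the real population.

```stata
* Simulated data for illustration only (NOT NHANES): 3 strata x 2 PSUs
clear
set seed 12345
set obs 600
gen sdmvstra = mod(_n, 3) + 1
gen sdmvpsu  = mod(floor(_n / 3), 2) + 1
gen wtmec4yr = 1000 + 500 * runiform()
gen ridageyr = floor(20 + 60 * runiform())
gen riagendr = 1 + (runiform() < .5)
* Outcome coded 0/100, made more likely at older ages
gen hbpx = 100 * (runiform() < ridageyr / 150)

* Age strata and matching standard weights
gen age = 1 if ridageyr >= 20 & ridageyr < 40
replace age = 2 if ridageyr >= 40 & ridageyr < 60
replace age = 3 if ridageyr >= 60 & ridageyr < .
gen std_wgt = .3966 if age == 1
replace std_wgt = .3718 if age == 2
replace std_wgt = .2316 if age == 3

svyset sdmvpsu [pweight = wtmec4yr], strata(sdmvstra) vce(linearized)

* Crude (unadjusted) prevalence by sex: no stdize or stdweight options
svy, subpop(if ridageyr >= 20 & ridageyr < .): mean hbpx, over(riagendr)

* Age-adjusted prevalence by sex: same command plus stdize and stdweight
svy, subpop(if ridageyr >= 20 & ridageyr < .): mean hbpx, stdize(age) stdweight(std_wgt) over(riagendr)
```

Replacing over(riagendr) with over(race) gives the corresponding comparison by race/ethnicity once the race variable has been created.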
To understand how much age standardization matters, it is helpful to compare the estimates from the crude and age-adjusted analyses. The following tables summarize the results:
Crude (unadjusted) estimates
Variable  Mean Age  % with hypertension  Standard error
Gender
Male  45  27%  1.22
Female  47  31%  1.07
Race
Non-Hispanic white  48  30%  1.15
Non-Hispanic black  43  37%  1.51
Mexican American  38  17%  1.21
Other  43  28%  2.30
Age-adjusted estimates
Variable  Mean Age  % with hypertension  Standard error
Gender
Male  45  28%  1.21
Female  47  30%  0.71
Race
Non-Hispanic white  48  28%  0.97
Non-Hispanic black  43  41%  1.05
Mexican American  38  26%  0.98
Other  43  31%  2.25
Highlights from the output include: