A SAS Program for the 2000 CDC Growth Charts (ages 0 to <20 y)
The purpose of this SAS program is to calculate the percentiles and z-scores (standard deviations) for a child’s sex and age for BMI, weight, height, and head circumference based on the CDC growth charts. Weight-for-height percentiles and z-scores are also calculated. Observations that contain extreme values are flagged as being biologically implausible. These extreme values, however, are not necessarily incorrect.
Although the SAS program can be used to calculate z-scores and percentiles for children up to 20 years of age, the World Health Organization (WHO) growth charts are recommended for children <24 months of age. There are several computer programs available on the WHO and CDC sites that use the WHO growth charts; the SAS program for the WHO growth charts follows the same steps as this SAS program for the CDC growth charts.
The SAS program, cdcsourcecode.sas (files are below, in step #1), calculates these z-scores and percentiles for children in your data based on reference data in CDCref_d.sas7bdat. If you’re not using SAS, you can download CDCref_d.csv and create a program based on CDCsourcecode.sas to do the calculations.
Instructions for SAS users
Step 1: Download the SAS program (cdcsourcecode.sas) and the reference data file (CDCref_d.sas7bdat). Do not alter these files, but move them to a folder (directory) that SAS can access.
If you are using Chrome or Firefox, right-click to save the cdcsourcecode.sas file.
For the following example, the files have been saved in c:\sas\growth charts\cdc\data.
Step 2: Create a libname statement in your SAS program to point at the folder location of ‘CDCref_d.sas7bdat’. An example would be:
libname refdir 'c:\sas\growth charts\cdc\data';
Note the SAS code expects this name to be refdir; do not change this name.
Step 3: Set your existing dataset containing height, weight, sex, age and other variables into a temporary dataset, named mydata. Variables in your dataset should be renamed and coded as follows:
Table 1
Variable | Description
agemos | Child's age in months; must be present. The program assumes you know the age in months to the nearest day, based on the dates of birth and examination. For example, if a child was born on Oct 1, 2007 and was examined on Nov 15, 2011, the child’s age would be 1506 days or 49.48 months. In everyday usage, this age would be stated as 4 years or as 49 months. However, if 49 months were used as the age of all children who were between 49.0 and <50 months in your data, the estimated z-scores would be slightly too high because, on average, these children would be taller, weigh more, and have a higher BMI than children who are exactly 49.0 months of age. This bias would be greater if only completed years of age were known, and the age of all children between 4 and <5 years was represented as 48 months.
sex | Coded as 1 for boys and 2 for girls.
height | Height in cm. This is either standing height (for children who are ≥ 24 months of age) or recumbent length (for children < 24 months of age); both are input as height. If standing height was measured for some children less than 24 months of age, you should add 0.8 cm to these values (see page 8 of http://www.cdc.gov/nchs/data/series/sr_11/sr11_246.pdf). If recumbent length was measured for some children who are ≥ 24 months of age, subtract 0.8 cm.
weight | Weight in kg.
bmi | BMI (weight (kg) / height (m)^2). If your data doesn’t contain BMI, the program calculates it. If BMI is present in your data, the program will not overwrite it.
headcir | Head circumference in cm.
Z-scores and percentiles for variables that are not in mydata will be coded as missing (.) in the output dataset (named _cdcdata). Sex (coded as 1 for boys and 2 for girls) and agemos must be in mydata. It’s unlikely that the SAS code will overwrite other variables in your dataset, but you should avoid having variable names that begin with an underscore, such as _bmi.
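As a minimal sketch of Step 3, the data step below builds mydata from a hypothetical source dataset. The names study.visits, dob, examdate, gender, ht_cm, wt_kg, and measured_standing are assumptions made for illustration and are not part of the CDC program; only the output names (agemos, sex, height, weight) come from Table 1. The divisor 30.4375 (365.25/12) is the average number of days per month, which reproduces the 1506 days = 49.48 months example in Table 1.

```sas
/* Hypothetical input: study.visits with dob and examdate (SAS dates),  */
/* gender ('M'/'F'), ht_cm, wt_kg, and measured_standing (1=yes, 0=no). */
data mydata;
  set study.visits;

  /* Age in months to the nearest day: days / (365.25/12) */
  agemos = (examdate - dob) / 30.4375;

  /* Recode sex as 1 = boys, 2 = girls; other values stay missing */
  if gender = 'M' then sex = 1;
  else if gender = 'F' then sex = 2;

  /* Copy measurements into the names the program expects */
  height = ht_cm;
  weight = wt_kg;

  /* Adjust for measurement method as described in Table 1.        */
  /* The ". <" guard keeps records with a missing age out of the   */
  /* first branch, because SAS sorts missing below any number.     */
  if . < agemos < 24 and measured_standing = 1 then height = height + 0.8;
  else if agemos >= 24 and measured_standing = 0 then height = height - 0.8;

  drop gender ht_cm wt_kg;
run;
```

No bmi variable is created here, so the program would calculate it from weight and height.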
Step 4: Copy and paste the following line into your SAS program after the line (or lines) in step #3.
%include 'c:\sas\growth charts\cdc\data\CDCsourcecode.sas'; run;
If necessary, change this statement to point at the folder containing the downloaded ‘CDCsourcecode.sas’ file. This tells your SAS program to run the statements in ‘CDCsourcecode.sas’.
Step 5: Submit the %include statement. This will create a dataset, named _cdcdata, which contains all of your original variables along with z-scores, percentiles, and flags for extreme values. The names and descriptions of these new variables in _cdcdata are in Table 2. Additional information on the extreme z-scores is given in a separate section that follows the “Example SAS Code”.
Table 2: Z-scores, percentiles, and extreme (biologically implausible, BIV) values in output dataset, _cdcdata
Description | Percentile | Z-score | Modified z-score to identify extreme values | Flag for extreme values | Cutoff for low z-score (flag coded as -1) | Cutoff for high z-score (flag coded as +1)
Weight-for-age for children between 0 and 239 (inclusive) months of age | wapct | waz | _Fwaz | _bivwt | < -5 | > 5
Height-for-age for children between 0 and 239 (inclusive) months of age | hapct | haz | _Fhaz | _bivht | < -5 | > 3
Weight-for-height for children with heights between 45 and 121 cm (this height range approximately covers ages 0 to 6 y) | whpct | whz | _Fwhz | _bivwh | < -4 | > 5
BMI-for-age for children between 24 and 239 months of age | bmipct | bmiz | _Fbmiz | _bivbmi | < -4 | > 5
Head circumference-for-age for children between 0 and 35 (inclusive) months of age | headcpct | headcz | _Fheadcz | _bivhc | < -5 | > 5
Step 6: Examine the new dataset, _cdcdata, with PROC MEANS or some other procedure to verify that the z-scores and other variables have been created. If a variable in Table 1 was not in your original dataset (e.g., head circumference), the output dataset will indicate that all values for the percentiles and z-scores of this variable are missing. If values for other variables are unexpectedly missing, make sure that you’ve renamed and recoded variables as indicated in Table 1 and that your SAS dataset is named mydata. The program should not modify your original data, but will add new variables to your original dataset.
Example SAS code corresponding to steps 2 to 6. You can simply cut and paste these lines into a SAS program, but you’ll need to change the libname and %include statements to point at the folders containing the downloaded files.
libname refdir 'c:\sas\growth charts\cdc\data';
data mydata; set whatever-your-original-dataset-is-named;
%include 'c:\sas\growth charts\cdc\data\CDCsourcecode.sas';
proc means data=_cdcdata; run;
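For a somewhat more detailed check of the output than the single PROC MEANS call, the following sketch summarizes the z-scores and percentiles and tabulates the BIV flags. It uses only the variable names listed in Table 2; the choice of statistics and options is an illustration, not part of the CDC program.

```sas
/* N and NMISS show how many z-scores and percentiles were computed */
proc means data=_cdcdata n nmiss mean std min max;
  var waz haz whz bmiz headcz wapct hapct whpct bmipct headcpct;
run;

/* Tabulate the BIV flags (-1 = low, 0 = plausible, +1 = high); */
/* the MISSING option also counts records with no flag value    */
proc freq data=_cdcdata;
  tables _bivwt _bivht _bivwh _bivbmi _bivhc / missing;
run;
```

A variable whose NMISS equals the number of records was probably absent from mydata or named differently from Table 1.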
Additional Information
Z-scores are calculated as
Z = [ ((value / M)**L) - 1 ] / (S * L),
in which ‘value’ is the child’s BMI, weight, height, etc. The L, M, and S values are in CDCref_d.sas7bdat and vary according to the child’s sex and age or according to the child’s sex and height. Percentiles are then calculated from the z-scores (for example, a z-score of 1.96 would be equal to the 97.5th percentile). For more information on the LMS method, see http://www.ncbi.nlm.nih.gov/pmc/articles/PMC27365/
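To make the formula concrete, here is a small sketch that applies it to a single measurement. The L, M, and S values below are made up for illustration and are not taken from CDCref_d.sas7bdat. The logarithmic branch for L = 0 is the usual limiting form in the LMS method; whether and how CDCsourcecode.sas handles that case is not stated here, so treat it as an assumption.

```sas
/* Illustrative only: L, M, and S are hypothetical values, */
/* not taken from the CDC reference data.                  */
data _null_;
  value = 17.0;                 /* e.g., a child's BMI */
  L = -2.0;  M = 15.5;  S = 0.08;

  /* LMS z-score; the log form is the limit of the formula as L -> 0 */
  if L ne 0 then z = ((value / M)**L - 1) / (S * L);
  else z = log(value / M) / S;

  /* Percentile from the standard normal cumulative distribution */
  pct = 100 * probnorm(z);
  put z= 8.3 pct= 8.1;

  /* Check of the example in the text: z = 1.96 is about the 97.5th percentile */
  pct196 = 100 * probnorm(1.96);
  put pct196= 8.1;
run;
```

The results are written to the SAS log by the PUT statements.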
Extreme or Biologically Implausible Values
The SAS code also flags extreme values (biologically implausible values, or BIVs). As explained in the BIV cutoffs documentation, these BIVs are based on modified z-scores that were calculated using a different method. The BIV flag variables are coded as -1 (modified z-score is extremely low), +1 (modified z-score is extremely high), or 0 (modified z-score is between these 2 cutpoints). These BIV flags, along with other variables that are in the output dataset, _cdcdata, are shown in Table 2.
The modified z-scores (the variables in Table 2 whose names begin with _F, such as _Fbmiz) can be used to construct other cutpoints for extreme (or biologically implausible) values. For example, if the distribution of BMI is strongly skewed to the right, you might use _Fbmiz > 8 (rather than 5) as the definition of an extremely high BMI-for-age. This could be recoded as:
if -5 <= _Fbmiz <= 8 then _bivbmi=0; *plausible;
else if _Fbmiz > 8 then _bivbmi=1; *high BIV;
else if . < _Fbmiz < -5 then _bivbmi=-1; *low BIV;
There are also 2 overall indicators of extreme values in the output dataset: _bivlow and _bivhigh. These 2 variables indicate whether any measurement is extremely high (_bivhigh=1) or extremely low (_bivlow=1). If a child does not have an extreme value for any measurement, both variables are coded as 0. A biologically implausible value is not necessarily incorrect, but the value should be studied further, possibly in conjunction with other characteristics of the child. For example, if a child’s weight is implausibly high, is the child also very tall, and are there other children who weigh nearly as much?
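One way to review flagged records is sketched below. It counts children by the 2 overall indicators and then lists the flagged records together with their measurements and modified z-scores. The variable list is an assumption about what you want to inspect; remove any variable that is not in your output dataset.

```sas
/* Cross-tabulate the overall low and high BIV indicators */
proc freq data=_cdcdata;
  tables _bivlow * _bivhigh / missing norow nocol nopercent;
run;

/* List records with any extreme value for closer review */
proc print data=_cdcdata;
  where _bivlow = 1 or _bivhigh = 1;
  var agemos sex height weight bmi _Fwaz _Fhaz _Fwhz _Fbmiz _Fheadcz;
run;
```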
Defining Extreme Obesity (the 99th percentile of BMI-for-age)
The use of the LMS parameters of the CDC growth charts has been shown to result in inaccurate estimates of the empirical percentiles at very high BMI values (e.g., the 99th percentile; see http://www.ajcn.org/content/90/5/1314.full.pdf). Therefore, rather than using the BMI-for-age percentiles (and z-scores) to identify and track children who are extremely obese, it is recommended that these high BMI values be expressed as a percentage of the 95th percentile. A BMI value that is 20% greater than the 95th percentile (relative to the CDC reference population) is approximately equal to the 99th percentile of the reference population.
The SAS code creates a variable, bmipct95, to simplify the use of this definition. This variable expresses a child’s BMI as a percentage of the 95th percentile for that child’s sex and age. Values of bmipct95 can range from <50 (for very thin children) to >220 (for very heavy children). A child with a bmipct95 of 100 is at the 95th percentile of BMI-for-age. A value of 120 would indicate that the child’s BMI is 20% greater than the 95th percentile.
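As a sketch of how bmipct95 might be used, the step below creates an indicator for a BMI at or above 120% of the 95th percentile. The dataset name bmi_class, the variable name sev_obese, and the choice of >= 120 as the cutoff are assumptions made for illustration; they are not created or defined by the CDC program.

```sas
data bmi_class;                 /* hypothetical output dataset name */
  set _cdcdata;

  /* sev_obese is a hypothetical indicator:                      */
  /* 1 = BMI at or above 120% of the 95th percentile, 0 = below, */
  /* missing when bmipct95 is missing                            */
  if bmipct95 = . then sev_obese = .;
  else if bmipct95 >= 120 then sev_obese = 1;
  else sev_obese = 0;
run;

proc freq data=bmi_class;
  tables sev_obese / missing;
run;
```

The explicit check for a missing bmipct95 matters because SAS treats a missing value as smaller than any number, so without it those records would be coded as 0.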