A SAS Program for the 2000 CDC Growth Charts (ages 0 to <20 years)
Note that the cut points for biologically implausible values (BIVs) were changed in 2016. These changes did not affect the calculation of any of the z-scores or percentiles, or the subsequent calculation of overweight or obesity.
The purpose of this SAS program is to calculate the percentiles and z-scores (standard deviations) for a child’s sex and age for BMI, weight, height, and head circumference based on the CDC growth charts. Weight-for-height percentiles and z-scores are also calculated. Observations that contain extreme values are flagged as being biologically implausible. These extreme values, however, are not necessarily incorrect.
Although the SAS program can be used to calculate z-scores and percentiles for children up to 20 years of age, the World Health Organization (WHO) growth charts are recommended for children <24 months of age. Several computer programs that use the WHO growth charts are available on the WHO and CDC sites; the program on the CDC site follows the same steps as this SAS program for the CDC growth charts.
The SAS program, cdcsourcecode.sas [SAS 8KB] (files are below, in step #1), calculates these z-scores and percentiles for children in your data based on reference data in CDCref_d.sas7bdat. If you’re not using SAS, you can download CDCref_d.csv [CSV 160KB] and create a program based on cdcsourcecode.sas to do the calculations.
Instructions for SAS users
Step 1: Download the SAS program (cdcsourcecode.sas [SAS 8KB]) and the reference data file (CDCref_d.sas7bdat). Do not alter these files, but move them to a folder (directory) that SAS can access.
For the following example, the files have been saved in c:\sas\growth charts\cdc\data.
Step 2: Create a libname statement in your SAS program to point at the folder location of ‘CDCref_d.sas7bdat’. An example would be:
libname refdir 'c:\sas\growth charts\cdc\data';
Note that the SAS code expects this name to be refdir; do not change it.
Step 3: Set your existing dataset containing height, weight, sex, age and other variables into a temporary dataset, named mydata. Variables in your dataset should be renamed and coded as follows:
Table 1
Variable  Description 

agemos  Child's age in months; must be present. The program assumes you know the number of months to the nearest day based on the dates of birth and examination. For example, if a child was born on Oct 1, 2007 and was examined on Nov 15, 2011, the child’s age would be 1506 days or 49.48 months. In everyday usage, this age would be stated as 4 years or as 49 months. However, if 49 months were used as the age of all children who were between 49.0 and <50 months in your data, the estimated z-scores would be slightly too high because, on average, these children would be taller, weigh more, and have a higher BMI than children who are exactly 49.0 months of age. This bias would be greater if only completed years of age were known, and the age of all children between 4 and <5 years was represented as 48 months. If age is known only as the completed number of months (as in data from NHANES 1988-1994 and 1999-2010), consider adding 0.5 so that the maximum error would be 15 days. If age is given as the completed number of years, multiply by 12 and consider adding 6. 
sex  Coded as 1 for boys and 2 for girls. 
height  Height in cm. This is either standing height (for children who are ≥ 24 months of age) or recumbent length (for children < 24 months of age); both are input as height. If standing height was measured for some children less than 24 months of age, you should add 0.8 cm to these values (see page 8 of http://www.cdc.gov/nchs/data/series/sr_11/sr11_246.pdf [PDF 5.4MB]). If recumbent length was measured for some children who are ≥ 24 months of age, subtract 0.8 cm. 
weight  Weight (kg) 
bmi  BMI (weight (kg) / height (m)^2). If your data doesn’t contain BMI, the program calculates it. If BMI is present in your data, the program will not overwrite it. 
headcir  Head circumference (cm) 
Z-scores and percentiles for variables that are not in mydata will be coded as missing (.) in the output dataset (named _cdcdata). Sex (coded as 1 for boys and 2 for girls) and agemos must be in mydata. It’s unlikely that the SAS code will overwrite other variables in your dataset, but you should avoid having variable names that begin with an underscore, such as _bmi.
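The recoding described in Table 1 can be sketched as a single DATA step. This is only an illustration: the input dataset mystudy and all of its variable names (gender, birth_dt, exam_dt, position, ht_cm, wt_kg) are hypothetical, and the divisor 30.4375 (365.25 / 12) is an assumed days-to-months conversion that reproduces the 1506 days = 49.48 months example above.

```sas
/* Hypothetical source data: every variable name in mystudy is made up */
data mystudy;
  input id gender $ birth_dt :date9. exam_dt :date9. position $ ht_cm wt_kg;
  format birth_dt exam_dt date9.;
  datalines;
1 M 01OCT2007 15NOV2011 S 104.0 17.2
2 F 12MAR2010 20JUN2011 S 76.5 9.8
3 M 05JAN2009 05JUL2011 L 92.3 13.9
;
run;

data mydata;
  set mystudy;

  /* Age in months from the two dates; 30.4375 = 365.25 / 12 (assumed) */
  agemos = (exam_dt - birth_dt) / 30.4375;
  /* If only completed months were known, one could instead use  */
  /* agemos = completed_months + 0.5 (completed_months is made up) */

  /* Sex must be numeric: 1 = boys, 2 = girls */
  if gender = 'M' then sex = 1;
  else if gender = 'F' then sex = 2;

  /* position: S = standing height, L = recumbent length (hypothetical coding) */
  height = ht_cm;
  if position = 'S' and . < agemos < 24 then height = ht_cm + 0.8;
  else if position = 'L' and agemos >= 24 then height = ht_cm - 0.8;

  weight = wt_kg;
run;
```

Because bmi and headcir are not created here, the program would calculate BMI itself and would return missing values for the head circumference z-score and percentile.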
Step 4: Copy and paste the following line into your SAS program after the line (or lines) in step #3.
%include 'c:\sas\growth charts\cdc\data\CDCsourcecode.sas'; run;
If necessary, change this statement to point at the folder containing the downloaded ‘CDCsourcecode.sas’ file. This tells your SAS program to run the statements in ‘CDCsourcecode.sas’.
Step 5: Submit the %include statement. This will create a dataset, named _cdcdata, which contains all of your original variables along with z-scores, percentiles, and flags for extreme values. The names and descriptions of these new variables in _cdcdata are in Table 2. Additional information on the extreme z-scores is given in a separate section that follows the “Example SAS Code”.
Table 2: Z-scores, percentiles, and extreme (biologically implausible, BIV) values in output dataset, _cdcdata
Description | Percentile variable | Z-score variable | Modified z-score to identify extreme values | Flag for extreme values | Cutoff for low z-score (flag coded as -1) | Cutoff for high z-score (flag coded as +1)
Weight-for-age for children between 0 and 239 (inclusive) months of age | wapct | waz | _Fwaz | _bivwt | < -5 | > 8*
Height-for-age for children between 0 and 239 (inclusive) months of age | hapct | haz | _Fhaz | _bivht | < -5 | > 4*
Weight-for-height for children with heights between 45 and 121 cm (this height range approximately covers ages 0 to 6 y) | whpct | whz | _Fwhz | _bivwh | < -4 | > 8*
BMI-for-age for children between 24 and 239 months of age | bmipct | bmiz | _Fbmiz | _bivbmi | < -4 | > 8*
Head circumference-for-age for children between 0 and 35 (inclusive) months of age | headcpct | headcz | _Fheadcz | _bivhc | < -5 | > 5
* Changed in 2016. Additional information is below.
Step 6: Examine the new dataset, _cdcdata, with PROC MEANS or some other procedure to verify that the z-scores and other variables have been created. If a variable in Table 1 was not in your original dataset (e.g., head circumference), the output dataset will indicate that all values for the percentiles and z-scores of this variable are missing. If values for other variables are unexpectedly missing, make sure that you’ve renamed and recoded variables as indicated in Table 1 and that your SAS dataset is named mydata. The program should not modify the values of your original variables; it only adds the new variables alongside them in _cdcdata.
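One possible way to carry out the check in step 6 is sketched below. It assumes that step 5 has already been run so that _cdcdata exists, and it uses only variable names listed in Table 2 plus bmipct95, which is described later on this page. The choice of statistics is an illustration, not part of the CDC program.

```sas
/* Counts of non-missing and missing values, plus ranges, for the z-scores */
proc means data=_cdcdata n nmiss mean min max maxdec=2;
  var agemos waz haz whz bmiz headcz bmipct95;
run;

/* Distribution of the BIV flags, including missing values */
proc freq data=_cdcdata;
  tables _bivwt _bivht _bivwh _bivbmi _bivhc / missing;
run;
```

A z-score variable whose N is 0 usually means that the corresponding input variable was absent from mydata or was not named as in Table 1.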
Example SAS code corresponding to steps 2 to 6. You can simply cut and paste these lines into a SAS program, but you’ll need to change the libname and %include statements to point at the folders containing the downloaded files.
libname refdir 'c:\sas\growth charts\cdc\data';
data mydata; set whateveryouroriginaldatasetisnamed; run;
%include 'c:\sas\growth charts\cdc\data\CDCsourcecode.sas'; run;
proc means data=_cdcdata; run;
Additional Information
Z-scores are calculated as
Z = [ ((value / M)**L) - 1 ] / (S * L) ,
in which ‘value’ is the child’s BMI, weight, height, etc. The L, M, and S values are in CDCref_d.sas7bdat and vary according to the child’s sex and age or according to the child’s sex and height. Percentiles are then calculated from the z-scores (for example, a z-score of 1.96 would be equal to the 97.5th percentile). For more information on the LMS method, see http://www.ncbi.nlm.nih.gov/pmc/articles/PMC27365/
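The formula can be tried out in a short DATA step. The sketch below is not taken from cdcsourcecode.sas: the L, M, and S numbers are made up rather than read from the reference file, the log form for L = 0 is the standard limiting case of the LMS method, and PROBNORM (the standard normal cumulative distribution function in SAS) is used to turn the z-score into a percentile.

```sas
data _null_;
  /* Hypothetical inputs; real L, M, S values come from CDCref_d.sas7bdat */
  value = 16.5;
  L = -2.0;
  M = 15.5;
  S = 0.08;

  /* LMS z-score; the log form is the limiting case when L is 0 */
  if L ne 0 then z = ((value / M)**L - 1) / (S * L);
  else z = log(value / M) / S;

  /* Percentile from the standard normal distribution */
  pct = 100 * probnorm(z);

  put z= pct=;
run;
```

Replacing z with 1.96 in the PROBNORM call gives the 97.5th percentile mentioned above.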
Extreme or Biologically Implausible Values
As explained in the Modified z-scores documentation [PDF 367KB], the SAS code also calculates modified z-scores that can be used to identify extreme values that may be errors. These modified z-scores are based on extrapolating one-half of the distance between 0 and +2 z-scores to the tails of the distribution. The output from the SAS program contains BIV flag variables that are coded as -1 (modified z-score is extremely low), +1 (modified z-score is extremely high), or 0 (modified z-score is between these 2 cutpoints). These BIV flags (e.g., _bivbmi), along with other variables that are in the output dataset, _cdcdata, are shown in Table 2. A biologically implausible value is not necessarily incorrect, but the value should be further examined, possibly in conjunction with other characteristics of the child.
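A rough sketch of the extrapolation for a value above the median is given below. It is an interpretation of the description above, not a copy of cdcsourcecode.sas: the measurement at a z-score of +2 is obtained by inverting the LMS formula, half of its distance from the median M is treated as one unit, and the child's distance from M is expressed in those units. The L, M, and S numbers are made up, L is assumed to be non-zero, and values below the median are not handled.

```sas
data _null_;
  /* Hypothetical inputs; real L, M, S values come from CDCref_d.sas7bdat */
  value = 30;
  L = -2.0;
  M = 15.5;
  S = 0.08;

  /* Measurement corresponding to z = +2 (inverse of the LMS formula, L ne 0) */
  value_at_z2 = M * (1 + L * S * 2)**(1 / L);

  /* One-half of the distance between z = 0 (the median) and z = +2 */
  sd_unit = (value_at_z2 - M) / 2;

  /* Modified z-score, upper side only in this sketch */
  if value >= M then modz = (value - M) / sd_unit;

  put value_at_z2= sd_unit= modz=;
run;
```

Unlike the LMS z-score, this quantity keeps increasing linearly with the measurement, which is why it is better suited to screening for extreme values.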
2016 Change to BIV cutpoints: Rationale
The cutpoints for the modified z-scores that define the upper range of valid values were changed in 2016 for several of the growth chart measures. Previously, the cutpoints for extremely high values were based on recommendations from a 1995 WHO publication (1), but several papers (2–6) have since indicated that these cutpoints were probably too restrictive. The WHO cutpoints identified many values that were extremely high, but were probably not errors.
On the basis of an analysis of 2- to 18-year-olds in NHANES 1999-2000 through 2011-2012 (3) and 2- to 4-year-olds in CDC’s Pediatric Nutrition Surveillance System (PedNSS) (6), we now suggest that the upper BIV cut points be increased from
(1) +5 to +8 for modified z-scores for weight and BMI, and
(2) +3 to +4 for modified z-scores for height.
These new z-score cutpoints roughly correspond to the modified z-scores for the maximum values of the body size measures among 2- to 18-year-olds in NHANES. We are not making changes to the cutpoints for the extremely low values of the body size measurements.
If BIV cutpoints are used to exclude data, this change would likely affect comparisons of data calculated and cleaned using these new BIV cutpoints with data that used the older (WHO 1995) values. The effects of these changes will likely differ across datasets depending upon the true prevalence of extreme values and the accuracy of the recorded data. In an analysis (3) of NHANES 1999-2012 data, for example, as compared with estimates obtained using the WHO 1995 cutpoints, the use of the 2016 cutpoints increased the prevalence of obesity and extreme obesity (≥120% of the 95th percentile of BMI-for-age) by about 0.5 percentage points. (Because of the extensive data cleaning in NHANES, published estimates from these surveys do not exclude any of the extremely high values.) In an analysis of PedNSS (6), compared with the WHO 1995 cutpoints, the use of the 2016 cut points increased the prevalence of both obesity and extreme obesity by 0.9 percentage points. Because of the relatively low prevalence of extreme obesity among children, particularly preschool children, an increase of 0.5 to 0.9 percentage points results in a large proportional change in prevalence.
BIVs vs. Data Errors
These BIVs can be used to flag potentially problematic data points, and the 2016 cutpoints were chosen to balance the inclusion of extreme values that are likely to be correct and the exclusion of those that are likely to be incorrect. However, other cutpoints can be used and may be more appropriate based on other information specific to your data. If desired, the modified z-scores (3rd column of Table 2) can be used to construct other cutpoints for extreme (or biologically implausible) values rather than relying on the BIV flag variables. For example, if you feel that use of the BMI-for-age cutpoint of +8 would result in the inclusion of many values that are likely to be errors, you could use _Fbmiz > 6 as the definition of a high BMI BIV. This could be recoded in the output dataset as:
if -4 <= _Fbmiz <= 6 then _bivbmi = 0; *plausible;
else if _Fbmiz > 6 then _bivbmi = 1; *high BIV;
else if . < _Fbmiz < -4 then _bivbmi = -1; *low BIV;
It would also be possible to use the modified z-scores to identify children who would have been flagged with the older WHO cutpoints.
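A minimal sketch of flagging with the older upper cutpoints is shown below. It relies on the statement above that the previous upper cutpoints were +5 for weight and BMI and +3 for height, and that the low cutpoints were not changed. The dataset demo_cdcdata and its values are made up so that the example runs on its own; in practice you would read _cdcdata instead. The who95_ variable names are hypothetical.

```sas
/* Made-up stand-in for _cdcdata with modified z-scores and 2016 flags */
data demo_cdcdata;
  input id _Fwaz _Fbmiz _Fhaz _bivwt _bivbmi _bivht;
  datalines;
1 0.4 0.9 -0.2 0 0 0
2 6.1 6.8 1.0 0 0 0
3 -5.6 -4.5 -5.2 -1 -1 -1
4 9.2 8.5 3.4 1 1 0
;
run;

data cdc_who1995;
  set demo_cdcdata;
  array modz    {3} _Fwaz _Fbmiz _Fhaz;
  array newbiv  {3} _bivwt _bivbmi _bivht;
  array oldbiv  {3} who95_bivwt who95_bivbmi who95_bivht;  /* hypothetical names */
  array oldhigh {3} _temporary_ (5 5 3);                   /* pre-2016 upper cutpoints */

  do i = 1 to 3;
    oldbiv{i} = newbiv{i};                       /* low flags (-1) are unchanged */
    if modz{i} > oldhigh{i} then oldbiv{i} = 1;  /* stricter older upper cutpoint */
  end;
  drop i;
run;

/* Compare the 2016 flag with the older flag for BMI */
proc freq data=cdc_who1995;
  tables _bivbmi * who95_bivbmi / missing norow nocol nopercent;
run;
```

A missing modified z-score is never greater than a cutpoint in SAS, so such records simply keep whatever value the 2016 flag has.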
Once a data point has been flagged as a potential problem, other information from the child, if available, could be used to help identify errors and help in the decision to include or exclude the value. For example, if a child with an extremely high BMI also has a high skinfold thickness or arm circumference, the BMI value is more likely to be correct than if the other measure is low. Similarly, in a longitudinal study, one could assess whether a child with an extreme value at 1 time point also has a high value at other examinations. If only weight and height are available at a single examination, one might consider whether a child who has an extremely high weight is also very tall, and if there are other children who weigh nearly as much.
Defining Extreme Obesity (the 99th percentile of BMI-for-age)
The use of the LMS parameters of the CDC growth charts has been shown to result in inaccurate estimates of the empirical percentiles at very high BMI values (e.g., the 99th percentile); see http://www.ajcn.org/content/90/5/1314.full.pdf [PDF 154KB]. Therefore, rather than using the BMI-for-age percentiles (and z-scores) to identify and track children who are extremely obese, it is recommended that these high BMI values be expressed as a percentage of the 95th percentile. A BMI value that is 20% greater than the 95th percentile (relative to the CDC reference population) is approximately equal to the 99th percentile of the reference population.
The SAS code creates a variable, bmipct95, to simplify the use of this definition. This variable expresses a child’s BMI as a percentage of the 95th percentile for that child’s sex and age. Bmipct95 can range from <50 (for very thin children) to >220 (for very heavy children). A child with a bmipct95 of 100 is at the 95th percentile of BMIforage. A value of 120 would indicate that the child’s BMI is 20% greater than the 95th percentile.
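As an illustration of how bmipct95 might be used, the sketch below sorts children into three groups. The dataset demo_bmi and its values are made up so that the example runs on its own (in practice you would read _cdcdata), the variable bmi_group and its labels are hypothetical, and obesity is taken to be a BMI at or above the 95th percentile, that is, a bmipct95 of 100 or more.

```sas
/* Made-up stand-in for _cdcdata */
data demo_bmi;
  input id bmipct95;
  datalines;
1 82.4
2 100.0
3 118.9
4 131.5
5 .
;
run;

data bmi_class;
  set demo_bmi;
  length bmi_group $ 16;   /* hypothetical variable name and labels */
  if missing(bmipct95) then bmi_group = ' ';
  else if bmipct95 >= 120 then bmi_group = 'extreme obesity';
  else if bmipct95 >= 100 then bmi_group = 'obesity';
  else bmi_group = 'below 95th pct';
run;

proc freq data=bmi_class;
  tables bmi_group / missing;
run;
```

The groups are mutually exclusive in this sketch; to report all children with obesity, the 'obesity' and 'extreme obesity' groups would be combined.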
References
1. WHO Expert Committee. Physical status: the use and interpretation of anthropometry. WHO Tech. Rep. Ser. 1995;pages 217 to 250.
2. Lawman HG, Ogden CL, Hassink S, et al. Comparing methods for identifying biologically implausible values in height, weight, and Body Mass Index among youth. Am. J. Epidemiol. 2015;182(4):359–65.
3. Freedman DS, Lawman HG, Skinner AC, et al. Validity of the WHO cutoffs for biologically implausible values of weight, height, and BMI in children and adolescents in NHANES from 1999 through 2012. Am. J. Clin. Nutr. 2015;102(5):1000–6.
4. Lo JC, Maring B, Chandra M, et al. Prevalence of obesity and extreme obesity in children aged 3-5 years. Pediatr. Obes. 2014;9(3):167–75.
5. Dennison BA, Edmunds LS, Stratton HH, et al. Rapid infant weight gain predicts childhood overweight. Obesity (Silver Spring). 2006;14(3):491–9.
6. Freedman DS, Lawman HG, Pan L, et al. The prevalence and validity of high, biologically implausible values of weight, height and BMI among 8.8 million children. Obesity (Silver Spring). 2016;Mar 17. PMID 26991694.
 Page last reviewed: October 27, 2016
 Page last updated: October 27, 2016