The SAS Program for CDC Growth Charts that Includes the Extended BMI Calculations
(Updated Jan 9, 2023)
Released December 15, 2022, these charts extend to a BMI of 60
Overview
Note that the calculations for BMI z-scores and percentiles for 2- to 19-year-olds with obesity (BMI ≥ 95^{th} percentile for a child’s sex and age) changed on Dec 15, 2022. See the section on the extended BMI percentiles and z-scores for more information.
This SAS program calculates percentiles and z-scores (standard deviations) for a child’s sex and age for BMI, weight, height, and head circumference from the CDC growth charts (1). It also calculates weight-for-height z-scores and percentiles. In addition, the program allows for the identification of outliers. These extreme values, however, are not necessarily incorrect and should be reviewed for possible inclusion or exclusion.
Although the SAS program calculates z-scores and percentiles for children up to 20 years of age, the World Health Organization (WHO) growth charts are recommended for children < 24 months of age. Several programs on the WHO and CDC websites are based on the WHO growth charts.
Note that the calculations for BMI z-scores and percentiles for 2- to 19-year-olds with obesity (≥ 95^{th} percentile, a z-score of 1.645) changed on Dec 15, 2022, to use extended BMIz. See the section on the extended BMI percentiles and z-scores for more information.
The SAS program, cdcsourcecode.sas (the files are listed in Step 1 below), calculates these z-scores and percentiles for children in your data from the reference data in CDCref_d.sas7bdat. It uses the LMS-based values for children without obesity and the extended BMI percentiles and z-scores for children with obesity. Note that the z-scores and percentiles calculated for children with obesity will differ from earlier (pre-2022) versions of this SAS program. If you’re not using SAS or R, you can download CDCref_d.csv and create a program based on cdcsourcecode.sas.
Instructions for SAS users
Step 1: Download the SAS program (cdcsourcecode.sas) and the reference data file (CDCref_d.sas7bdat). Move these files to a folder (directory) that SAS can access. For the following example, the files are in c:\sas_growth_charts.
Example SAS code corresponding to Steps 2 to 6 is shown below. After downloading the SAS code and the reference data, you can cut and paste the following 4 lines into your SAS program. You’ll likely need to change the libname and %include statements to point at the folder/directory for the downloaded files. You’ll also probably have to rename and recode variables, as explained in Steps 2 to 6.
libname refdir 'c:\sas_growth_charts';
data mydata; set whateveryouroriginaldatasetisnamed;
%include 'c:\sas_growth_charts\cdcsourcecode.sas';
proc means data=_cdcdata; run;
Step 2: Create a libname statement in your SAS program to point at the folder location of ‘CDCref_d.sas7bdat’. An example would be:
libname refdir 'c:\sas_growth_charts';
Note: the SAS code expects this name to be refdir – make sure you specify this in the libname statement.
Step 3: Set your existing data that contains height, weight, sex, age, and other variables into a temporary dataset named mydata. Rename and code the variables as follows (Table 1):
Variable  Description of variables and coding in the input dataset, mydata 

agemos  Age in months. Agemos must be in your dataset, and the program assumes that you know the child’s age to the nearest day, expressed in months. For example, if a child were born on Oct 1, 2007, and examined on Nov 15, 2011, the child’s age would be 1506 days or 49.48 (1506 / 30.4375) months. In everyday usage, this child’s age would be 4 years or 49 months. However, if 49 months were used for all children between 49.0 and < 50 months of age, then most of the calculated z-scores would be too high because, on average, these children would be taller and heavier than children who are exactly 49.0 months of age. If only the completed number of months is known (as in NHANES), add 0.5 to the age so that the maximum error would be 15 days. If age represents the completed years (e.g., 13 years), multiply by 12 and add 6. If age is in days, divide by 30.4375. 
sex  Sex must be coded as 1 for boys and 2 for girls. 
height  Height (cm). Height is either standing height (for children ≥ 24 months of age) or recumbent length (for children < 24 months). If standing height was measured for children under 24 months of age, you should add 0.8 cm to these values (see page 8 of https://www.cdc.gov/nchs/data/series/sr_11/sr11_246.pdf). If recumbent length was measured for children ≥ 24 months, subtract 0.8 cm. 
weight  Weight (kg) 
bmi  BMI [Weight (kg) / Height (m)^{2}]. The program calculates BMI if it is not present in your data but will not overwrite BMI if present. 
headcir  Head circumference (cm) 
Z-scores and percentiles for anthropometric variables that are not in mydata (or that are missing) will be coded as missing (.) in the output dataset, _cdcdata. It’s unlikely that the SAS code will overwrite variables in your dataset, but you should avoid having variable names that begin with an underscore or with ‘mod_’.
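The renaming and recoding in Table 1 could be done in the data step that creates mydata. The following is a minimal sketch, not part of the CDC program. The input variable names (birth_date, exam_date, gender, ht_cm, wt_kg, ht_method) and their codings are hypothetical and would need to be replaced with those in your data.

```sas
data mydata;
  set whateveryouroriginaldatasetisnamed;

  /* Age in months from two SAS date values (hypothetical variable names) */
  agemos = (exam_date - birth_date) / 30.4375;

  /* Recode a hypothetical character variable to 1 = boys, 2 = girls */
  if gender = 'M' then sex = 1;
  else if gender = 'F' then sex = 2;

  /* Measurements in the units expected by the program (cm and kg) */
  height = ht_cm;
  weight = wt_kg;

  /* Adjust for how height was measured (hypothetical ht_method variable) */
  if ht_method = 'standing' and . < agemos < 24 then height = ht_cm + 0.8;
  else if ht_method = 'recumbent' and agemos >= 24 then height = ht_cm - 0.8;
run;
```

For the child in the agemos example (born Oct 1, 2007, examined Nov 15, 2011), exam_date - birth_date is 1506 days, so agemos would be about 49.48.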
Step 4: Copy and paste the following line into your SAS program after the line (or lines) in Step #3.
%include 'c:\sas_growth_charts\cdcsourcecode.sas'; run;
If necessary, change this statement to point at the folder/directory containing the downloaded cdcsourcecode.sas file. The %include will run your data through cdcsourcecode.sas and create a dataset named _cdcdata.
Step 5: The output dataset, _cdcdata, contains your original data along with the z-scores, percentiles, and flags for extreme values shown in Table 2. Additional information on the extreme z-scores is given in the Extreme Values, Implausible Values, and Data Errors section.
Step 6: Examine the new dataset, _cdcdata, to verify that the z-scores and other variables have been created. If z-scores and percentiles for a variable in your dataset are unexpectedly missing, make sure that (1) your dataset is named mydata, and (2) the variables are named and coded as shown in Table 1. The program will not modify your original data but adds new variables to your dataset. Table 2 shows the names and descriptions of several variables in _cdcdata.
Description  Percentile  Z-score  Modified z-score to identify extreme values^{b}  Flag for extreme values  Cutoff for low extreme z-scores^{a} (Flag = -1)  Cutoff for high extreme z-scores^{a} (Flag = +1)

Weight-for-age for children aged 0 to < 240 months  wapct  waz  mod_waz  _bivwt  < -5  > 8
Height-for-age for children aged 0 to < 240 months  hapct  haz  mod_haz  _bivht  < -5  > 4
Weight-for-height for children with heights from 45 to 121 cm (these heights approximately correspond to ages 0 to 6 years)  whpct  whz  mod_whz  _bivwh  < -4  > 8
BMI-for-age for children aged 24 to < 240 months  bmipct  bmiz  mod_bmiz  _bivbmi  < -4  > 8
Head circumference-for-age for children aged 0 to < 36 months  headcpct  headcz  mod_headcz  _bivhc  < -5  > 5
Original calculations for BMI^{c}  original_bmipct  original_bmiz
a. Several cut points were changed in 2016.
b. The names of the modified z-scores were changed in Dec 2022. Previously, they began with ‘_F’.
c. See the sections on the LMS Method and Extended BMI percentiles and z-scores.
Other variables in _cdcdata are shown in the following table.
Variable  Description 

bmi50 and bmi95  Sex- and age-specific 50^{th} and 95^{th} percentiles of BMI in the CDC growth charts 
bmip50 and bmip95  BMI expressed as a percentage of CDC’s 50^{th} and 95^{th} percentiles 
bmi120  The BMI value that is 120% of the CDC 95^{th} percentile 
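One way to carry out the check described in Step 6 is to summarize the new variables in _cdcdata. This is a sketch that assumes the program has already run; it uses only variable names listed in Table 2.

```sas
/* Counts of non-missing and missing values, and ranges, for each z-score */
proc means data=_cdcdata n nmiss min mean max;
  var waz haz whz bmiz headcz;
run;

/* Distribution of the extreme-value flags (-1, 0, +1), including missing */
proc freq data=_cdcdata;
  tables _bivwt _bivht _bivwh _bivbmi _bivhc / missing;
run;
```

A z-score that is missing for every record usually points to a naming or coding problem in mydata rather than to the data values themselves.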
LMS Method
The LMS (lambda, mu, sigma) method calculates BMI z-scores as
Z-score = ((BMI / M)^{L} - 1) / (L × S) [equation 1]
The L (transformation for normality), M (median), and S (coefficient of variation) values for the CDC growth charts, which vary by sex and month of age, are in CDCref_d.sas7bdat. These z-scores are then transformed into percentiles with the SAS probnorm function. For example, a z-score of 1.645 is the 95^{th} percentile. For more information on the LMS method, developed by Tim Cole and PJ Green in the 1990s, see http://www.ncbi.nlm.nih.gov/pmc/articles/PMC27365 and http://www.ncbi.nlm.nih.gov/pubmed/1518992. Sex- and age-specific L, M, and S values are at https://www.cdc.gov/growthcharts/percentile_data_files.htm.
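As an illustration of equation 1, the following data step computes a z-score and percentile for a single BMI. The L, M, and S values here are hypothetical placeholders chosen only to show the arithmetic. The actual values depend on sex and age and come from the reference data.

```sas
data _null_;
  /* Hypothetical LMS values, for illustration only */
  L = -2.5;
  M = 16.0;
  S = 0.10;
  bmi = 18.0;

  z   = ((bmi / M)**L - 1) / (L * S);   /* equation 1 */
  pct = 100 * probnorm(z);              /* z-score to percentile (0 to 100) */

  put z= pct=;                          /* writes both values to the SAS log */
run;
```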
The LMS method for BMI results in a curvilinear relation between BMI and BMIz, as shown in the following figure for children aged 5, 12, and 18 years. The range of BMIs (x-axis) corresponds to those observed in NHANES 1999-2000 through 2017-2018. At low BMIs, a small change in BMI results in a sizeable BMIz change. In contrast, at very high BMIs, the same BMI change results in a much smaller BMIz change; this is most evident among 12-year-old males and 18-year-old females.
Figure 1. Relation of BMI to BMIz at three ages. BMIz was calculated using the LMS values and equation #1.
Further, because L is negative for BMI, (BMI ÷ M)^{L} in the LMS equation approaches 0 as a child’s BMI becomes very large relative to the median BMI, and the maximum BMIz value that is possible at that sex/age is (-1) ÷ (L × S). For most ages over 5 years, the maximum possible BMIz, regardless of the magnitude of the BMI, is < 4.0 SDs. Further, among 7- to 15-year-old males and 15- to 19-year-old females, BMIz cannot be > 3.3 SDs, limiting the usefulness of these z-scores in characterizing the extremely high BMIs (e.g., ≥ 40 kg/m^{2}) shown in the figure above.
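The upper bound follows directly from equation 1. As a sketch of the reasoning, assuming L < 0 (as is the case for BMI in the CDC reference data):

```latex
\lim_{\mathrm{BMI} \to \infty} z
  = \lim_{\mathrm{BMI} \to \infty} \frac{(\mathrm{BMI}/M)^{L} - 1}{L \times S}
  = \frac{0 - 1}{L \times S}
  = \frac{-1}{L \times S}
```

For example, with hypothetical values of L = -2.5 and S = 0.12 (not taken from the reference data), the bound would be -1 / (-2.5 × 0.12) ≈ 3.33 SDs, however large the BMI.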
The CDC 2000 growth charts were based on data collected from 1963 to 1980 for most children, and it was advised that extrapolation beyond the 97^{th} percentile be done cautiously (1). Further, the 2000 CDC growth charts’ BMI z-scores were not intended for use among children with extremely high BMI values (2,3). Several other studies have also highlighted the limitations of LMS-calculated BMIz in characterizing very high BMIs (4–6).
Extended BMI percentiles and z-scores
To explore alternative metrics for BMI, NCHS convened a workshop in 2018 and published a 2022 report (7) that evaluated several alternatives to LMS-based BMIz. This report recommended that ‘extended BMIz’ and ‘extended BMI percentiles’ be used to characterize the BMIs of children with obesity (BMI ≥ 95^{th} percentile for a child’s sex and age). These extended metrics were constructed from the BMIs of children with obesity in the CDC growth chart reference population and more recent NHANES surveys (through 2015-2016). These BMI data were modeled within each sex and 6-month age group as a half-normal distribution, a truncated normal distribution with only values at or to the right of the peak having a probability density greater than 0 (8). Characterizing these distributions’ shape parameter, sigma, allows calculating BMI percentiles for children with obesity, even those with extremely high BMIs. These percentiles can then be transformed into z-scores.
To facilitate the use of these extended metrics, as of Dec 15, 2022 (7), the calculated values for BMIz and BMI percentile (bmiz and bmipct) in the SAS program combine the LMS-based values for children without obesity with the extended values for children with obesity. Therefore, the original BMI metrics, constructed using only the L, M, and S parameters, have been renamed as original_bmiz and original_bmipct. Note that bmiz and original_bmiz (and bmipct and original_bmipct) are identical for children without obesity.
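To see how the Dec 2022 change affects your own data, the two sets of z-scores can be compared directly. This is a minimal sketch that assumes _cdcdata has already been created. The names compare, z_diff, and obese are hypothetical and introduced here only for illustration.

```sas
data compare;
  set _cdcdata;
  where bmiz ne . and original_bmiz ne .;
  z_diff = bmiz - original_bmiz;   /* expected to be 0 for children without obesity */
  obese  = (bmip95 >= 100);        /* BMI at or above the CDC 95th percentile */
run;

/* Range of the differences, separately for children with and without obesity */
proc means data=compare n min mean max;
  class obese;
  var z_diff;
run;
```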
The following figure shows the relation of BMI to both the original and new (extended) values of BMIz. The dashed lines represent the original, LMS-based BMI z-scores from the 2000 CDC growth charts, whereas the solid lines represent the extended bmiz values for BMIs ≥ 95^{th} percentile (z-score = 1.645 SDs). Among children without obesity, the LMS-based z-scores and the new BMI z-scores are identical. At higher BMIs, the relation of BMI to bmiz is fairly linear and does not approach a horizontal asymptote. However, the extended BMIz values are lower than the original values for some BMIs above the 95^{th} percentile, which is most evident for 5- and 18-year-old males in the figure. These lower values arise because children with obesity in more recent NHANES surveys have higher BMIs than those in the original CDC reference population.
Figure 2. Relation of BMI to Original and New (Extended) BMIz at three ages. Dashed lines represent the original z-scores; solid lines are the new z-scores.
Severe Obesity
Because the original LMS-based z-scores for very high BMIs resulted in percentiles that differ from those estimated from the data (3), a BMI ≥ 120% of the CDC 95^{th} percentile has been widely used for the classification of severe obesity since 2013 (9). This cut point is approximately equal to the empirical 99^{th} percentile in the growth charts (3). However, among older adolescents, a BMI can be ≥ 35 kg/m^{2} but be less than 120% of the 95^{th} percentile. Therefore, severe obesity is defined as either a BMI ≥ 120% of the 95^{th} percentile or a BMI ≥ 35 kg/m^{2}; this aligns with guidelines from the American Heart Association (9) and the American Academy of Pediatrics (10).
The program outputs the variable bmip95, which expresses a child’s BMI as a percentage of the CDC 95^{th} percentile and can range from below 50 to over 220. For example, a bmip95 of 140 would indicate that the child has a BMI equal to 1.4 times the 95^{th} percentile. If desired, one can also calculate the arithmetic difference between a child’s BMI and the CDC 95^{th} percentile. For example, the CDC 95^{th} percentile for a 60-month-old boy is 17.9 kg/m^{2}. If this 5-year-old had a BMI of 21.3 kg/m^{2}, the arithmetic difference would be 3.4 kg/m^{2} (21.3 - 17.9), and bmip95 would be 119% (100 × 21.3/17.9).
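The definition of severe obesity above could be coded from the output variables as follows. This is a sketch; classified, sev_obesity, and bmi_diff95 are hypothetical names, not variables created by the CDC program.

```sas
data classified;
  set _cdcdata;

  /* Severe obesity: BMI >= 120% of the 95th percentile or BMI >= 35 kg/m2, */
  /* restricted to the BMI-for-age range of 24 to < 240 months              */
  if bmi = . or bmip95 = . or not (24 <= agemos < 240) then sev_obesity = .;
  else sev_obesity = (bmip95 >= 120 or bmi >= 35);

  /* Arithmetic difference between BMI and the CDC 95th percentile (kg/m2) */
  bmi_diff95 = bmi - bmi95;
run;
```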
Extreme Values, Implausible Values, and Data Errors
As explained in the Modified z-scores documentation, the SAS code also calculates modified z-scores that can be used to identify extreme values that may be errors. These modified z-scores were computed by extrapolating one-half of the distance between 0 and +2 (or between 0 and -2) z-scores to the distribution’s tails. Although these z-scores were developed to identify outliers at a single examination, they have been incorporated into algorithms for cleaning longitudinal data (11).
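Written out, the description above corresponds to something like the following, where X is the measurement, M is the median, and X_{+2} and X_{-2} are the values at z-scores of +2 and -2 obtained by inverting equation 1. This is a sketch of the approach as described here; the Modified z-scores documentation remains the authoritative definition.

```latex
X_{z} = M \,(1 + L \times S \times z)^{1/L}

z_{\mathrm{mod}} =
\begin{cases}
  \dfrac{X - M}{\tfrac{1}{2}\,(X_{+2} - M)} & \text{if } X \ge M \\[2ex]
  \dfrac{X - M}{\tfrac{1}{2}\,(M - X_{-2})} & \text{if } X < M
\end{cases}
```

Under this form, the modified z-score changes linearly with X on each side of the median, so it is not bounded in the way the LMS-based z-score is.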
The output from the SAS program contains biologically implausible value (BIV) flag variables for weight, height, and BMI that are coded as -1 (modified z-score is very low), +1 (modified z-score is very high), or 0 (modified z-score is between these 2 cut points). These BIV flags in the output dataset, _cdcdata, are listed in Table 2. It is essential to realize that an extreme value is not necessarily incorrect, but the value should be further examined, possibly in conjunction with other characteristics of the child.
The upper thresholds for the modified z-score cut points were initially based on a 1995 WHO publication (12) but were changed in 2016. Several papers (13–15) showed that these cut points excluded many children whose weight, height, or BMI were very likely to have been recorded correctly. These BIVs can flag potentially problematic data points, but the BIV cut points are not a gold standard. The cut points were chosen to balance the inclusion of extreme values that are likely to be correct and the exclusion of those that are likely to be incorrect (14,15).
Based on the results of these papers, the upper cut points were increased in 2016 from
(1) +5 to +8 for modified z-scores for weight and BMI, and
(2) +3 to +4 for modified z-scores for height.
These new z-score cut points roughly correspond to the modified z-scores for the maximum values of the body size measures among 2- to 18-year-olds in NHANES at many, but not all, ages. However, please be careful in using these cut points to exclude data, as different decisions could alter the prevalence of obesity and severe obesity by up to 1% (14,15).
Other cut points for the modified z-scores may be more appropriate based on additional information in your data. For example, does a child with an extremely high BMI also have a high skinfold thickness or arm circumference, or is the child very tall? If so, the very high BMI value is more likely to be correct. Similarly, in a longitudinal study or an analysis of electronic health records (EHR), one could assess whether a child has extreme values of weight and BMI at multiple examinations.
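In longitudinal data, one simple check along these lines is to count how often each child is flagged. The following is a sketch using PROC SQL, in which child_id is a hypothetical identifier variable that would need to be present in mydata; biv_by_child and the count variables are also hypothetical names.

```sas
proc sql;
  create table biv_by_child as
  select child_id,
         count(*)         as n_exams,      /* number of records for the child */
         sum(_bivbmi = 1) as n_high_bmi,   /* records flagged as high BMI BIV */
         sum(_bivwt = 1)  as n_high_wt     /* records flagged as high weight BIV */
  from _cdcdata
  group by child_id
  order by n_high_bmi desc;
quit;
```

A child who is flagged at most or all examinations may warrant a different decision than a child flagged at a single examination.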
Although +8 SDs is the threshold for a high BMI BIV, two young (< 5 years) boys in NHANES (2005-2006 and 2017-2018) have a modified BMIz > 11 SDs. Further, electronic health record datasets that comprise millions of children indicate that many children consistently have a modified BMIz between 10 and 12 SDs at consecutive examinations. growthcleanr (11), an R package, helps identify errors in longitudinal datasets containing multiple records for each child.
The modified z-scores can be used to construct other cut points for extreme values rather than relying on the BIV flag variables. For example, if you feel that using a BMI-for-age cut point of +8 SDs would exclude many values likely to be correct, then you could use mod_bmiz > 10 as the definition of a high BMI BIV. This could be recoded as:
if -4 <= mod_bmiz <= 10 then _bivbmi = 0;  *plausible (low cut point as in Table 2);
else if mod_bmiz > 10 then _bivbmi = 1;  *high BIV;
else if . < mod_bmiz < -4 then _bivbmi = -1;  *low BIV;
1. Kuczmarski RJ, Ogden CL, Guo SS, Grummer-Strawn LM, Flegal KM, Mei Z, Wei R, Curtin LR, Roche AF, Johnson CL. 2000 CDC Growth Charts for the United States: methods and development. Vital Health Stat 11 2002;11:1–190.
2. Flegal KM, Cole TJ. Construction of LMS parameters for the Centers for Disease Control and Prevention 2000 Growth Charts. Natl Health Stat Rep 2013;9:1–3.
3. Flegal KM, Wei R, Ogden CL, Freedman DS, Johnson CL, Curtin LR. Characterizing extreme values of body mass index-for-age by using the 2000 Centers for Disease Control and Prevention growth charts. Am J Clin Nutr 2009;90:1314–20.
4. Woo JG. Using body mass index Z-score among severely obese adolescents: a cautionary note. Int J Pediatr Obes 2009;4:405–10.
5. Freedman DS, Butte NF, Taveras EM, Lundeen EA, Blanck HM, Goodman AB, Ogden CL. BMI z-Scores are a poor indicator of adiposity among 2- to 19-year-olds with very high BMIs, NHANES 1999-2000 to 2013-2014. Obes Silver Spring Md 2017;25:739–46.
6. Freedman DS, Berenson GS. Tracking of BMI z Scores for Severe Obesity. Pediatrics 2017;140:e20171072.
7. Hales C, Freedman DS, Akinbami L, Wei R, Ogden CL. Using CDC growth charts to assess and monitor weight status in children and adolescents with extremely high BMI. Natl Cent Health Stat Vital Health Stat 2 2022;197.
8. Wei R, Ogden CL, Parsons VL, Freedman DS, Hales CM. A method for calculating BMI z-scores and percentiles above the 95th percentile of the CDC growth charts. Ann Hum Biol 2020;47:514–21.
9. Kelly AS, Barlow SE, Rao G, Inge TH, Hayman LL, Steinberger J, Urbina EM, Ewing LJ, Daniels SR, American Heart Association Atherosclerosis, Hypertension, and Obesity in the Young Committee of the Council on Cardiovascular Disease in the Young, Council on Nutrition, Physical Activity and Metabolism, and Council on Clinical Cardiology. Severe obesity in children and adolescents: identification, associated health risks, and treatment approaches: a scientific statement from the American Heart Association. Circulation 2013;128:1689–712.
10. Armstrong SC, Bolling CF, Michalsky MP, Reichard KW, Haemer MA, Muth ND, Rausch JC, Rogers VW, Heiss KF, Besner GE, et al. Pediatric Metabolic and Bariatric Surgery: Evidence, Barriers, and Best Practices. Pediatrics 2019;144:e20193223.
11. Daymont C, Ross ME, Localio AR, Fiks AG, Wasserman RC, Grundmeier RW. Automated identification of implausible values in growth data from pediatric electronic health records. J Am Med Inform Assoc 2017;24:1080–7.
12. World Health Organization (WHO). Physical status: the use and interpretation of anthropometry. Report of a WHO Expert Committee. World Health Organ Tech Rep Ser 1995;854:1–452.
13. Lawman HG, Ogden CL, Hassink S, Mallya G, Vander Veur S, Foster GD. Comparing methods for identifying biologically implausible values in height, weight, and Body Mass Index among youth. Am J Epidemiol 2015;182:359–65.
14. Freedman DS, Lawman HG, Skinner AC, McGuire LC, Allison DB, Ogden CL. Validity of the WHO cutoffs for biologically implausible values of weight, height, and BMI in children and adolescents in NHANES from 1999 through 2012. Am J Clin Nutr 2015;102:1000–6.
15. Freedman DS, Lawman HG, Pan L, Skinner AC, Allison DB, McGuire LC, Blanck HM. The prevalence and validity of high, biologically implausible values of weight, height, and BMI among 8.8 million children. Obes Silver Spring Md 2016;24:1132–9.