# A SAS Program for the WHO Growth Charts (ages 0 to <2 years)

The purpose of this SAS program is to calculate the percentiles and Z-scores (standard deviations) for a child’s sex and age from birth up to 2 years of age for BMI, weight, height, skinfold thicknesses (triceps and subscapular), arm circumference, and head circumference based on the WHO Growth Charts. Weight-for-height z-scores and percentiles are also calculated. Observations that contain extreme values (absolute z-scores above 5 or 6) are flagged as being biologically implausible. Although WHO provides several macros and a PC program for these calculations, this SAS program follows the same steps as does the SAS program for the CDC growth charts. Additional details about the ages for which the various z-scores and percentiles are calculated are given in Table 2 (below).

The SAS program, WHO-source-code.sas (files are below, in step #1), calculates these z-scores and percentiles based on reference values in WHOref_d.sas7bdat. This reference data set combines values from several WHO datasets. If you’re not using SAS, you can download WHOref_d.csv [CSV-160KB] and create a program based on who-source-code.sas [SAS-6KB] to do the necessary calculations.

## Instructions for SAS users

**Step 1**: Download the SAS program (who-source-code.sas [SAS-6KB]) and the reference data file (WHOref_d.sas7bdat). Do not alter these files, but move them to a directory (folder) that SAS can access. If you are using Chrome or Firefox, right click to save the who-source-code.sas file.

For the following example, the files have been saved in c:\sas\growth charts\who\data.

**Step 2**: Create a libname statement in your SAS program to point at the folder location of ‘WHOref_d.sas7bdat’. An example would be:

libname refdir 'c:\sas\growth charts\who\data';

Note the SAS code expects this name to be *refdir*; do not change this name.
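One way to confirm that the libname points at the right folder is to list the contents of the reference dataset before going further. This is only a sketch and assumes the example path used above.

```sas
/* Assumes the example folder used above; change the path to match your system. */
libname refdir 'c:\sas\growth charts\who\data';

/* Lists the variables in the reference dataset. An error here usually
   means the path is wrong or the file is not in that folder. */
proc contents data=refdir.WHOref_d;
run;
```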

**Step 3**: Set your existing dataset containing height, weight, sex, age and other variables into a temporary dataset, named *mydata*. Variables in your dataset should be renamed and coded as follows:

**Table 1:** Instructions for SAS users (step 3), guidance on renaming and coding variables in your dataset.

| Variable | Description |
|---|---|
| agedays | Child’s age in days; must be present. If this value is not an integer, the program rounds to the nearest whole number. If age is known only to the completed number of weeks (e.g., 5 weeks of age would represent any number of days between 35 and 41), multiply by 7 and consider adding 4 (median number of days in a week). If age is known only to the completed number of months, multiply by 365.25/12, and consider adding 15. |
| sex | Coded as 1 for boys and 2 for girls. |
| height | Recumbent length in cm. If standing height (rather than recumbent length) was recorded, add 0.7 cm to the values (see http://www.ncbi.nlm.nih.gov/pubmed/?term=16817681). |
| weight | Weight (kg) |
| bmi | BMI (weight (kg) / height (m)²). If your data doesn’t contain BMI, the program calculates it. If BMI is present in your data, the program will not overwrite it. |
| headcir | Head circumference (cm) |
| armcir | Arm circumference (cm) |
| tsf | Triceps skinfold thickness (mm) |
| ssf | Subscapular skinfold thickness (mm) |

Z-scores and percentiles for variables that are not in *mydata* will be coded as missing (.) in the output dataset (named *_whodata*). Sex (coded as 1 for boys and 2 for girls) and agedays must be in *mydata*. It’s unlikely that the SAS code will overwrite other variables in your dataset, but you should avoid having variable names that begin with an underscore, such as _bmi.
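The renaming and recoding described in Table 1 might look like the following data step. This is a minimal sketch: the input dataset `work.rawdata` and its variables (`age_months`, `gender`, `length_cm`, `weight_kg`, `hc_cm`) are hypothetical names chosen for illustration, not anything the WHO program defines.

```sas
/* Hypothetical input: work.rawdata with age_months, gender ('M'/'F'),
   length_cm, weight_kg, and hc_cm. Replace these with your own names. */
data mydata;
  set work.rawdata;

  /* Age known only in completed months: convert to days and add 15,
     as suggested in Table 1. */
  agedays = age_months * 365.25 / 12 + 15;

  /* Sex must be coded 1 = boys, 2 = girls. */
  if upcase(gender) = 'M' then sex = 1;
  else if upcase(gender) = 'F' then sex = 2;
  else sex = .;

  /* Copy measurements into the variable names the program expects. */
  height  = length_cm;   /* recumbent length, cm */
  weight  = weight_kg;   /* kg */
  headcir = hc_cm;       /* cm */
run;
```

Variables that you do not have (for example, skinfold thicknesses) can simply be left out; their z-scores and percentiles will be missing in the output.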

**Step 4**: Copy and paste the following line into your SAS program after the line (or lines) in step #3.

%include 'c:\sas\growth charts\who\data\WHO-source-code.sas'; run;

If necessary, change this statement to point at the folder containing the downloaded ‘WHO-source-code.sas’ file. This tells your SAS program to run the statements in ‘WHO-source-code.sas’.

**Step 5**: Submit the %include statement. This will create a dataset, named *_whodata*, which contains all of your original variables along with z-scores, percentiles, and flags for extreme values. The names and descriptions of these new variables in *_whodata* are in Table 2.

**Table 2:** Z-scores, percentiles, and extreme values (biologically implausible, BIV) in output dataset, *_whodata*

| Description | Percentile | Z-score | Flag for extreme values | Cutoff for low z-score (flag coded as -1) | Cutoff for high z-score (flag coded as +1) |
|---|---|---|---|---|---|
| Weight-for-age for children between 1 and 731 (inclusive) days of age | wapct | waz | _bivwt | < -6 | > 5 |
| Height-for-age for children between 1 and 731 days of age | hapct | haz | _bivht | < -6 | > 6 |
| Weight-for-height for children with heights between 45 and 110 cm | whpct | whz | _bivwh | < -5 | > 5 |
| BMI-for-age for children between 1 and 731 days of age. Note that for children under 2 y of age, weight-for-height, not BMI-for-age, is recommended. | bmipct | bmiz | _bivbmi | < -5 | > 5 |
| Head circumference-for-age for children between 0 and 731 days of age | headcpct | headcz | _bivhc | < -5 | > 5 |
| Arm circumference-for-age for children between 91 and 731 days of age | armcpct | armcz | _bivac | < -5 | > 5 |
| Subscapular skinfold thickness-for-age for children between 91 and 731 days of age | ssfpct | ssfz | _bivssf | < -5 | > 5 |
| Triceps skinfold thickness-for-age for children between 91 and 731 days of age | tsfpct | tsfz | _bivtsf | < -5 | > 5 |

**Step 6:** Examine the new dataset, *_whodata*, with PROC MEANS or some other procedure to verify that the z-scores and other variables have been created. If a variable in Table 1 was not in your original dataset (e.g., arm circumference), the output dataset will indicate that all values for the percentiles and z-scores of this variable are missing. If values for other variables are unexpectedly missing, make sure that you’ve renamed and recoded variables as indicated in Table 1 and that your SAS dataset is named *mydata*. The program should not modify your original values; it only adds new variables, which appear alongside your original variables in *_whodata*.
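As one possible way to carry out this check, the sketch below summarizes several of the z-scores listed in Table 2 and tabulates two of the BIV flags. The choice of variables is arbitrary; substitute whichever measures you supplied.

```sas
/* Summary statistics for some of the z-scores in Table 2;
   N and NMISS show how many were calculated and how many are missing. */
proc means data=_whodata n nmiss mean std min max;
  var waz haz whz bmiz headcz;
run;

/* Counts of the BIV flags (-1, 0, +1) for weight and height. */
proc freq data=_whodata;
  tables _bivwt _bivht / missing;
run;
```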

**Example SAS code corresponding to steps 2 to 6**. You can simply cut and paste these lines into a SAS program, but you’ll need to change the libname and %include statements to point at the folders containing the downloaded files.

libname refdir 'c:\sas\growth charts\who\data';

data mydata; set whatever-your-original-dataset-is-named; run;

%include 'c:\sas\growth charts\who\data\WHO-source-code.sas';

proc means data=_whodata; run;

**Additional Information**

Z-scores are calculated as

Z = [ ((value / M)**L) - 1 ] / (S * L),

in which ‘value’ is the child’s BMI, weight, height, etc. The L, M, and S values are in WHOref_d.sas7bdat and vary according to the child’s sex and age or according to the child’s sex and height. Percentiles are then calculated from the z-scores (for example, a z-score of 1.96 corresponds to the 97.5th percentile). For more information on the LMS method, see http://www.ncbi.nlm.nih.gov/pmc/articles/PMC27365/
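To make the formula concrete, here is a small worked sketch in a data step. The L, M, and S values are made up for illustration and are not taken from WHOref_d; the dataset name `lms_example` is likewise hypothetical. The branch for L = 0 follows the usual LMS convention (the formula's limit as L approaches 0) and is an assumption here, not something confirmed from WHO-source-code.sas.

```sas
/* Illustrative values only: L, M, and S below are made up for this
   sketch and are NOT taken from WHOref_d. */
data lms_example;
  value = 9.5;    /* e.g., a child's weight in kg */
  L = -0.2;
  M = 9.0;
  S = 0.11;

  /* LMS z-score; when L is 0 the formula reduces to log(value/M)/S. */
  if L ne 0 then z = (((value / M)**L) - 1) / (S * L);
  else z = log(value / M) / S;

  /* Percentile from the standard normal distribution. */
  pct = 100 * probnorm(z);
run;

proc print data=lms_example;
run;
```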

**Extreme or Biologically Implausible Values**

The SAS code also flags extreme values (**b**iologically **i**mplausible **v**alues, or BIV) according to the WHO criteria at https://www.who.int/toolkits/child-growth-standards/software. Each variable has a BIV flag that is coded as -1 (an extremely low z-score), +1 (an extremely high z-score), or 0 (the z-score is between the low and high cut-points). These BIV flags, along with other variables that are in the output dataset, *_whodata*, are shown in Table 2.

The z-scores in the output data set, *_whodata*, can also be used to construct other cut-points for extreme (or biologically implausible) values. For example, if the distribution of weight in your data is strongly skewed to the right, you might use bmiz > 7 (rather than bmiz > 5) as the cut-point for extremely high BMI-for-age. This could be recoded as:

if -5 <= bmiz <= 7 then _bivbmi=0; *plausible;

else if bmiz > 7 then _bivbmi=1; *high BIV;

else if . < bmiz < -5 then _bivbmi= -1; *low BIV;

There are also two overall indicators of extreme values in the output dataset: _bivlow and _bivhigh. These variables indicate whether any measurement is extremely high (_bivhigh=1) or extremely low (_bivlow=1). If a child does not have an extreme value for any measurement, both variables are coded as 0.
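The overall indicators can be tabulated, or used to set aside records with extreme values, along the lines of the sketch below. The dataset name `whodata_clean` is hypothetical, and whether to exclude such records at all is an analytic decision that this example does not settle.

```sas
/* Cross-tabulate the two overall indicators. */
proc freq data=_whodata;
  tables _bivlow * _bivhigh / missing;
run;

/* Optional: keep only children with no extreme values.
   'whodata_clean' is a hypothetical name chosen for this sketch. */
data whodata_clean;
  set _whodata;
  if _bivlow = 0 and _bivhigh = 0;
run;
```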
