# Review Data & Create New Variables

### Purpose:

Before you can use the variables in an NHANES environmental chemical dataset, you may need to review the data and create new variables. The data may need to be adjusted if the dataset has missing values or outliers. Depending on the purpose of your analysis, you may also need to create new variables (e.g., a categorical variable based on the limit of detection).

### Task 1: Identify, Recode, and Evaluate Missing Data

Missing values may distort your analysis results. You must evaluate the extent of missing data in your dataset to determine whether the data are usable without additional re-weighting for item non-response.
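As a minimal sketch of the evaluation step, the Python snippet below counts missing values for a single variable and reports the percentage missing. The variable name `blood_lead` and its values are made up for illustration, and treating `None` and `NaN` as the missing codes is an assumption; a real NHANES file may use other codes that need recoding first.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_summary(values):
    """Return (n_missing, n_total, percent_missing) for one variable."""
    n_total = len(values)
    n_missing = sum(1 for v in values if is_missing(v))
    percent = 100.0 * n_missing / n_total if n_total else 0.0
    return n_missing, n_total, percent


# Hypothetical measurements; two of the ten are missing.
blood_lead = [1.2, None, 0.8, float("nan"), 2.5, 1.1, 0.9, 3.4, 1.7, 1.0]

n_missing, n_total, pct = missing_summary(blood_lead)
print(f"{n_missing} of {n_total} values missing ({pct:.1f}%)")
```

The percentage is what you would compare against whatever threshold your analysis plan sets for deciding whether re-weighting is needed.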

### Task 2: Check Distributions and Describe the Impact of Influential Outliers

Before you analyze environmental chemical data, it is very important that you **check the distribution and normality of the data,** identify outliers, and determine how outliers might affect your analysis.
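One simple way to screen for candidate outliers is sketched below in Python, using the interquartile range (Tukey fence) rule. This rule is a common generic screen chosen here for illustration, not a method prescribed by this tutorial; the data values and the multiplier `k = 1.5` are assumptions, and the check ignores survey sample weights.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside the Tukey fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical concentrations with one unusually high value.
concentrations = [0.5, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.4, 9.8]

outliers = iqr_outliers(concentrations)
print("Candidate outliers:", outliers)
```

A flagged value is only a candidate: whether it actually influences your estimates depends on the analysis, so compare results with and without it before deciding how to handle it.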

### Task 3: Check for Data Symmetry

Many statistical procedures are based on the assumption that data are normally distributed and, therefore, symmetrically distributed. However, the distributions of environmental chemical concentrations in blood or urine are often skewed.
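The effect of skewness, and of a log transformation as one common remedy, can be illustrated with a short Python sketch. The moment-based skewness coefficient and the natural log transformation are choices made for this example, and the concentration values are invented so that the log-transformed data come out symmetric.

```python
import math
import statistics


def skewness(values):
    """Moment-based skewness: about 0 for symmetric data, positive for a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


# Hypothetical right-skewed concentrations (each value doubles the previous one).
raw = [1, 2, 4, 8, 16, 32, 64]
logged = [math.log(v) for v in raw]  # natural log transformation

print(f"Skewness before log transform: {skewness(raw):.3f}")
print(f"Skewness after log transform:  {skewness(logged):.3f}")
```

With real data the transformed values will rarely be perfectly symmetric, so it is worth rechecking the distribution after any transformation.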

### Task 4: Create New Variables

Recoding is an important step for preparing an analytic dataset. You may want to recode variables or **create new variables** that fit your analytic needs.
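As an illustration of creating a new variable, the Python sketch below derives a two-level detection category from a measured concentration. The record layout, the field names `conc` and `detect_cat`, and the value of `ASSUMED_LOD` are all hypothetical; the actual limit of detection depends on the chemical and the survey cycle.

```python
ASSUMED_LOD = 0.25  # hypothetical limit of detection, for illustration only


def detection_category(concentration, lod):
    """Classify one measurement relative to the limit of detection (LOD)."""
    if concentration is None:
        return None  # keep missing values missing rather than guessing a category
    return "below LOD" if concentration < lod else "at or above LOD"


# Hypothetical participant records.
records = [
    {"id": 1, "conc": 0.10},
    {"id": 2, "conc": 0.25},
    {"id": 3, "conc": 1.80},
    {"id": 4, "conc": None},
]

# Add the new categorical variable to each record.
for record in records:
    record["detect_cat"] = detection_category(record["conc"], ASSUMED_LOD)

for record in records:
    print(record["id"], record["detect_cat"])
```

Keeping the original concentration alongside the new category makes it easy to verify the recode with a cross-tabulation afterwards.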
