There are two series of commands you can use to analyze NHANES data in Stata: SVY commands and standard commands.

SVY commands are a **series of commands specifically designed to analyze complex survey designs
like NHANES**.
To calculate means and standard errors, use the Stata survey (*svy*) commands,
because they account for the complex survey design of NHANES data
when determining variance estimates. These commands can also be used for simple random samples.

Whenever you want to use SVY commands, you need to set up
Stata by defining the survey design variables using the *svyset* command. This command has the general structure:

svyset [w= weight], psu(psu variable) strata(strata variable)

Here is the command, using the 4-year weight for data collected in the MEC (mobile examination center), followed by its output:

svyset [w= wtmec4yr], psu( sdmvpsu) strata(sdmvstra)

(sampling weights assumed)

      pweight: wtmec4yr
          VCE: linearized
  Single unit: missing
     Strata 1: sdmvstra
         SU 1: sdmvpsu
        FPC 1: <zero>
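For comparison, Stata 9 and later document a form of *svyset* that names the PSU variable first instead of passing it in a psu() option. The sketch below assumes the same NHANES design variables as above and a Stata release that supports this form; it is meant as an illustration, not a replacement for the command shown in this tutorial.

```stata
* Equivalent declaration in the newer documented form (assumes Stata 9 or later)
svyset sdmvpsu [pweight = wtmec4yr], strata(sdmvstra)

* Typing svyset with no arguments redisplays the current design settings
svyset

* Summarize the number of strata and PSUs as a quick check of the declaration
svydescribe
```

Redisplaying the settings with a bare *svyset* is a simple way to confirm which weight and design variables will be applied before running an analysis.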

Once you do this, Stata remembers these variables and
applies them to every subsequent SVY command. **If you save the dataset, Stata
will remember these variables and apply them automatically when you reopen the
dataset.**

You can change these variables any time you want by typing a new SVYSET command.

Standard commands are regular Stata commands
that can incorporate sampling weights. For example, if standard errors are not
needed, you can simply use a regular Stata command with the weight variable (e.g., *mean* with the
*weight* variable) to calculate means.

You only need to **use these commands when there is no corresponding SVY command**. When you use these
commands, keep in mind that:

- Not all standard commands will take weights.
**With weights, these analyses will generate accurate point estimates.**
- Because standard commands
do not use the design variables (i.e., strata, psu),
**they will NOT generate accurate standard errors.**
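The difference between the two approaches can be sketched as follows. The variable bmxbmi (body mass index) is used here only as an illustrative analysis variable, and the sketch assumes that *svyset* has already been run as shown earlier and that your Stata release supports the `svy:` prefix.

```stata
* Standard command: the weight is applied, but strata and PSUs are ignored.
* The mean is a valid point estimate; the standard error is not design-based.
mean bmxbmi [pweight = wtmec4yr]

* SVY command: uses the weight, strata, and PSUs declared with svyset,
* so the standard error reflects the complex survey design.
svy: mean bmxbmi
```

Both commands should report the same mean; only the standard errors are expected to differ.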

NHANES data files are very big; you will encounter memory problems unless you change some of Stata's default settings. If you don't, you'll be plagued by messages like:

. use "/WoloHD/Teaching/CECS/ECS 122 2005/Classes/Week 6/lab6/lab6.youth.adult.lab.final.dta"

no room to add more observations
    An attempt was made to increase the number of observations beyond what is
    currently possible. You have the following alternatives:

    1. Store your variables more efficiently; see help compress. (Think of
       Stata's data area as the area of a rectangle; Stata can trade off width
       and length.)

    2. Drop some variables or observations; see help drop.

    3. Increase the amount of memory allocated to the data area using the set
       memory command; see help memory.

The solution is simple: tell Stata how much memory to set aside for data. In practice, the limit is the amount of memory your operating system can provide, including virtual memory on your hard drive.

set memory 1g

Here 1g means 1 gigabyte. You could try a smaller value, such as 100m (m for megabyte), or a larger one; don't be scared to experiment. If you want to set the memory "permanently" (that is, until you manually reset it), type:

set memory 1g, permanently
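A minimal sketch of the full sequence is shown below. It assumes Stata 11 or earlier, where *set memory* can only be issued when no data are in memory; from Stata 12 onward, memory is managed automatically and this step is not needed. The file name nhanes_demo.dta is hypothetical.

```stata
* Sketch for Stata 11 and earlier; nhanes_demo.dta is a hypothetical file name
* Empty the data area first, because set memory requires that no data be loaded
clear
* Reserve 1 gigabyte for data
set memory 1g
* Load the dataset
use "nhanes_demo.dta"
* Store each variable in the smallest type that loses no information
compress
* Report how memory is currently being used
memory
```

Running *compress* after loading follows the first alternative suggested in the error message, and it often reduces the amount of memory the file needs.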

WARNING

Do not drop observations from the dataset. This may affect variance estimation.
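One way to analyze a subgroup without dropping observations is the subpop() option of the `svy` prefix, which keeps every record in the variance calculation. The sketch below assumes that *svyset* has already been run; ridageyr (age in years) and bmxbmi (body mass index) are used only as illustrative variables, and the cutoff of 20 years is an arbitrary example.

```stata
* Flag the subgroup of interest instead of dropping everyone else
* (adult is left missing when age is missing)
generate byte adult = (ridageyr >= 20) if !missing(ridageyr)

* Estimate the mean for adults only, keeping all observations
* available for variance estimation
svy, subpop(adult): mean bmxbmi
```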

Stata is case sensitive, so if your data dictionary uses all capital letters for variable names, you will always need to type them in capitals, and vice versa. The only requirement is that you use the NHANES variable names; when you write the data dictionary, it is your choice whether to use all capitals or all lowercase letters. If you select variables by clicking on them in the Variables window, you don't need to worry about this.

Stata treats missing numeric values (".") as larger than any nonmissing number. So, unlike SAS Survey Procedures or SUDAAN, which place missing values at the bottom of the range, Stata places them at the top of the range.
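A practical consequence is that a "greater than" comparison is true for missing values. The sketch below uses bmxbmi (body mass index) and a cutoff of 30 purely as an illustration; the variable names obese_wrong and obese are hypothetical.

```stata
* Problem: missing BMI values satisfy bmxbmi >= 30, so they are coded as 1
generate byte obese_wrong = (bmxbmi >= 30)

* Safer: the indicator stays missing whenever BMI is missing
generate byte obese = (bmxbmi >= 30) if !missing(bmxbmi)

* Compare the two versions, showing missing values in the table
tabulate obese_wrong obese, missing
```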