Stata Non-Survey Commands for Descriptive Statistics

Statements Explanation

use "C:\Stata\tutorial\analysis_data.dta", clear

Use the use command to load the Stata-format dataset. Use the clear option to replace any data in memory.

by riagendr age, sort : summarize lbxtc [aweight = wtmec4yr] if (ridageyr >=20 & ridageyr <.) & ridstatr==2, detail

Use the by prefix with its sort option to sort the data and repeat the command for each combination of gender (riagendr) and age (age). Use the summarize command to generate univariate summary statistics (number of observations, sum of weights, mean, standard deviation) for the total cholesterol variable (lbxtc), restricted to those who are 20 years and older and have been both interviewed and examined (ridstatr==2). The detail option adds percentiles, variance, skewness, and kurtosis to the output. Use the [aweight=] weight specification to apply the NHANES sampling weights and obtain survey-weighted estimates. In this example, the MEC weight for four years of data (wtmec4yr) is used. Note that the weights here are NOT aweights as Stata normally defines them, that is, weights inversely proportional to the variance of an observation; aweight is specified because summarize does not accept pweights.
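To see what the weight specification does to the point estimate, the following minimal sketch uses a made-up three-row dataset (the variables chol and wt and their values are hypothetical, not NHANES data). It compares the mean reported by summarize with the weighted mean computed by hand as sum(wt*chol)/sum(wt).

```stata
* Hypothetical toy data: chol and wt are illustrative names, not NHANES variables
clear
input chol wt
200 1000
180 3000
240 2000
end

* Weighted mean as reported by summarize
quietly summarize chol [aweight = wt]
display "summarize mean: " r(mean)

* Same quantity by hand: sum(wt*chol) / sum(wt)
generate double wchol = wt * chol
quietly summarize wchol
scalar num = r(sum)
quietly summarize wt
scalar den = r(sum)
display "manual mean:    " num / den
```

Both lines should display about 203.33 (1,220,000 / 6,000), which is the sense in which aweight yields the weighted mean here.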

histogram lbxtc if (ridageyr >=20 & ridageyr <.) & ridstatr==2, by(riagendr age) normal

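Use the histogram command to draw a histogram of the total cholesterol variable (lbxtc) for the selected subpopulation (ages 20 and over who have been both interviewed and examined). Use the by() option to draw a separate histogram for each combination of gender (riagendr) and age (age). Use the normal option to overlay each histogram with a normal density curve.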
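The condition ridageyr <. in the if qualifier matters because Stata treats a numeric missing value as larger than any number, so ridageyr >=20 on its own would also select observations with a missing age. A minimal sketch with made-up data (agetest is a hypothetical variable name) illustrates the difference:

```stata
* Hypothetical toy data: one value under 20, one over 20, one missing
clear
input agetest
18
25
.
end

* Counts 2: the missing value satisfies >= 20
count if agetest >= 20

* Counts 1: the extra condition excludes the missing value
count if agetest >= 20 & agetest < .
```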

graph box lbxtc [pweight = wtmec4yr] if (ridageyr >=20 & ridageyr <.) & ridstatr==2, medtype(line) over(riagendr) over(age)

Use the graph box command to draw box plots of the total cholesterol data by gender and age for those who are 20 years and older and have been both interviewed and examined (ridstatr==2). Use the [pweight=] weight specification to account for the unequal probability of sampling and non-response. In this example, the MEC weight for four years of data (wtmec4yr) is used. Use the medtype option to indicate how the median is drawn within the box (here, as a line).
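For readers without the tutorial dataset at hand, the same graph box syntax can be tried on the auto dataset that ships with Stata. This is only a sketch: using the weight variable as a pweight is an arbitrary choice made to exercise the syntax and has no sampling interpretation.

```stata
* auto.dta ships with Stata; weight stands in as a pweight for illustration only
sysuse auto, clear
graph box mpg [pweight = weight] if price < ., medtype(line) over(foreign)
```

Replacing medtype(line) with medtype(cline) or medtype(marker) changes how the median is drawn.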