Calculating Chi-Square Using the Stata Command svy:tabulate

Statements Explanation
use "C:\NHANES\DATA\analysis_data.dta", clear

Use the use command to load the Stata-format dataset. Use the clear option to replace any data in memory.

svyset [w=wtsaf4yr], psu(sdmvpsu) strata(sdmvstra) vce(linearized)


Use the svyset command to declare the survey design for the dataset. Use the psu( ) option to specify the PSU variable (sdmvpsu). Use the [w=] weight specification to account for the unequal probability of sampling and non-response. In this example, the MEC fasting weight for four years of data (wtsaf4yr) is used because this analysis uses four years of data and serum triglyceride measurements obtained from persons who fasted nine hours and were examined in the morning at the MEC. Use the strata( ) option to specify the stratum identifier (sdmvstra). Use the vce( ) option to specify the variance estimation method; linearized requests Taylor linearization, which is the default method if the option is not specified.
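As a quick check after declaring the design, the stored settings can be displayed and the strata and PSUs summarized. The sketch below assumes Stata 9 or later and the variable names used in this example. Its first line restates the same design with the PSU variable listed first; writing the weight as pweight is an assumption about the intended weight type.

```stata
* Same design as above, written with the PSU variable listed first
* (pweight is assumed to be the intended weight type)
svyset sdmvpsu [pweight=wtsaf4yr], strata(sdmvstra) vce(linearized)

* Display the survey settings currently stored with the dataset
svyset

* Summarize the strata and the number of PSUs within each stratum
svydescribe
```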

svy:tab riagendr cuff, subpop(if ridageyr >=20 & ridageyr<.) column row obs percent pearson null wald


Use the svy:tabulate command to produce two-way tabulations for gender (riagendr) and blood pressure cuff size (cuff) with tests of independence for people age 20 years and older. (See Section 5.4 of Korn and Graubard, Analysis of Data from Health Surveys, pp. 207-211.)



Use the subpop( ) option to select a subpopulation for analysis, rather than selecting the study population in the Stata program while preparing the data file. Keeping all observations in the dataset lets Stata use the full survey design when it estimates variances. This example uses an if statement to define the subpopulation based on the value of the age variable (ridageyr); the condition ridageyr<. excludes missing ages. Another option is to create a dichotomous variable where the subpopulation of interest is assigned a value of 1, and everyone else is assigned a value of 0.
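The dichotomous-variable approach might look like the following minimal sketch. The indicator name adult20 is hypothetical, and the subpopulation is passed through the svy prefix form (svy, subpop( ):) described in the [SVY] manual rather than as a trailing option.

```stata
* Hypothetical indicator: 1 for persons age 20 and older, 0 for everyone else
* (a missing age evaluates to 0 because of the ridageyr < . condition)
generate byte adult20 = (ridageyr >= 20 & ridageyr < .)

* Pass the indicator to subpop() instead of an if expression
svy, subpop(adult20): tabulate riagendr cuff, column row obs percent pearson null wald
```

An indicator variable is convenient when the same subpopulation is reused across several commands, since the definition is written only once.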

Options for the tab command include:

  • column and row to display column and row proportions, shown as percentages when percent is also specified (if you specify neither, you will get cell proportions);
  • obs to list the number of observations in each cell;
  • count to list the weighted n in each cell; adding format(%11.0fc) displays the counts with commas rather than in scientific notation;
  • ci to give the confidence interval around each estimate, which can only be used with either row or column, not both; and
  • pearson, null, and wald to request the Pearson chi-square (with the Rao-Scott corrected F-statistic), null-based, and Wald test statistics.
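Two of the options not used in this example, count with format( ) and ci, could be tried as in the sketch below. It assumes the same survey design and subpopulation as above and uses the svy prefix form (svy, subpop( ):) from the [SVY] manual; these are illustrative commands, not part of the original program.

```stata
* Weighted counts in each cell, displayed with commas
svy, subpop(if ridageyr >= 20 & ridageyr < .): tabulate riagendr cuff, count format(%11.0fc)

* Row percentages with confidence intervals
* (ci is combined with row only, not with row and column together)
svy, subpop(if ridageyr >= 20 & ridageyr < .): tabulate riagendr cuff, row percent ci
```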

This example specifies the column, row, obs, percent, pearson, null, and wald options.

Please consult the Stata 9 SURVEY DATA [SVY] manual for explanations of the test statistics available for svy:tabulate.