## Task 1: How to Estimate Mean Food Intakes Using SUDAAN

This section describes how to use SUDAAN to estimate mean food intakes along with their standard errors.  Consumption of milk is used as an example.  As explained in the Key Concepts section, there are different ways to group foods for analysis, and this applies to “milk” intakes as well.  One way is to consider only fluid milk reported separately (not as part of a combination); another is to account for all milk and milk products (including milk, yogurt, and cheese), whether reported separately or as part of a combination or mixture.  The programs that follow use two examples: consumption of fluid milk not in combination, measured in grams, and consumption of all milk and milk products, measured in cup equivalents.

The following analyses are for children ages 6-11, and mean intakes are estimated among users of the food.  Such estimates answer the question: on average, what quantity is consumed in a given day by users of the food?  Analysts interested in per capita consumption (that is, including zeroes for non-consumers) would need to specify that missing values should be set to zero.  See the full program under Downloads for a note about this.
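As a rough illustration of the per capita adjustment, the DATA step below recodes missing fluid milk intakes to zero for cohort members. It is a minimal sketch, not the code from the full program: MILK0 and INCOH are the variables described in Step 2 below, CALCMILK_PC is a hypothetical dataset name introduced here, and the sketch assumes that a missing MILK0 value for a cohort member means that no fluid milk was reported.

```sas
/* Sketch only: CALCMILK_PC is a hypothetical dataset name.          */
data CALCMILK_PC;
  set CALCMILK;
  /* Assumption: a missing MILK0 for a cohort member means that no   */
  /* fluid milk was reported, so it becomes zero for per capita use. */
  if INCOH = 1 and missing(MILK0) then MILK0 = 0;
run;
```

The recoded dataset would then be sorted and analyzed in the same way as CALCMILK, so that non-consumers contribute zeroes to the mean.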

### Step 1: Sort Data

Before running any SUDAAN procedure, sort the data by strata and PSU, using the PROC SORT procedure.  In the sample code below, CALCMILK is the dataset that was previously created for this analysis with the appropriate variables of interest.

### Step 2: Compute Properly Weighted Estimated Means and Standard Errors

To compute properly weighted estimated means and standard errors, use the PROC DESCRIPT procedure in SUDAAN.  This procedure includes a required NEST statement that identifies the variables for strata and PSUs.

In the sample code below, note that the weight variable being used is for the dietary recall Day 1 subsample (WTDRD1).  The SUBGROUP statement indicates that the results will be reported by gender (RIAGENDR), which has two “levels” or categories (male and female).  MILK0 is a variable that was previously created to represent milk consumed outside of a combination.  D_TOTAL is a variable that was previously created to represent total milk group equivalents of intake (see full program in the Additional Resources module to see how these variables were created).  The SUBPOPN statement identifies the subset of people that will be included in the analysis; INCOH is a variable that has value 1 if the individual is “in the cohort” and zero otherwise.  Here, children ages 6 to 11 with complete and reliable recall data have INCOH=1.  Individuals in the cohort who did not report milk have missing values for MILK0.
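Before running the SUDAAN procedure, it can help to confirm how many cohort members have missing values for each analysis variable. The following is a minimal sketch using base SAS PROC MEANS; it is an unweighted check added here for illustration, not part of the original program, and its counts are not survey estimates.

```sas
/* Sketch only: unweighted counts of non-missing (N) and missing     */
/* (NMISS) values by gender among cohort members.                    */
proc means data=CALCMILK n nmiss;
  where INCOH = 1;
  class RIAGENDR;
  var MILK0 D_TOTAL;
run;
```

The N reported for MILK0 corresponds to the number of users who enter the users-only mean, while NMISS counts cohort members who did not report fluid milk.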

#### Sample Code

*-------------------------------------------------------------------------;
* Use the PROC SORT procedure to sort the data by strata and PSU.         ;
*                                                                         ;
* Use the PROC DESCRIPT procedure to estimate daily intake of milk as a   ;
* beverage and total milk and milk products.                              ;
*-------------------------------------------------------------------------;

proc sort data =CALCMILK;
by SDMVSTRA SDMVPSU;
run;

proc descript data =CALCMILK;
nest SDMVSTRA SDMVPSU;
weight WTDRD1;
subgroup RIAGENDR;
levels 2 ;
tables RIAGENDR;
var MILK0 D_TOTAL;
subpopn INCOH= 1 ;
rformat RIAGENDR GENDER. ;
rtitle "Estimated daily intake of fluid milk drunk by itself as a beverage
and of total milk and milk products" ;
rtitle2 "children age 6-11, WWEIA, NHANES 2003-2004 - using SUDAAN" ;
run ;

#### Output of Program

```
Estimated average daily intake of fluid milk drunk by itself as a beverage
and of total milk and milk products,
children age 6-11, WWEIA, NHANES 2003-2004 - using SUDAAN

Number of observations read    :   9034    Weighted count :286222757
Number of observations skipped :   1088
(WEIGHT variable nonpositive)
Observations in subpopulation  :    900    Weighted count: 23862559
Denominator degrees of freedom :     15

Variance Estimation Method: Taylor Series (WR)
For Subpopulation: INCOH = 1

----------------------------------------------------------------------------------
|                 |                  |                             |              |
| Variable        |                  | Gender - Adjudicated        |              |
|                 |                  | Total        | Male         | Female       |
----------------------------------------------------------------------------------
|                 |                  |              |              |              |
| Fluid milk (g)  | Sample Size      |          314 |          143 |          171 |
| consumed        | Weighted Size    |   9675209.67 |   4528044.87 |   5147164.80 |
| outside of a    | Total            | ************ | ************ | ************ |
| combination     | Lower 95% Limit  |              |              |              |
|                 |  Total           | ************ | 756809178.30 | 955938528.07 |
|                 | Upper 95% Limit  |              |              |              |
|                 |  Total           | ************ | ************ | ************ |
|                 | Mean             |       347.49 |       395.29 |       305.43 |
|                 | SE Mean          |        29.77 |        45.89 |        27.79 |
|                 | Lower 95% Limit  |              |              |              |
|                 |  Mean            |       284.03 |       297.47 |       246.21 |
|                 | Upper 95% Limit  |              |              |              |
|                 |  Mean            |       410.94 |       493.11 |       364.66 |
-----------------------------------------------------------------------------------
|                 |                  |              |              |              |
| Total number of | Sample Size      |          900 |          422 |          478 |
| milk group      | Weighted Size    |  23862558.64 |  12341904.79 |  11520653.85 |
| (milk, yogurt & | Total            |  56618372.34 |  32033593.30 |  24584779.04 |
| cheese) cup     | Lower 95% Limit  |              |              |              |
| equivalents     |  Total           |  43053858.96 |  23263333.29 |  17735097.44 |
|                 | Upper 95% Limit  |              |              |              |
|                 |  Total           |  70182885.73 |  40803853.31 |  31434460.65 |
|                 | Mean             |         2.37 |         2.60 |         2.13 |
|                 | SE Mean          |         0.13 |         0.16 |         0.14 |
|                 | Lower 95% Limit  |              |              |              |
|                 |  Mean            |         2.09 |         2.25 |         1.83 |
|                 | Upper 95% Limit  |              |              |              |
|                 |  Mean            |         2.65 |         2.94 |         2.44 |
-----------------------------------------------------------------------------------
```

Highlights from the output include:

• 9,034 observations (respondents) were read by the program; 1,088 additional observations were skipped because their sampling weight value was zero (because the recall was unreliable or the person was otherwise ineligible).
• 900 respondents were included in this analysis.  Of these, 314 reported fluid milk as a beverage: 143 boys and 171 girls.
• The Weighted Size is the sum of the weights for the observations used in this analysis, which is the denominator for computing the mean.
• The Total, which is the numerator for computing the mean, is the weighted sum of all fluid milk or total milk group equivalents reported.  Note that very large numbers are displayed as asterisks in SUDAAN by default.  It is possible, however, to display these numbers by using SUDAAN options to control the output format.  These values, as well as their respective confidence limits, are part of the default output and are not generally relevant to these types of analyses when used alone.
• The mean intake of fluid milk was 395 g for boys and 305 g for girls.  These are estimates of the population mean intake of fluid milk on a given day among 6-11 year old boys and girls who consumed it.  As noted in the Key Concepts section, these means also represent the mean usual intakes of fluid milk for these age-sex groups in the population.
• The mean number of total milk group cup equivalents was 2.60 for boys and 2.13 for girls.  These are estimates of the population mean intake of total milk cup equivalents on a given day among 6-11 year old boys and girls and also represent the mean usual intakes of total milk cup equivalents for these age-sex groups in the population.
• The Lower 95% Limit Mean and the Upper 95% Limit Mean are the bounds of the confidence intervals.  See “Module 16: Test Hypotheses” for more information on confidence intervals.
• RIAGENDR refers to “Gender—adjudicated.”  In this case, adjudicated refers to the fact that a final determination was made for this variable from data provided in multiple parts of the survey.  Sometimes, these data were conflicting, and a determination as to most likely gender was made.
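
The relationships among the Total, Weighted Size, Mean, SE Mean, and confidence limits can be checked by hand. The DATA step below is a sketch of that arithmetic using values copied from the output table: it divides the Total by the Weighted Size for the milk group, and forms limits for fluid milk as the mean plus or minus a t critical value times the standard error. Using the t distribution with the 15 denominator degrees of freedom is an assumption; it appears consistent with the reported limits, but this section does not state the formula.

```sas
/* Sketch only: values are copied by hand from the output table.     */
data _null_;
  /* Milk group, Total column: mean = Total / Weighted Size          */
  total     = 56618372.34;
  wtd_size  = 23862558.64;
  mean_grp  = total / wtd_size;

  /* Fluid milk, Total column: limits from displayed mean and SE     */
  mean_milk = 347.49;
  se_milk   = 29.77;
  df        = 15;                 /* denominator degrees of freedom  */
  t_crit    = tinv(0.975, df);    /* assumed two-sided 95% t value   */
  lower     = mean_milk - t_crit * se_milk;
  upper     = mean_milk + t_crit * se_milk;

  put mean_grp= 8.2 lower= 8.2 upper= 8.2;
run;
```

The values written to the log should be close to the Mean and limits shown in the table; small differences can arise because the displayed mean and standard error are rounded.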

IMPORTANT NOTE

The analysis above was conducted using only children ages 6-11 who were consumers of milk as a beverage.  If, however, all members of the selected age group (i.e., consumers and non-consumers) were included (total n = 900; 422 males and 478 females), the average amounts would be lower.  For males, the mean milk intake would be 145 g, and for females, it would be 136 g (see the full Milk program in the Additional Resources section for example code).  These means represent the per capita consumption.
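
The link between the users-only mean and the per capita mean can be sketched with the figures already in the output. Non-consumers add zero grams to the weighted total, so only the denominator changes: the users-only mean is scaled by the ratio of the weighted size of users to the weighted size of the whole cohort. The DATA step below illustrates this for boys. It is a hand calculation for checking purposes, not the method used in the full Milk program, and it gives only the point estimate, not the standard error.

```sas
/* Sketch only: values for boys copied from the output table.        */
data _null_;
  mean_users  = 395.29;        /* users-only mean, fluid milk (g)    */
  wtd_users   = 4528044.87;    /* Weighted Size, fluid milk users    */
  wtd_cohort  = 12341904.79;   /* Weighted Size, all boys in cohort  */
  /* Non-users contribute zero grams, so only the denominator grows  */
  mean_percap = mean_users * wtd_users / wtd_cohort;
  put mean_percap= 8.1;
run;
```

The result is approximately 145 g, in line with the per capita value given for boys.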