## Key Concepts About Developing a Linear Regression in NHANES Using SUDAAN, SAS Survey Procedures, and Stata

### Interpretation of Coefficients

For continuous independent variables, the beta coefficient (b) indicates the change in the dependent variable per unit change in the independent variable, controlling for the confounding effects of the other independent variables in the model. A discrete independent variable, Xi, can assume 2 or more distinct values corresponding to the number of subgroups in a given category. For example, in the gender category there are 2 subgroups, men (Xi = 1) and women (Xi = 2). One subgroup is designated (usually arbitrarily) as the reference group. The beta coefficient for a discrete variable indicates the difference in the dependent variable between one value of Xi and the reference group (e.g., the difference between women and the reference group, men), when all other independent variables in the model are held constant. A positive value for the beta coefficient indicates a larger value of the dependent variable for the subgroup (women) than for the reference group (men), whereas a negative value indicates a smaller value.

#### Interpretation of Coefficients Summary Table

| Independent variable type | Examples | What does the b coefficient mean in simple linear regression? | What does the b coefficient mean in multiple linear regression? |
|---|---|---|---|
| Continuous | height, weight, LDL | The change in the dependent variable per unit change in the independent variable. | The change in the dependent variable per unit change in the independent variable after controlling for the confounding effects of the covariates in the model. |
| Categorical (also known as "discrete") | sex (2 subgroups, men (sex = 1) and women (sex = 2), where one is designated as the reference group (men, in this example)) | The difference in the dependent variable for one value of the categorical variable (e.g., the difference between women and the reference group, men). | The difference in the dependent variable for one value of the categorical variable (e.g., between women and the reference group, men), after controlling for the confounding effects of the covariates in the model. |

The SUDAAN (proc regress), SAS Survey (proc surveyreg), and Stata (svy: regress) procedures produce b coefficients, standard errors for these coefficients, confidence intervals, a t-statistic for the null hypothesis (i.e., b = 0), and a p-value for the t-statistic (i.e., the probability, if the null hypothesis is true, of obtaining a t-statistic at least as extreme as the one observed).
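To make the relationship between these quantities concrete, the following is a minimal sketch of the arithmetic, not a description of how any of the three packages computes it. It assumes a two-sided test, a hypothetical coefficient and standard error, and a degrees-of-freedom value supplied by the analyst. The helper names `t_density` and `two_sided_p` are illustrative, and the p-value is approximated by numerical integration of the Student's t density using only the standard library.

```python
import math

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) using Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = 2 * (h / 3) * total  # P(-|t_stat| <= T <= |t_stat|)
    return 1 - central_area

# Hypothetical coefficient, standard error, and degrees of freedom.
b, se, df = 4.2, 1.5, 15
t_stat = b / se                    # test statistic for the null hypothesis b = 0
p_value = two_sided_p(t_stat, df)  # small values are evidence against b = 0
```

In practice the software reports these values directly; the sketch only shows that the t-statistic is the coefficient divided by its standard error and that the p-value is a tail area of the t distribution.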

### ANOVA Type Statistical Tests

In addition to the t-test, SUDAAN produces other test statistics with their corresponding p-values. These include the Wald F, Satterthwaite adjusted F, and Satterthwaite adjusted chi-square statistics. SAS Survey procedures produce only the Wald F test with its corresponding p-value.

At the present time, the NHANES Analytic Guidelines do not make a recommendation about which statistic is the "best." Users are encouraged to check the NHANES website frequently for updated analytic guidelines. In the meantime, it is good practice to examine all three statistics and the corresponding p-values for consistency. Users are also encouraged to compare the nominal degrees of freedom (i.e., the number of PSUs minus the number of strata containing observations) to the adjusted Satterthwaite degrees of freedom. Nominal degrees of freedom that are much larger than the adjusted Satterthwaite degrees of freedom may indicate model instability.
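The nominal degrees of freedom defined above can be counted directly from the design variables of the observations in the analysis. The sketch below assumes each observation carries a (stratum, PSU) pair, such as the NHANES variables SDMVSTRA and SDMVPSU; the function name `nominal_df` and the sample values are hypothetical.

```python
def nominal_df(records):
    """Number of PSUs minus number of strata that contain observations.

    records: iterable of (stratum, psu) pairs, one per observation
    included in the analysis.
    """
    # PSU codes repeat across strata, so a PSU is identified by the pair.
    psus = set(records)
    strata = {stratum for stratum, _ in psus}
    return len(psus) - len(strata)

# Hypothetical design: 3 strata with 2 PSUs each (repeated pairs are
# multiple observations from the same PSU).
observations = [(1, 1), (1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]
df_nominal = nominal_df(observations)  # 6 PSUs - 3 strata
```

Because only strata with observations are counted, restricting the analysis to a subgroup can lower this value, which is one reason to compare it with the adjusted Satterthwaite degrees of freedom reported by the software.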

Generally speaking, the Satterthwaite adjusted F is the most conservative of the three statistics (i.e., it rejects the null hypothesis less often than do the other two statistics).