2003 Imputed Family Income/Personal Earnings Files
The missing data on family income and personal earnings in the 2003 NHIS were imputed using multiple-imputation methodology. Five ASCII data sets containing imputed values for the 2003 survey year are included in the compressed data file (INCMIMP.EXE), which can be downloaded via the Datasets link below. For analyses involving other variables in addition to family income or personal earnings, each set of imputed values can be merged with other data from the 2003 NHIS to create a single completed data set, giving five completed data sets in all.
Multiple imputation is a technique that allows analysts to incorporate the extra variability due to imputation into their analyses. This is accomplished by analyzing EACH OF THE FIVE completed data sets separately, using methods and software that are appropriate for survey data, and then combining the estimates and standard errors using the combining rules described in Section 2.2 and Appendix A of the document available via the Technical Documentation link below. The extra variability due to imputation CANNOT be incorporated by simply analyzing a SINGLE completed data set as if the imputed values were true values. Moreover, analysts SHOULD NOT create a single completed data set using the AVERAGE of the five sets of imputed values.
Examples of correct data analyses using SAS-callable SUDAAN and SAS-callable IVEware are provided in Section 4 of the document available via the Technical Documentation link below; the document also provides information on the procedures used to create the imputations. The Dataset Documentation link below opens a document containing both the file layout description and the frequency counts (on the last page) of the variables in the data sets containing imputed values for the 2003 survey year. Users interested in data for several years should note that, to date, multiple imputation has been carried out for the 1997-2003 NHIS, and that the file layout description is identical for those years.
Users are also encouraged to check the NHIS website for updates and to subscribe to the NHIS Listserv to receive notices of any corrections/updates.
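The combining step described above can be illustrated with a minimal Python sketch. It assumes that the combining rules in Section 2.2 and Appendix A of the Technical Documentation are the standard multiple-imputation rules (commonly called Rubin's rules); the Technical Documentation remains the authoritative source for the exact formulas, including the degrees of freedom, which this sketch omits. The function name combine_estimates is hypothetical, and the input numbers are made up for illustration; they are not NHIS results. In practice, each estimate and standard error would come from analyzing one completed data set with survey-appropriate software.

```python
from math import sqrt
from statistics import mean, variance


def combine_estimates(estimates, std_errors):
    """Combine per-data-set estimates and standard errors (hypothetical helper).

    Assumes the standard multiple-imputation combining rules:
      combined estimate = mean of the m estimates
      within variance   = mean of the m squared standard errors
      between variance  = sample variance of the m estimates (divisor m - 1)
      total variance    = within + (1 + 1/m) * between
    Returns (combined estimate, combined standard error, within, between).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists with at least two entries each")

    combined = mean(estimates)
    within = mean([se ** 2 for se in std_errors])
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return combined, sqrt(total), within, between


# Illustrative values only: one estimate and one standard error
# from each of the five completed data sets.
example_estimates = [10.0, 12.0, 11.0, 13.0, 9.0]
example_std_errors = [1.0, 1.0, 1.0, 1.0, 1.0]

estimate, std_error, within_var, between_var = combine_estimates(
    example_estimates, example_std_errors
)
print(f"combined estimate:       {estimate:.3f}")
print(f"combined standard error: {std_error:.3f}")
print(f"within / between:        {within_var:.3f} / {between_var:.3f}")
```

The between-imputation term is what a single completed data set, or an average of the five sets of imputed values, cannot supply: with only one data set there is no variation across imputations to measure, so the standard error would reflect the within-imputation variance alone and would be too small.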
UPDATE: In October 2005, the 2003 NHIS Imputed Family Income/Personal Earnings files were revised to correct values of the variable RAT_CATI (ratio of family income to poverty threshold group) for 1,032 persons. The corrections affect persons in families consisting of one adult and three children, regardless of whether income was imputed. For persons in these families, the calculation of the ratio of family income to poverty threshold has been revised to use the correct poverty threshold value. The frequency of RAT_CATI was also updated in the Dataset Documentation. Only data year 2003 was affected by this update.
- Technical Documentation: Methods and Examples [PDF - 813 KB]
- Datasets [EXE - 2 MB]
- Dataset Documentation [PDF - 133 KB]