Editing and Processing the Injury and Poisoning Data
The redesigned NHIS, fielded since 1997, is conducted using computer-assisted personal interviewing (CAPI). The CAPI version of the NHIS questionnaire is administered using laptop computers that allow interviewers to enter responses directly into the computer during the interviews. The data are later reviewed by NHIS analysts, who perform valid code checks, create recoded variables, and, for the injury and poisoning data, impute injury/poisoning episode dates and determine whether an episode should be kept or removed from the file based on specified inclusion criteria.
NHIS analysts assigned to the Family Core Injury Section perform valid code checks on the data to ensure that all the responses given by the respondent match the choices that were available in the respective questions, and they check that the skip patterns for the questions were followed correctly.
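As a rough illustration of these two kinds of checks, the following Python sketch validates one record against a codebook and a single skip rule. The question names (Q_INJURED, Q_CAUSE), the allowed codes, and the skip rule are hypothetical placeholders, not actual NHIS items, and the sketch is not the procedure used by NHIS analysts.

```python
# Hypothetical codebook: question name -> set of allowed response codes.
# These names and codes are illustrative only; they are not NHIS variables.
VALID_CODES = {
    "Q_INJURED": {"1", "2", "7", "9"},
    "Q_CAUSE": {"01", "02", "03", "97", "99"},
}


def check_record(record):
    """Return a list of problems found in one record (a dict of answers).

    A value of None means the question was not answered (skipped).
    """
    problems = []

    # Valid code check: every answered question must use an allowed code.
    for question, answer in record.items():
        if answer is not None and answer not in VALID_CODES.get(question, set()):
            problems.append(f"{question}: invalid code {answer!r}")

    # Skip pattern check under an assumed rule:
    # Q_CAUSE is asked only when Q_INJURED == "1".
    injured = record.get("Q_INJURED")
    cause = record.get("Q_CAUSE")
    if injured == "1" and cause is None:
        problems.append("Q_CAUSE: missing but should have been asked")
    if injured != "1" and cause is not None:
        problems.append("Q_CAUSE: answered but should have been skipped")

    return problems
```

A record that passes both checks yields an empty list; each violation adds one message, so a reviewer can see which rule failed.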
During the editing process, variable recodes are created and added to the file. These recodes are typically variables that a user would find useful but that would be very complicated for the user to create, or that the user cannot create because they are derived from in-house variables that are not available to the user because of confidentiality issues. To view a list by data year of all the recoded variables associated with the injury and poisoning section, a description of the recoded variables, and information about the files in which the variables are located, see Table A.
Beginning in 2004 and continuing in the following years, imputation was implemented for injury/poisoning episodes that did not have a complete month, day, and year of occurrence. Imputation was done so that it would be possible to calculate a specific elapsed time in days between the date of the injury/poisoning episode and the date when the injury/poisoning questions were asked for all episodes in the Injury/Poisoning Episode file and the Verbatim Injury/Poisoning Episode file.
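The elapsed-time calculation that motivates imputation can be sketched in Python as follows. The function name and its parameters are hypothetical, and the sketch deliberately does not impute anything, because the imputation method is not described here; it only shows that the calculation is impossible when any date component is missing.

```python
from datetime import date
from typing import Optional


def elapsed_days(episode_year: Optional[int],
                 episode_month: Optional[int],
                 episode_day: Optional[int],
                 interview_date: date) -> Optional[int]:
    """Days between the episode date and the date the questions were asked.

    Returns None when the year, month, or day of the episode is missing,
    which is the situation that imputation is meant to resolve.
    """
    if None in (episode_year, episode_month, episode_day):
        return None
    episode_date = date(episode_year, episode_month, episode_day)
    return (interview_date - episode_date).days
```

For example, an episode on March 1, 2004 reported in an interview on March 31, 2004 gives 30 days, while an episode with an unknown day of the month gives no result until a day is imputed.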
Some injury and poisoning episodes are removed from the file each year based on specified inclusion criteria. For a detailed explanation of the inclusion criteria by year, see Table B.
Each injury and poisoning question has a variable associated with it that may or may not be included on an Injury Episode, Poisoning Episode, Verbatim Episode, Injury/Poisoning Episode, or Person public use file. To view a list by data year of all the questions, their respective public use variables, and information about the files in which the variables are located, see Table C.
Injury Episode File
In 1997-1999, injury episodes removed from the Injury Episode files included episodes with no information, episodes that did not occur within the reference period, duplicate episodes, and episodes consisting solely of health conditions that could not be classified according to nature-of-injury codes 800-959 or 980-999 of the Ninth Revision of the International Classification of Diseases (ICD-9-CM). In 1997, there were two instances where the respondent reported a person as having more than four injury episodes. In 1998, there was one instance where the respondent reported a person as having five injury episodes. In 1999, there were five instances where the respondent reported a person as having more than four injury episodes. Because the NHIS only collected detailed information on the four most recent injury episodes, information on additional injury episodes does not exist.
Poison Episode File
In 1997-1999, poisoning episodes removed from the Poison Episode files included episodes with no information, episodes that did not occur within the reference period, duplicate episodes, and episodes that involved illnesses such as poison ivy or food poisoning which are excluded from the survey definition of poisoning. In 1997 there were two instances where the respondent reported a person as having more than four poisoning episodes. Because the NHIS only collected detailed information on the four most recent poisoning episodes, information on these additional poisoning episodes does not exist.
After reviewing the edited 1997-1999 poisoning data, it was discovered that in 1997 there were 47 episodes, in 1998 there were 33 episodes, and in 1999 there were 28 episodes coded “06” (Something else) for question FIJ.340 (POITP) that did not meet the criteria for poisoning. Rather than remove these episodes, a new variable (POITPR2) was created for each year that contained the original categories in variable POITP and added additional categories that could be used to classify those episodes that may not have been poisonings. Those episodes for each year were recoded to values “07” (“Allergic/adverse reaction to medication or other substance”) or “08” (“Something else - NOT poisoning”). The latter value includes such things as spraying paint or hair spray into the eyes, chemotherapy, and sun poisoning.
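The relationship between POITP and POITPR2 can be sketched in Python as follows. The variable names POITP and POITPR2 and the category values come from the text; the function name and the review_decision argument, which stands in for the analyst's judgment about an episode, are hypothetical.

```python
from typing import Optional

SOMETHING_ELSE = "06"      # original POITP category
ALLERGIC_REACTION = "07"   # added in POITPR2
NOT_POISONING = "08"       # added in POITPR2


def recode_poitpr2(poitp: str, review_decision: Optional[str] = None) -> str:
    """Derive POITPR2 from POITP.

    review_decision is a hypothetical flag for the outcome of analyst review:
    None (episode meets the poisoning criteria), "allergic", or
    "not_poisoning". Only episodes coded "06" are eligible for recoding;
    every other value is carried over unchanged.
    """
    if poitp == SOMETHING_ELSE and review_decision == "allergic":
        return ALLERGIC_REACTION
    if poitp == SOMETHING_ELSE and review_decision == "not_poisoning":
        return NOT_POISONING
    return poitp
```

Because the original variable is left untouched and the recode lives in a new variable, a user can still recover the respondent's original answer from POITP.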
Injury Verbatim File and Verbatim Injury/Poisoning Episode File
The 1997-1999 Injury Verbatim files contain the edited narrative text descriptions, provided by the respondent, for the injury, including the body part injured, the kind of injury, and a description of how the injury happened. The 2000-2003 Verbatim Injury/Poisoning Episode files contain edited narrative text descriptions of the injury or poisoning provided by the respondent, including the body part injured or poisoned, the kind of injury or poisoning, and a description of how the injury or poisoning happened. The 2004-present Verbatim Injury/Poisoning Episode file contains edited narrative text descriptions of the injury or poisoning provided by the respondent and includes a description of how the injury or poisoning happened and “other specified” responses for the body part injured, the kind of injury, the place the person received medical care, the cause of the poisoning, and the activity at the time of the injury/poisoning. The pre-edited responses are “verbatim” only insofar as the interviewer could write them down and condense them to fit the field size. Editing was done only to protect the injured person’s confidentiality. Text descriptions used to replace any original noncompliant text are surrounded by angle brackets ( < > ). Grammatical and/or spelling errors were not corrected. Beginning in 1998, the codes “R” (“Refused”), “D” or “DK” (“Don’t know”), and “N” (“No more information”) have also been left in the file.
Injury/Poisoning Episode File
Beginning in 2000, the Injury Episode file and the Poisoning Episode file no longer existed as separate files. Instead, one file was created from the survey data that contained information about both injuries and poisonings. During the 2000-2003 data editing process, as in previous years, some injury and poisoning episodes were removed from the files. These included episodes with no information regarding cause, date and place of occurrence, etc., episodes that did not occur within the reference period, and duplicate episodes. In addition, injury episodes were removed if they consisted solely of health conditions that could not be classified according to nature-of-injury codes 800-999 of the ICD-9-CM. Also, as in previous years, respondents reported episodes that they considered poisonings (e.g., food poisoning and allergic reactions) but that are not considered poisonings based on the ICD-9-CM. These types of episodes are still included in the file but are now covered by question FIJ.195 (POITP) under categories “06” (food poisoning) and “07” (allergic reaction).
During the 2004-present data editing process, the NHIS staff continued to remove injury and poisoning episodes with no information regarding cause, date and place of occurrence, etc., and duplicate episodes. In addition, in 2004, episodes were removed if they consisted solely of health conditions that could not be classified according to nature-of-injury codes 800-909.2, 909.4, 909.9, 910-994.9, 995.5-995.59, and 995.80-995.85 of the ICD-9-CM. For the 2005-present data, episodes were removed if they consisted solely of health conditions that could not be classified according to nature-of-injury codes 800-909.2, 909.4, 909.9, 910-994.9, 995.5-995.59, and 995.80-995.85 of the ICD-9-CM and did not have at least one external cause of injury code of E800-E848, E850-E869.9, E880-E929.0, or E950-E999. As in previous years, respondents reported episodes that they considered poisonings (e.g., food poisoning and allergic reactions) but that are not considered poisonings based on the ICD-9-CM. These types of episodes were included in the 1997-2003 data files. Beginning in 2004 and continuing through to the present, episodes that are not considered poisonings based on ICD-9-CM are no longer included in the Injury/Poisoning Episode data files.
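The code ranges listed above can be expressed as a membership test. The Python sketch below is illustrative only: the function names are hypothetical, each range is treated as inclusive at exactly the end points written in the text, and codes are assumed to arrive as strings such as "823.0" or "E885". It implements the 2004 rule, under which an episode is kept when at least one of its conditions falls in a listed nature-of-injury range. For how the E-code test enters the 2005-present rule, see the inclusion criteria in Table B.

```python
import re
from decimal import Decimal

# Nature-of-injury ranges (ICD-9-CM), inclusive end points as listed in the text.
NATURE_RANGES = [
    (Decimal("800"), Decimal("909.2")),
    (Decimal("909.4"), Decimal("909.4")),
    (Decimal("909.9"), Decimal("909.9")),
    (Decimal("910"), Decimal("994.9")),
    (Decimal("995.5"), Decimal("995.59")),
    (Decimal("995.80"), Decimal("995.85")),
]

# External cause ranges, stored without the leading "E".
ECODE_RANGES = [
    (Decimal("800"), Decimal("848")),
    (Decimal("850"), Decimal("869.9")),
    (Decimal("880"), Decimal("929.0")),
    (Decimal("950"), Decimal("999")),
]

# Assumed string formats: "NNN", "NNN.N", or "NNN.NN" for diagnosis codes,
# and "ENNN" or "ENNN.N" for external cause codes.
_NATURE_RE = re.compile(r"\d{3}(\.\d{1,2})?")
_ECODE_RE = re.compile(r"E(\d{3}(\.\d)?)")


def _in_ranges(value, ranges):
    return any(low <= value <= high for low, high in ranges)


def is_nature_of_injury(code: str) -> bool:
    """True if a diagnosis code falls in one of the listed injury ranges."""
    if not _NATURE_RE.fullmatch(code):
        return False
    return _in_ranges(Decimal(code), NATURE_RANGES)


def is_external_cause(code: str) -> bool:
    """True if an E-code falls in one of the listed external cause ranges."""
    match = _ECODE_RE.fullmatch(code)
    if not match:
        return False
    return _in_ranges(Decimal(match.group(1)), ECODE_RANGES)


def keep_episode_2004(diagnosis_codes) -> bool:
    """2004 rule: keep the episode unless none of its conditions is in range."""
    return any(is_nature_of_injury(code) for code in diagnosis_codes)
```

Decimal is used instead of float so that boundary codes such as 909.2 and 995.59 compare exactly. Under these ranges a food poisoning code such as 005.9 is out of range, so an episode consisting solely of that condition would be dropped.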
Background for Data Processing
Prior to 1997, the NHIS was conducted using a paper questionnaire. Interviewers recorded the responses, including verbatims, the source(s) of the injury response within the questionnaire (e.g., INJ for the two-week injury question or RA for restricted activity days), and the names of conditions as they arose on the form (see below). Once the paper questionnaire was transmitted to the NCHS staff, much of the other information was keyed into mainframe computers. However, the condition information, including the original listing in C2 (see below) and the information entered on the condition page, first underwent a separate process of ICD coding by trained staff.
The core questions that generated injury condition records, including the question that explicitly asked about injuries in the previous two weeks, were atypical in that they did not appear as variables on the person files. They existed in the questionnaire as probes for disability days, doctor visits, and hospitalizations as well as injuries but were not keyed into the electronic record. For each condition recorded in the questionnaire booklet, a separate condition page (and record) was generated and an ICD (International Classification of Diseases) code, and in later years an E-code (external cause code), was assigned. The respondent was asked similar questions for each condition, with two exceptions. For conditions elicited by the list of generally chronic conditions asked in that household, an additional set of questions was asked, and, for those conditions resulting from an accident, a different set of questions was added. For injuries, it was also recorded whether that condition was the first (or only) injury resulting from an accident or whether it was one of multiple conditions from the same injury episode. This was used to discriminate between multiple injuries from a single episode and single injuries from separate episodes (see below). For the second and additional injuries from the same episode, the “8 Other” box would be checked.
As mentioned above, from 1963 through 1996, much of the information collected on the core condition page was not keyed into the record. This included the name of the condition and the part(s) of the body injured; these entries were used only to assign the correct ICD code. Note that the 7th revision of the ICD codes was used from 1963 to 1968, the 8th revision from 1969 through 1988, and the 9th revision from 1989 through 1996. The data that were keyed and remained in the record included the place and cause of the injury and whether or not medical services were used and when. In the process of entering the data, staff was trained to merge duplicate records or to create additional records when separate injuries were not entered by the interviewer.
Once the data were keyed, the only editing was to delete entries not corresponding to skip patterns and to assign a code to inappropriate or missing data. Responses falling outside the code categories for a question were given an “Unknown” code. It was not until later years in this period that this unknown category was differentiated into refused, don’t know, and not ascertained (missing but should have been answered) and, until 1997, this distinction appeared only in supplements.
Before 1969, in all sections of the data files, some variable values contained special characters such as “&”, “X”, and “-” which usually represented not reported/no entry or unknown but could represent legitimate response codes. These were virtually always codes created in the coding instructions and used by staff coders rather than printed labels for the categories on the questionnaire. Many of these characters were used to avoid adding an additional column to the keying because space was a major concern in the early days of the mainframe computers.
There is a major exception to the use of “X” mentioned above; the 7th revision ICD codes had no provision for many of the existing conditions that the NHIS terms “impairments”. These impairments include such conditions as blindness, missing limbs, etc., some of which were caused by injuries. In order to code these conditions, most of which were chronic and had occurred more than three months previously, the NHIS created a series of two-digit codes preceded by an “X” to describe the nature of the impairment. It also included a fourth decimal suffix that designated a cause; “0” meant that it was caused by an injury. Each year some of these injury-caused impairments were acute in terms of time but coded as chronic because they were “permanent”, such as limbs amputated in an accident. Even though later versions of the ICD codes did have outcome codes for these conditions, the NHIS continued to use its own X-codes through 1996.
Supplemental questionnaires that addressed accidents and injuries, both those causing actual injuries and those that dealt with preventing injuries, were handled very differently. Typically, all data were keyed and remained on the file with little or no editing other than “enforcing” skip patterns and assigning “unknown” codes.