2006-2010 NSFG: Public Use Data Files, Codebooks, and Documentation
Codebooks and Documentation
- Codebooks:
- Webdoc, the NSFG’s interactive online codebook, was deactivated as of 12/31/20. Public-use file indexes (Appendix 1a, 1b, and 1c linked below) can be searched to find variable names or identify relevant variables based on key words in variable labels. The section letters noted in the file indexes correspond to the sections of the codebook, so the file indexes indicate which codebook PDF will contain your variables of interest.
- Female Respondent File Codebook
- Female Pregnancy (Interval) File Codebook
- Male Respondent File Codebook
- User’s Guide
- Main Text [PDF – 595 KB]
- Appendix 1: File Indexes for 2006–2010 NSFG
- Appendix 2: SAS and STATA Syntax Guidelines for Common File Manipulations [PDF – 77 KB]
- Appendix 3: Recode Specifications for 2006–2010 NSFG
- Appendix 4: Recode “Cross-walk” Grids
- Appendix 5: Summary of NSFG Questionnaire Changes – Years 1, 2, and 3 of NSFG 2006-2010 [PDF – 178 KB]
- Appendix 6: Frequently Asked Questions about the NSFG [PDF – 83 KB]
- Appendix 7: List of Restricted Use Variables Available through the RDC
- Notes for Users:
Variance Estimation Examples
Variance estimation examples that can be modified to analyze the 2006-2010 NSFG data are provided on the Cycle 6 (2002) webpage. Those examples are intended to cover many of the common types of variance estimation. Note that the stratum, cluster, and weight variables should be modified to reflect the appropriate 2006-2010 variable names. (See tables below.) See also the 2006-2010 NSFG User’s Guide, starting on page 13 for more information on sample weights and variance estimation. If you have any questions about any of these examples, please e-mail the NSFG staff at nsfg@cdc.gov.
1995, 2002, and 2006-2010 Design and Weight Variables
Design variable | 1995¹ | 2002 | 2006-2010 |
---|---|---|---|
Stratum variable | COL_STR | SEST | SEST |
Cluster/Panel variable | PANEL | SECU_R (female respondent file), SECU_P (female pregnancy file), SECU (male respondent file) | SECU |
Final post-stratified, fully adjusted case weight | POST_WT | FINALWGT | WGTQ1Q16 when analyzing the entire 2006-2010 data (see next table for other available weights) |

¹ Only females were interviewed in 1995.
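To illustrate how the stratum, cluster, and weight variables listed above enter a variance calculation, the following Python sketch estimates a weighted mean and its standard error by Taylor series linearization, treating clusters as sampled with replacement within strata. It is not one of the NSFG-provided examples. The function name and the record layout (a list of dictionaries, one per respondent) are assumptions made for illustration; the default variable names are the 2006-2010 names from the table.

```python
from collections import defaultdict


def weighted_mean_with_se(rows, y, weight="WGTQ1Q16", stratum="SEST", cluster="SECU"):
    """Return (weighted mean, linearized standard error) of rows[i][y].

    Illustrative sketch: `rows` is assumed to be a list of dicts holding the
    analysis variable plus the weight, stratum, and cluster variables.
    """
    total_weight = sum(r[weight] for r in rows)
    mean = sum(r[weight] * r[y] for r in rows) / total_weight

    # Sum the linearized scores w * (y - mean) / sum(w) within each cluster,
    # keeping clusters nested inside their stratum.
    scores = defaultdict(lambda: defaultdict(float))
    for r in rows:
        scores[r[stratum]][r[cluster]] += r[weight] * (r[y] - mean) / total_weight

    variance = 0.0
    for clusters in scores.values():
        n_h = len(clusters)
        if n_h < 2:
            raise ValueError("each stratum needs at least two clusters")
        stratum_mean = sum(clusters.values()) / n_h
        # Between-cluster variation within the stratum, scaled by n_h / (n_h - 1).
        variance += n_h / (n_h - 1) * sum(
            (z - stratum_mean) ** 2 for z in clusters.values()
        )
    return mean, variance ** 0.5
```

For production analyses, survey procedures in an established statistical package remain the safer choice; the sketch only shows where each design variable is used.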
Weight variables for 2006-2010 NSFG
Weight variable | Used for Interviews Conducted in: |
---|---|
WGTQ1Q16 | All 16 quarters, June 2006-June 2010 (for analysis of entire 2006-2010 data) |
WGTQ1Q8 | Quarters 1 to 8, June 2006-June 2008 (for analysis of 1st 2 years of 2006-2010 data) |
WGTQ5Q16 | Quarters 5 to 16, July 2007-June 2010 (for items added in Year 2) |
WGTQ9Q16 | Quarters 9 to 16, July 2008-June 2010 (for items added in Year 3 and for analysis of last 2 years of 2006-2010 data) |
FINALWGT30 | Quarters 1 to 10, June 2006-December 2008 (weight provided with the 2006-2008 files) |
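The choice of weight can be made explicit in analysis code. The Python sketch below encodes the table above as a lookup from an interview-quarter range to the corresponding weight variable; the function and dictionary names are hypothetical, and only the five ranges listed in the table are recognized.

```python
# Quarter ranges and weight variable names as listed in the weight table.
WEIGHTS_BY_QUARTERS = {
    (1, 16): "WGTQ1Q16",
    (1, 8): "WGTQ1Q8",
    (5, 16): "WGTQ5Q16",
    (9, 16): "WGTQ9Q16",
    (1, 10): "FINALWGT30",
}


def weight_for_quarters(first_quarter, last_quarter):
    """Return the weight variable documented for a range of interview quarters."""
    try:
        return WEIGHTS_BY_QUARTERS[(first_quarter, last_quarter)]
    except KeyError:
        # Refuse to guess: no weight is documented for other quarter ranges.
        raise ValueError(
            f"no weight is documented for quarters {first_quarter}-{last_quarter}"
        ) from None
```

Raising an error for an unlisted range, rather than falling back to a default, is a deliberate choice in this sketch so that an analysis never silently uses a weight that does not match its interview period.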
Questionnaires
- Description of the two questionnaire formats
- Female Questionnaire, Year 1
- Male Questionnaire, Year 1
- Female Spanish Questionnaire, Year 1
- Male Spanish Questionnaire, Year 1
- Female Questionnaire, Year 2
- Male Questionnaire, Year 2
- Female Questionnaire, Years 3 and 4
- Male Questionnaire, Years 3 and 4
Downloadable Data Files
- Female Respondent Data File (2006_2010_FemResp.dat)
- Female Pregnancy Data File (2006_2010_FemPreg.dat)
- Male Respondent Data File (2006_2010_Male.dat)
Program Statements
- SAS Program Statements
- SPSS Program Statements
- STATA Program Statements
- Female respondent file
- Pregnancy file
- Male respondent file
Other Data Files
In addition to the main 2006-2010 public use data files from the NSFG, the 2006-2010 ACASI data files and the Contextual data files are available now. Below are instructions on how to access these data files.
ACASI Data files: Due to a change in NCHS policy, made effective in March 2020, the ACASI Data Files for 2006-2010 are no longer accessible via special data use agreement. These files can only be accessed through the NCHS Research Data Center (RDC). For more information about using the RDC, including access and associated charges, visit the RDC website.
The 2006-2010 NSFG questionnaires contained a number of items designed to provide a comprehensive description of current and past behavior related to the risk of acquiring sexually transmitted infections (STI), including the Human Immunodeficiency Virus, or HIV, the virus that causes AIDS. These questions were asked via Audio Computer-Assisted Self-Interviewing, or ACASI, in which the respondent hears the question through headphones or reads it from the laptop screen and enters the answer directly into the computer. The object of ACASI was to give respondents a more private opportunity to report this sensitive information.
The ACASI files include most of the items from the ACASI portion of the 2006-2010 NSFG interview (female section J and male section K). The series on income and sources of income were collected in ACASI, but they are included on the main 2006-2010 NSFG Public Use Files released in October 2011. Height and weight were also asked in ACASI; these variables, along with body-mass index (BMI), are available on the main public use files for 2006-2010.
- The questions included in ACASI were largely the same for male and female respondents.
- Comparable items were asked about drug use, risk behaviors for sexually transmitted infections (STI, including HIV), and experience with STI.
- Both male and female respondents were given an opportunity to re-report their experience with pregnancies or fathering pregnancies that were previously reported directly to the interviewer.
- All adult respondents (18-44) were asked about non-voluntary sexual intercourse and, if they reported it, about the types of force they may have experienced.
- While the main interviewer-administered portion of the NSFG interview was limited to heterosexual vaginal intercourse, in ACASI all respondents were asked about other types of sexual activity, including oral and anal sex and same-sex partners.
The User’s Guide [PDF – 649 KB], including file indexes and codebook documentation, for the 2006-2010 ACASI files is available for download. All researchers interested in using ACASI data are strongly encouraged to review the User’s Guide before embarking on their analyses. As described in the User’s Guide, the ACASI data must be used in conjunction with the main public use data files, as there are no weights or sample design variables included in the ACASI files.
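Because the ACASI files carry no weights or sample design variables, they have to be joined to the main public use files before analysis. The following Python sketch shows one way such a join might look. It assumes that both files share a respondent identifier, named `CASEID` here, and that records are held as lists of dictionaries; these names, the function name, and the default list of design variables are assumptions for illustration, so consult the User's Guide for the actual linking variable.

```python
def attach_design_variables(acasi_rows, main_rows, key="CASEID",
                            design_vars=("SEST", "SECU", "WGTQ1Q16")):
    """Copy weight and design variables from main-file records onto ACASI records.

    Illustrative sketch: `key` is assumed to identify the same respondent
    in both files.
    """
    # Index the main file by respondent identifier, keeping only design variables.
    design_by_id = {
        row[key]: {name: row[name] for name in design_vars} for row in main_rows
    }
    merged = []
    for row in acasi_rows:
        if row[key] not in design_by_id:
            # An unmatched record cannot be weighted, so fail loudly.
            raise KeyError(f"{key}={row[key]!r} not found in the main file")
        merged.append({**row, **design_by_id[row[key]]})
    return merged
```

The merged records can then be passed to any survey-aware estimation routine that expects the stratum, cluster, and weight variables alongside the ACASI items.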
For additional information or questions about these files, researchers may contact the NSFG staff at nsfg@cdc.gov.
Interviewer Observation Data file: The Interviewer Observations File is a special data file that is part of the 2006-2010 National Survey of Family Growth (NSFG). It contains responses from the interviewers about the respondent and the interview setting. While the respondent completed the ACASI portion of the interview on the NSFG laptop, the interviewer filled out a paper questionnaire called the Interview Observation Form.
The Interview Observation Form contains questions about several aspects of the interview process. The data file provides information useful for survey planning and management, for evaluating data quality, and for assessing the effectiveness of interviewer training.
The data file contains responses from the interviewers (all female) for most of the 22,682 respondents in the 2006–2010 NSFG. The variables included in the data file are listed in the Interview Observation File [PDF – 22 KB]. The Interviewer Observation Data file is intended to be used in conjunction with the NSFG public-use data files. Specifically, to obtain weighted national estimates or accurate variance estimates, the user must merge weighting and sample design variables from the public use files. Please refer to the User’s Guide [PDF – 596 KB] or the 2006–2010 NSFG Public Use Files for descriptions of the data preparation procedures, sample design, and coding conventions. The Interviewer Observations data file and additional documentation are available through the NCHS Research Data Center (RDC). For further information about these data, contact NSFG staff at nsfg@cdc.gov. For information about using these data, please visit the RDC website or email rdca@cdc.gov.
Contextual Data files: The contextual data files for the 2006-2010 NSFG, which include information on the context or community in which respondents live, are now available to the research community. Contextual data files for the 1995, 2002, and 2006-2010 NSFGs are accessible only through the NCHS Research Data Center due to the increased risk of deductive disclosure of respondents’ identities when geographic variables are linked to survey data.
There are 2 contextual data files for each respondent. These correspond to the respondent’s address at 2 points in time: 1) at the date of interview and 2) on April 1, 2000 (the time of the 2000 U.S. Census). Geographic variables are provided at the state, county, tract, block group, and block level in these data files. Identifiers for each of these geographic units are also available that allow researchers to merge other, external data with the NSFG survey data.
The variables in the contextual data files are drawn from these sources:
- NSFG 2006-2010 sample data;
- Census 2000 Summary Files 1 and 3 (SF1 and SF3);
- American Community Survey 5-Year Estimates Summary File, 2005-2009 (ACS5);
- County Characteristics, 2000-2007 (ICPSR 20660);
- Centers for Disease Control and Prevention data on sexually transmitted diseases
- for the place-at-interview file, 2006 data (CDC06)
- for the file for residence on April 1, 2000, 2000 data (CDC00); and
- Guttmacher Institute data on abortion and family planning services
- for the place-at-interview file, 2005-2006 data (AGI05_06)
- for the file for residence on April 1, 2000, 2000-2001 data (AGI00_01).
The 2006-2010 NSFG codebook for place of interview [PDF – 2.2 MB] and the 2006-2010 NSFG list of variables [PDF – 177 KB] contained in the place-at-interview data file are available for downloading. Variables with 11,000 or more cases with missing values are not included in the data files. (Please see Chapter 4 of the 2006-2010 Codebook for Place at Interview for details on missing values.)
Some changes were made in the creation of the 2006-2010 contextual data files compared with earlier NSFGs. For prior NSFG cycles, contextual variables came from information collected on the long-form decennial censuses. The long forms were sent to 1 in 6 households in the United States at every census. In the early 2000s, it was decided that the 2010 census would be a short-form-only effort and that the American Community Survey (ACS) would be used to collect long-form data, which it has done since 2005. Because the ACS is a survey and not a census, cases are accumulated across years to produce estimates for small geographic units. The 2005-2009 5-Year Estimates File, from which many variables were drawn, contains data for all counties, census tracts, and block groups, including those areas with populations less than 20,000.
Researchers may request that other variables be added to the NSFG files. This is done in the NCHS Research Data Center. There are identifiers of the Census 2000 block, block group, and census tract for each respondent’s address that can be used for merging external data. For example, a researcher might add a state-level variable indicating variation in welfare provisions to the file. There are charges for the use of the RDC, which are explained at the RDC website. Please contact the RDC for instructions on adding variables.
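Adding an external variable amounts to a many-to-one join on a geographic identifier. The Python sketch below illustrates the idea only; the actual work is carried out in the RDC following its instructions. The function name, the identifier `STATE_ID`, and the variable `WELFARE_LEVEL` used in the usage note are hypothetical and do not come from the NSFG files.

```python
def add_external_variable(respondent_rows, external_by_geo, geo_key, new_name):
    """Attach an external area-level value to each respondent record.

    Illustrative sketch: `external_by_geo` maps a geographic identifier to a
    value. Respondents whose area is absent from the mapping receive None.
    """
    return [
        {**row, new_name: external_by_geo.get(row[geo_key])}
        for row in respondent_rows
    ]
```

For example, calling `add_external_variable(rows, {"01": 3, "02": 5}, "STATE_ID", "WELFARE_LEVEL")` would give every respondent record a `WELFARE_LEVEL` entry based on its `STATE_ID` value. Returning `None` for unmatched areas, rather than dropping those records, keeps the full sample intact so that weights and design variables still describe all respondents.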
Researchers may also find useful information for working with NSFG data through the RDC in the Series 23, Number 23 [PDF – 5.2 MB] report, or contact nsfg@cdc.gov. For more information about using NSFG contextual data and the RDC, visit the RDC website.