Collecting and Using Industry and Occupation Data
Things to Consider Before You Analyze Your Data
Here are important things to consider when analyzing industry and occupation data.
If you have questions while going through this process, please email us for help at NIOSHIOCoding@cdc.gov.
Prepare your dataset for analysis once you get your output from your autocoder
1. Determine if your cell sizes are large enough for a meaningful analysis.
First, perform a frequency analysis for all of the industry or occupation groups to see if the cell sizes are large enough for meaningful analysis. If they are not, collapse the industry and/or occupation codes into higher level groups.
Collapsing into higher level groups
- Ensures sufficient cell size
- Improves analysis
- Protects identity of survey participants
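The frequency check and collapse described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool kit's own code: the records, the `MIN_CELL_SIZE` threshold, and the helper names `cell_counts` and `small_cells` are all hypothetical, and the right threshold depends on your study and any disclosure rules that apply to your data.

```python
from collections import Counter

# Hypothetical coded records: (detailed occupation code, broader group).
records = [
    ("47-2061", "47"), ("47-2031", "47"), ("47-2111", "47"),
    ("29-1141", "29"), ("29-2061", "29"),
    ("11-1021", "11"),
]

MIN_CELL_SIZE = 3  # assumed threshold; choose one that fits your study


def cell_counts(codes):
    """Return the number of records in each code or group."""
    return Counter(codes)


def small_cells(counts, minimum):
    """Return the codes whose cell size falls below the minimum."""
    return sorted(code for code, n in counts.items() if n < minimum)


detailed = cell_counts(code for code, _ in records)
grouped = cell_counts(group for _, group in records)

# Compare how many cells are too small before and after collapsing.
print(small_cells(detailed, MIN_CELL_SIZE))
print(small_cells(grouped, MIN_CELL_SIZE))
```

In this toy dataset every detailed code is below the threshold, while collapsing to the broader group leaves fewer small cells. Any groups that remain too small would need to be collapsed further or combined with related groups.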
2. If cell sizes are large enough, but there are still too many industry or occupation groups to practically analyze the data, further collapse the data into even broader groups.
Keep in mind that the level of detail you need in your groupings depends on the purpose of your study.
If using NAICS and SOC
NAICS and SOC are hierarchical.
- Industry codes using NAICS may be grouped by their two-digit prefixes into the 20 large industry sectors.
- Occupation codes using SOC may be grouped into 23 two-digit codes, which represent the 23 major groupings of occupations.
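Because both systems are hierarchical, collapsing to the top level is mostly a matter of keeping the leading two digits. The sketch below assumes codes are stored as strings; the function names are hypothetical. One detail to watch: NAICS publishes three sectors as ranges of two-digit prefixes (31-33, 44-45, and 48-49), so those prefixes need to be merged to arrive at 20 sectors.

```python
# NAICS sectors that span more than one two-digit prefix.
NAICS_MULTI_PREFIX = {
    "31": "31-33", "32": "31-33", "33": "31-33",  # Manufacturing
    "44": "44-45", "45": "44-45",                  # Retail Trade
    "48": "48-49", "49": "48-49",                  # Transportation and Warehousing
}


def naics_sector(code: str) -> str:
    """Collapse a NAICS code (stored as a string) to its sector."""
    prefix = code.strip()[:2]
    return NAICS_MULTI_PREFIX.get(prefix, prefix)


def soc_major_group(code: str) -> str:
    """Collapse a SOC code such as '47-2061' to its two-digit major group."""
    return code.strip()[:2]


# Illustrative codes only.
print(naics_sector("236220"))
print(naics_sector("332710"))
print(soc_major_group("47-2061"))
```

Storing codes as strings rather than integers avoids losing leading zeros and keeps the hyphen in SOC codes intact.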
If using Census codes
Collapse the Census industry and occupation codes into Simple or Detailed recode categories developed for the National Health Interview Survey (NHIS) public use data. The tool kit provides a list of these groupings and SAS and R code to create these groupings.
From 2004 through 2018
- Detailed occupation recodes had 94 categories; Simple occupation recodes had 23 categories.
- Detailed industry recodes had 79 categories; Simple industry recodes had 21 categories.
The simple recode categories roughly match the NAICS two-digit codes (plus an additional code for armed forces) and SOC major groups.
See the table below for examples of how NIOSH researchers have further collapsed the detailed occupation and industry recode categories into more manageable sets of even broader categories for specific analyses.
3. Select a comparison population or a denominator for your analyses.
When calculating rates and proportions, consider whether you will use data from an external or internal source as a denominator.
Some possible external sources for denominator data include (among others)
- American Community Survey (ACS)
Data from the ACS are available for national and individual lower-level geographies through data.census.gov.
If you need specific categories defined in your denominator, the ACS Public Use Microdata Sample (PUMS) files can be used to produce custom population estimates.
- Current Population Survey (CPS)
The CPS is a large, complex monthly survey. Creating annual or custom population estimates from the CPS dataset requires advanced analytical skills. If you want to use CPS data for denominators, the NIOSH Employed Labor Force (ELF) query system can be used to easily generate CPS-based data tables.
Answer These Questions When Using Denominator Data from an External Source
1a. What classification system is your data coded in? NAICS and SOC? Or Census Industry and Occupation?
1b. What classification system is your denominator data coded in? NAICS and SOC or Census Industry and Occupation?
If the denominator data are coded in a different classification system than your data, you must reclassify either the denominator data or your dataset so that they both use the same classification system. You can reclassify either dataset by cross-walking to a common classification system. Visit the Census Bureau’s Industry and Occupation Codes Lists and Crosswalks website for instructions for creating crosswalks.
2a. What version is your data coded in?
2b. What version is your denominator data coded in?
All classification systems are updated periodically. The updates are done to ensure emerging occupations and industries are assigned a standardized code. Each time an update occurs, a new version is created. Versions are referred to by the year the update occurred (e.g., Census 2002, Census 2010, etc.).
If the denominator data are coded using a different version than your data, reclassify either the denominator data or your dataset so that they both use the same version. Align the versions by cross-walking your data. Visit the Census Bureau’s Industry and Occupation Codes Lists and Crosswalks website for instructions for creating crosswalks.
- Are there differences in how your data and the denominator data are coded (differences in versions or classification system)?
If there are differences, you will need to cross-walk your data to standardize the differences between classification systems and versions. Visit the Census Bureau’s Industry and Occupation Codes Lists and Crosswalks website for instructions for creating crosswalks.
- Do your denominator data and your data group the industry and occupation codes at the same level (e.g., major industry and occupation groups)?
If the level of grouping differs between the datasets, one or both must be recoded so that the categories match.
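As a rough illustration of the cross-walking and level-matching checks above, the sketch below applies a lookup table to recode one dataset and then lists any categories that appear in only one of the two datasets. The codes and the `CROSSWALK` table are made up for the example; a real crosswalk would come from the Census Bureau lists. The sketch also assumes each source code maps to exactly one target code, so codes that split across several target codes would need an allocation rule that is not shown here.

```python
# Hypothetical crosswalk from an older code version to a newer one.
# Real crosswalks come from the Census Bureau lists; these codes are made up.
CROSSWALK = {"A10": "B10", "A11": "B10", "A20": "B25"}


def recode(codes, crosswalk):
    """Map each code to the target version; collect codes with no mapping."""
    recoded, unmapped = [], []
    for code in codes:
        if code in crosswalk:
            recoded.append(crosswalk[code])
        else:
            unmapped.append(code)
    return recoded, unmapped


def mismatched_categories(case_groups, denominator_groups):
    """Return categories present in one dataset but not the other."""
    return sorted(set(case_groups) ^ set(denominator_groups))


recoded, unmapped = recode(["A10", "A11", "A20", "A99"], CROSSWALK)
print(recoded)    # codes expressed in the target version
print(unmapped)   # codes that need manual review
print(mismatched_categories(recoded, ["B10", "B25", "B30"]))
```

Reviewing the unmapped codes and mismatched categories before calculating anything helps catch version or grouping differences early.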
Complete Your Analysis
Once your cell sizes are large enough for analysis and your data and the denominator data use the same classification system, version, and grouping level, you can assess your outcomes of interest using your coded and prepared industry and occupation data.
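With matching categories in place, a crude rate is the case count divided by the denominator count for the same group. The sketch below uses invented counts and a hypothetical helper, `rates_per_100k`; it ignores survey weights, age adjustment, and confidence intervals, which a real analysis would likely need.

```python
def rates_per_100k(cases, population):
    """Return crude rates per 100,000 workers for each group.

    Raises ValueError if a group has no usable denominator.
    """
    missing = sorted(g for g in cases if population.get(g, 0) <= 0)
    if missing:
        raise ValueError(f"No usable denominator for groups: {missing}")
    return {g: 100_000 * n / population[g] for g, n in cases.items()}


# Invented counts for illustration only.
cases = {"Construction": 30, "Education": 5}
population = {"Construction": 60_000, "Education": 50_000}

rates = rates_per_100k(cases, population)
for group, rate in rates.items():
    print(f"{group}: {rate:.1f} per 100,000")
```

Failing loudly when a group has no denominator is a deliberate choice here: a silent default would hide exactly the classification or grouping mismatch the earlier steps are meant to catch.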
In the table below, we provide information about published examples of how NIOSH investigators have used industry and occupation data to analyze health, injury, exposures, fatalities, illnesses, and economic risk factors.
|Topic||Data Source (year/s)||I/O categories used||Result highlights||Link to publication|
|Workplace secondhand smoke (SHS) exposure||NHIS (2015)||78 detailed industry recode categories and some specific Census industry codes that were within recode categories with high reported prevalence of SHS exposure that had adequate sample sizes (data accessed through RDC)||Nonsmoking workers employed in the commercial and industrial machinery and equipment repair and maintenance industry reported the highest prevalences of any workplace SHS exposure (65.1%), whereas the construction industry had the highest reported number of exposed workers (2.9 million).||Link to article|
|Low Back Pain (LBP)||NHIS (2015)||22 occupation groups (military excluded)||The prevalence of any LBP and work-related LBP was highest in construction and extraction occupations.||Link to article|
|Health Insurance Coverage||NHIS (2015)||4 broad occupational categories (see Appendix)||Workers in service and farming and production occupations were least likely to have health insurance in 2010 and 2015.||Link to article|
|Overdose deaths||NOMS||26 occupation groups||Construction occupations had the highest PMRs for drug overdose deaths and for both heroin-related and prescription opioid–related overdose deaths. The occupation groups with the highest PMRs from methadone, natural and semisynthetic opioids, and synthetic opioids other than methadone were construction, extraction (e.g., mining, oil and gas extraction), and health care practitioners.||Link to article|
|Opioid prescriptions||MEPS||8 occupation groups||Workers in occupations at higher risk for injury and illness – including construction and extraction; farming; service; and production, transportation, and material moving occupations – were more likely to obtain opioid prescriptions.|
|Asthma||BRFSS (2013)||21 industry groups and 23 occupation groups||State-specific prevalence of current asthma was highest among workers in the information industry (18.0%) in Massachusetts and in health care support occupations (21.5%) in Michigan.||Link to article|
|Asthma Mortality||NOMS (1999–2016)||U.S. Census 2000 Industry and Occupation Classification System||By industry, asthma mortality was significantly elevated among males in food, beverage, and tobacco products manufacturing, other retail trade, and miscellaneous manufacturing, and among females in social assistance. By occupation, asthma mortality was significantly elevated among females in community and social services.||Link to article|
|Workplace Smokefree policies and cessation programs||TUS-CPS (tobacco use supplement-CPS): 2014–2015||21 industry groups and 23 occupation groups||The proportion of indoor workers reporting a 100% smoke-free policy at their workplace varied by sociodemographic characteristics, industry, and occupation. By industry, the proportion was highest in the education services industry and lowest in the agriculture, forestry, fishing, and hunting industry. By occupation, it was highest in education, training, and library occupations (92.2%) and lowest in farming, fishing, and forestry occupations (63.6%). Overall, 27.2% of all working adults reported having employer-offered cessation programs.||Link to article|
|COPD among those who never smoked||NHIS 2013–2017||21 industry groups and 23 occupation groups||During 2013–2017, an estimated 2.4 million (2.2%) U.S. working adults aged ≥18 years who never smoked had COPD. The highest COPD prevalences among persons who never smoked were in the information (3.3%) and mining (3.1%) industries and office and administrative support occupation workers (3.3%). Women had higher COPD prevalences than did men.||Link to article|
|Tobacco Use Among Working Adults||NHIS 2014–2016||21 industry groups and 23 occupation groups||During 2014–2016, 22.1% of working adults currently (every day or some days) used any form of tobacco product; 15.4% used cigarettes, 5.8% used other combustible tobacco products, 3.0% used smokeless tobacco, and 3.6% used electronic cigarettes; overall, 4.6% used two or more tobacco products. By industry, any tobacco product use ranged from 11.0% among education services to 34.3% among construction workers; use of two or more tobacco products was highest among construction industry workers. By occupation, any tobacco use ranged from 9.3% among life, physical, and social science workers to 37.2% among installation, maintenance, and repair workers; use of two or more tobacco products was highest among installation, maintenance, and repair workers.||Link to article|
|Tobacco product use among workers in the construction industry||NHIS 2014–2016||Major industry code “04” was used to identify workers in the construction industry. Seven categories of construction occupations within the sector were identified: management; office and administrative support; supervisors, construction, and extraction trade; installation, maintenance, and repair; production, transportation, warehousing, and repair; and all other construction workers||Over one-third of U.S. construction workers use some form of tobacco product, and use varies by worker and workplace characteristics. An estimated 43% of workers in the installation, maintenance, and repair occupations used some form of tobacco products.||Link to article|
|Airflow obstruction||NHANES 2007-2012||264 detailed industry codes recoded into 44 industry groups and 501 detailed occupation codes recoded into 57 occupation groups (detailed data accessed through the Research Data Center)||High airflow obstruction prevalence and significant PORs were reported in mining; manufacturing; construction; and services to buildings industries as well as extraction; bookbinders, prepress, and printing; installers and repairers; and construction occupations.||Link to article|