Collecting and Using Industry and Occupation Data


Analyze

Things to Consider Before You Analyze Your Data

Here are important things to consider when analyzing industry and occupation data.

If you have questions while going through this process, please email us for help at NIOSHIOCoding@cdc.gov.

Prepare your dataset for analysis once you have the output from your autocoder

1. Determine if your cell sizes are large enough for a meaningful analysis.

First, perform a frequency analysis for all of the industry or occupation groups to see if the cell sizes are large enough for meaningful analysis. If they are not, collapse the industry and/or occupation codes into higher level groups.

Collapsing into higher level groups

  • Ensures sufficient cell size
  • Improves analysis
  • Protects identity of survey participants
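
The frequency check and collapse described in step 1 can be sketched in Python as follows. This is a minimal illustration, not the NIOSH SAS or R code; the threshold `MIN_CELL_SIZE`, the function names, the codes, and the mapping are all hypothetical and should be replaced with values appropriate to your study.

```python
from collections import Counter

# Hypothetical threshold; choose one appropriate to your study and data source.
MIN_CELL_SIZE = 30

def small_cells(codes, min_size=MIN_CELL_SIZE):
    """Return {code: count} for every group with fewer than min_size records."""
    counts = Counter(codes)
    return {code: n for code, n in counts.items() if n < min_size}

def collapse(codes, mapping):
    """Replace each detailed code with its broader group; unmapped codes are kept."""
    return [mapping.get(code, code) for code in codes]

# Made-up records: one industry or occupation code per respondent.
codes = ["A1"] * 40 + ["A2"] * 5 + ["B1"] * 35

# Made-up mapping from detailed codes to higher level groups.
to_broader = {"A1": "A", "A2": "A", "B1": "B"}

print(small_cells(codes))      # {'A2': 5}
collapsed = collapse(codes, to_broader)
print(small_cells(collapsed))  # {}
```

Re-running the frequency check after collapsing confirms whether the broader groups now meet the threshold.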

2. If cell sizes are large enough, but there are still too many industry or occupation groups to practically analyze the data, further collapse the data into even broader groups.

Download the SAS code that can be used to collapse your data [SAS – 18 KB], or download it as a text file [TXT – 18 KB].

Download the R code that can be used to collapse your data as a text file [TXT – 36 KB].

Keep in mind that the level of detail you need in your groupings depends on the purpose of your study.

If using NAICS and SOC

NAICS and SOC are hierarchical.

  • Industry codes using NAICS may be grouped into 20 two-digit codes, which represent the 20 large industry sectors.
  • Occupation codes using SOC may be grouped into 23 two-digit codes, which represent the 23 major groupings of occupations.
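
Because both systems are hierarchical, broader groups can be derived from the leading digits of a code. The sketch below is a minimal Python illustration, not the NIOSH SAS or R code; the function names are hypothetical, and it assumes codes are stored as strings or integers without leading text. It also accounts for the three NAICS sectors that span more than one two-digit prefix (31-33, 44-45, and 48-49).

```python
# NAICS sectors that cover more than one two-digit prefix.
NAICS_MULTI_PREFIX = {
    "31": "31-33", "32": "31-33", "33": "31-33",  # Manufacturing
    "44": "44-45", "45": "44-45",                 # Retail Trade
    "48": "48-49", "49": "48-49",                 # Transportation and Warehousing
}

def naics_sector(code):
    """Return the NAICS sector label for a NAICS code of any length."""
    prefix = str(code).strip()[:2]
    return NAICS_MULTI_PREFIX.get(prefix, prefix)

def soc_major_group(code):
    """Return the two-digit SOC major group for a code such as '47-2061'."""
    return str(code).strip()[:2]

print(naics_sector("236220"))      # 23
print(naics_sector("332710"))      # 31-33
print(soc_major_group("47-2061"))  # 47
```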

If using Census codes

Collapse the Census industry and occupation codes into Simple or Detailed recode categories developed for the National Health Interview Survey (NHIS) public use data. The tool kit provides a list of these groupings and SAS and R code to create these groupings.

From 2004 through 2018

  • Detailed occupation recodes had 94 categories; Simple occupation recodes had 23 categories.
  • Detailed industry recodes had 79 categories; Simple industry recodes had 21 categories.

The simple recode categories roughly match the NAICS two-digit codes (plus an additional code for armed forces) and SOC major groups.

See the table below for examples of how NIOSH researchers have further collapsed the detailed occupation and industry recode categories into more manageable sets of even broader categories for specific analyses.

3. Select a comparison population or a denominator for your analyses.

When calculating rates and proportions, consider whether you will use data from an external or internal source as a denominator.
Some possible external sources for denominator data include (among others)

Answer These Questions When Using Denominator Data from an External Source

1a. What classification system is your data coded in? NAICS and SOC? Or Census Industry and Occupation?

1b. What classification system is your denominator data coded in? NAICS and SOC or Census Industry and Occupation?

If the denominator data were coded in a different classification system than your data, you must reclassify either the denominator data or your dataset so that both use the same classification system. You can reclassify either dataset by cross-walking to a common classification system. Visit the Census Bureau’s Industry and Occupation Codes Lists and Crosswalks website for instructions for creating crosswalks.

2a. What version is your data coded in?

2b. What version is your denominator data coded in?

All classification systems are updated periodically to ensure that emerging occupations and industries are assigned a standardized code. Each update creates a new version, and versions are referred to by the year the update occurred (e.g., Census 2002, Census 2010).

If the denominator data are coded using a different version than your data, reclassify either the denominator data or your dataset so that they both use the same version. Align the versions by cross-walking your data. Visit the Census Bureau’s Industry and Occupation Codes Lists and Crosswalks website for instructions for creating crosswalks.
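
As a rough illustration of what cross-walking involves, the Python sketch below applies a crosswalk table to a list of codes. It is a minimal example under stated assumptions: the crosswalk is represented as a dictionary from each source code to a list of target codes, the codes shown are made up, and the function name is hypothetical. Codes that are missing from the crosswalk, or that split into several target codes, are flagged for manual review rather than assigned automatically.

```python
def apply_crosswalk(codes, crosswalk):
    """Convert codes using a crosswalk of {source_code: [target_codes]}.

    Returns (converted, needs_review). A code is converted only when it maps
    to exactly one target; otherwise its slot is None and it is flagged.
    """
    converted, needs_review = [], []
    for code in codes:
        targets = crosswalk.get(code, [])
        if len(targets) == 1:
            converted.append(targets[0])
        else:
            converted.append(None)
            needs_review.append(code)
    return converted, needs_review

# Made-up crosswalk: X100 maps one-to-one, X200 splits into two newer codes.
example_crosswalk = {"X100": ["Y100"], "X200": ["Y210", "Y220"]}

converted, needs_review = apply_crosswalk(["X100", "X200", "X999"], example_crosswalk)
print(converted)     # ['Y100', None, None]
print(needs_review)  # ['X200', 'X999']
```

How to handle codes that split or merge between versions is a study-specific decision; flagging them keeps that decision explicit.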

  1. Are there differences in how your data and the denominator data are coded (differences in versions or classification system)?
    If there are differences, you will need to cross-walk your data to standardize the differences between classification systems and versions. Visit the Census Bureau’s Industry and Occupation Codes Lists and Crosswalks website for instructions for creating crosswalks.
  2. Do your denominator data and your data group the industry and occupation codes at the same level (e.g., major industry and occupation groups)?
    If the level of grouping differs between the datasets, one or both must be recoded so that the categories match.
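
One quick way to check whether two datasets use the same grouping level is to compare their sets of category labels. The Python sketch below assumes each dataset's groups are available as a list of labels; the function name and the labels are hypothetical.

```python
def compare_groupings(data_groups, denominator_groups):
    """Report group labels that appear in only one of the two datasets."""
    data_set, denom_set = set(data_groups), set(denominator_groups)
    return {
        "only_in_data": sorted(data_set - denom_set),
        "only_in_denominator": sorted(denom_set - data_set),
    }

# Made-up labels: the denominator data lack one of the groups used in the dataset.
mismatch = compare_groupings(["11", "21", "23"], ["11", "23", "31-33"])
print(mismatch)  # {'only_in_data': ['21'], 'only_in_denominator': ['31-33']}
```

Two empty lists indicate that the category labels match; any other result points to groups that need recoding in one or both datasets.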

Complete Your Analysis

Once your cell sizes are large enough for analysis and your data and the denominator data use the same classification system, version, and grouping level, you can assess your outcomes of interest using your coded and prepared industry and occupation data.
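
For example, once the numerator counts and the denominator data share the same groups, a rate per 100 workers can be computed group by group. The Python sketch below is a minimal illustration with made-up counts; the function name is hypothetical, and it does not apply survey weights or compute confidence intervals, which many analyses will need.

```python
def rates_per_100(cases, denominators):
    """Return {group: cases per 100 workers}, or None where no denominator exists."""
    rates = {}
    for group, n_cases in cases.items():
        population = denominators.get(group)
        # A missing or zero denominator cannot produce a rate.
        rates[group] = 100.0 * n_cases / population if population else None
    return rates

# Made-up counts by group.
cases = {"A": 5, "B": 10, "C": 3}
denominators = {"A": 200, "B": 0}

print(rates_per_100(cases, denominators))  # {'A': 2.5, 'B': None, 'C': None}
```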

Example Studies

In the table below, we provide information about published examples of how NIOSH investigators have used industry and occupation data to analyze health, injury, exposures, fatalities, illnesses, and economic risk factors.

Topic | Data source (year/s) | I/O categories used | Result highlights | Link to publication
Workplace secondhand smoke (SHS) exposure | NHIS (2015) | 78 detailed industry recode categories and some specific Census industry codes that were within recode categories with high reported prevalence of SHS exposure that had adequate sample sizes (data accessed through RDC) | Nonsmoking workers employed in the commercial and industrial machinery and equipment repair and maintenance industry reported the highest prevalences of any workplace SHS exposure (65.1%), whereas the construction industry had the highest reported number of exposed workers (2.9 million). | Link to article
Low Back Pain (LBP) | NHIS (2015) | 22 occupation groups (military excluded) | The prevalence of any LBP and work-related LBP was highest in construction and extraction occupations. | Link to article
Health Insurance Coverage | NHIS (2015) | 4 broad occupational categories (see Appendix) | Workers in service and farming and production occupations were least likely to have health insurance in 2010 and 2015. | Link to article
Overdose deaths | NOMS | 26 occupation groups | Construction occupations had the highest PMRs for drug overdose deaths and for both heroin-related and prescription opioid–related overdose deaths. The occupation groups with the highest PMRs from methadone, natural and semisynthetic opioids, and synthetic opioids other than methadone were construction, extraction (e.g., mining, oil and gas extraction), and health care practitioners. | Link to article
Opioid prescriptions | MEPS | 8 occupation groups | Workers in occupations at higher risk for injury and illness, including construction and extraction; farming; service; and production, transportation, and material moving occupations, were more likely to obtain opioid prescriptions. |
Asthma | BRFSS (2013) | 21 industry groups and 23 occupation groups | State-specific prevalence of current asthma was highest among workers in the information industry (18.0%) in Massachusetts and in health care support occupations (21.5%) in Michigan. | Link to article
Asthma Mortality | NOMS (1999–2016) | U.S. Census 2000 Industry and Occupation Classification System | By industry, asthma mortality was significantly elevated among males in food, beverage, and tobacco products manufacturing, other retail trade, and miscellaneous manufacturing, and among females in social assistance. By occupation, asthma mortality was significantly elevated among females in community and social services. | Link to article
Workplace smoke-free policies and cessation programs | TUS-CPS (Tobacco Use Supplement-CPS) (2014–2015) | 21 industry groups and 23 occupation groups | The proportion of indoor workers reporting a 100% smoke-free policy at their workplace varied by sociodemographic characteristics, industry, and occupation. By industry, it was highest in the education services industry and lowest in the agriculture, forestry, fishing, and hunting industry. By occupation, it was highest in education, training, and library occupations (92.2%) and lowest in farming, fishing, and forestry occupations (63.6%). Overall, 27.2% of all working adults reported having employer-offered cessation programs. | Link to article
COPD among those who never smoked | NHIS (2013–2017) | 21 industry groups and 23 occupation groups | During 2013–2017, an estimated 2.4 million (2.2%) U.S. working adults aged ≥18 years who never smoked had COPD. The highest COPD prevalences among persons who never smoked were in the information (3.3%) and mining (3.1%) industries and among office and administrative support occupation workers (3.3%). Women had higher COPD prevalences than did men. | Link to article
Tobacco Use Among Working Adults | NHIS (2014–2016) | 21 industry groups and 23 occupation groups | During 2014–2016, 22.1% of workers currently (every day or some days) used any form of tobacco product; 15.4% used cigarettes, 5.8% used other combustible tobacco products, 3.0% used smokeless tobacco, and 3.6% used electronic cigarettes; overall, 4.6% used two or more tobacco products. By industry, any tobacco product use ranged from 11.0% among education services workers to 34.3% among construction workers; use of two or more tobacco products was highest among construction industry workers. By occupation, any tobacco use ranged from 9.3% among life, physical, and social science workers to 37.2% among installation, maintenance, and repair workers; use of two or more tobacco products was highest among installation, maintenance, and repair workers. | Link to article
Tobacco product use among workers in the construction industry | NHIS (2014–2016) | Major industry code "04" was used to identify workers in the construction industry. Seven categories of construction occupations within the sector were identified: management; office and administrative support; supervisors, construction, and extraction trade; installation, maintenance, and repair; production, transportation, warehousing, and repair; and all other construction workers. | Over one-third of U.S. construction workers use some form of tobacco product, and use varies by worker and workplace characteristics. An estimated 43% of workers in the installation, maintenance, and repair occupations used some form of tobacco product. | Link to article
Airflow obstruction | NHANES (2007–2012) | 264 detailed industry codes recoded into 44 industry groups and 501 detailed occupation codes recoded into 57 occupation groups (detailed data accessed through the Research Data Center) | High airflow obstruction prevalence and significant PORs were reported in mining; manufacturing; construction; and services to buildings industries as well as extraction; bookbinders, prepress, and printing; installers and repairers; and construction occupations. | Link to article
Page last reviewed: June 12, 2020