Modernizing the coding of occupational health data
- Experts at CDC’s National Institute for Occupational Safety and Health (NIOSH) developed the NIOSH Industry and Occupation Computerized Coding System (NIOCCS). NIOCCS is a web-based application created for public health professionals. This free application codes industry and occupation text quickly and accurately.
- Previously, public health professionals used less efficient manual processes to assign standardized codes to occupational health data and to analyze health trends in the workplace. These data come from case report forms, surveys, and other records containing industry and occupation information.
- By leveraging machine learning technologies, NIOCCS provides state, tribal, local, and territorial health departments with a modern solution that they can use when analyzing how people’s jobs impact their health and safety.
CDC’s National Institute for Occupational Safety and Health (NIOSH) developed the NIOSH Industry and Occupation Computerized Coding System (NIOCCS). NIOCCS is a free web-based application to help public health professionals code industry and occupation text quickly and accurately.
This application provides state, tribal, local, and territorial health departments with a modern solution that they can use when analyzing how people’s jobs impact their health and safety. By leveraging new technologies to streamline outdated approaches to coding occupational health data, NIOCCS is an important step toward modernizing public health data systems.
Public Health Problem
To protect people’s health in the workplace, public health professionals need to understand the risks and exposures workers experience. Public health investigators and researchers need standard industry and occupation codes to examine illnesses and injuries among workers in specific jobs and industries.
As part of public health investigations, forms and surveys may ask questions such as “What is your job?” and “What type of business do you work in?” Respondents may be given a blank line to fill in their job information:
- “Registered nurse” working in a hospital
- “Chef” working in a restaurant
- “Mechanic” working in an autobody shop
To analyze health trends and inform public health action, public health professionals and researchers convert these text descriptions into standardized numeric codes. Similar to how a ZIP code identifies a specific geographic area, industry and occupation codes identify specific industries or groups of workers.
For example, though teachers and teaching assistants are both jobs in the education industry, they have different job codes. If the wrong code is assigned, an analysis may incorrectly show that one group has a higher (or lower) risk of illness than the other. Learn more about standard codes.
Before software applications such as NIOCCS were available, occupational health specialists and researchers had to manually look up codes for each person’s response. Analyzing large amounts of information was a daunting task. The manual approach to coding was slow and costly. Manual coding made it difficult to analyze and share timely, high-quality data to help detect, prevent, and respond to emerging health threats in the workplace.
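The manual lookup described above can be pictured as a small table that maps a written job title to a standard code. The sketch below is only an illustration and is not how NIOCCS works. The function name and the table are hypothetical, and the codes are intended to be 2018 Standard Occupational Classification (SOC) codes that should be checked against the official SOC manual before any real use.

```python
from typing import Optional

# Hypothetical, hand-built lookup table (illustration only).
# Codes are intended to be 2018 SOC codes; verify before real use.
SOC_LOOKUP = {
    "registered nurse": "29-1141",  # Registered Nurses
    "chef": "35-1011",              # Chefs and Head Cooks
    "mechanic": "49-3023",          # Automotive Service Technicians and Mechanics
}


def lookup_occupation_code(text: str) -> Optional[str]:
    """Return the code for an exact (case-insensitive) title match, else None."""
    # Lowercase and collapse repeated whitespace so " Chef " matches "chef".
    key = " ".join(text.lower().split())
    return SOC_LOOKUP.get(key)


if __name__ == "__main__":
    for response in ["Registered nurse", "  CHEF ", "RN at county hospital"]:
        print(repr(response), "->", lookup_occupation_code(response))
```

An exact-match table like this returns nothing for a response such as "RN at county hospital", so a person would still have to look it up by hand. That gap grows quickly with the variety of free-text answers, which is why manual coding of large datasets was slow and costly.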
To meet this pressing need, NIOSH developed NIOCCS. This web-based application allows public health professionals to automatically code, or autocode, industry and occupation text. NIOCCS uses machine learning to determine the appropriate industry or occupation code. The application assigns a standard code to every industry and occupation record uploaded.
NIOCCS offers a suite of options for coding data:
- Code a single record
- Upload a file (this requires a free account and is well suited to large amounts of data that have already been collected)
- Code data as it is collected using the NIOCCS Web API (Application Programming Interface)
State and local jurisdictions have been using NIOCCS since the beta version was created in 2012. When NIOSH incorporated machine learning in 2021, use of the application rapidly increased. To date, more than 150 million records have been coded using NIOCCS. Reasons for this growth include:
- Results are more accurate and consistent because of the machine learning platform.
- Tens of thousands of records can be coded in minutes.
- All records receive an industry and occupation code from the autocoder.
Learn how a few jurisdictions use NIOCCS to advance their public health mission.
“We use NIOCCS to gather real-time coded industry and occupation data on various communicable disease interview forms (including COVID-19 forms). We also use the file coder to code large historical datasets.”
“Where employment data would previously have been coded manually or assigned via limited dropdown menus, NIOCCS allows us to quickly assign detailed North American Industry Classification System and Standard Occupational Classification codes—enabling efficient analysis of large datasets. In turn, this makes rapid dissemination of important workplace measures to local and tribal health departments possible.”
“The California Department of Public Health’s (CDPH) Occupational Health Branch (OHB) used NIOCCS extensively during our COVID-19 response. We coded industry and occupation (I/O) for weekly fatalities, collected from workplace outbreak and survey data. The NIOCCS web application automated the processing of large weekly datasets with improved machine learning-based quality. We manually reviewed only the lowest probability results. Together these improvements enabled a quick COVID-19 response, reduced the team training, and modernized high throughput, high-quality I/O, simply not possible through manual coding alone.”
“NIOCCS has transformed the field of occupational surveillance and epidemiology and its latest versions have accelerated CDPH OHB efforts to characterize the California workplace COVID-19 burden for prevention of worker exposures, illnesses, and deaths.”
“Minnesota uses NIOCCS to code industry and occupation fields in our death certificate data file to get up-to-date information to analyze suicides by industry and occupation. Free use of the automated system helped us obtain much more timely data. This facilitates our ongoing updates of data on farmer suicides and helped us begin to further explore suicide by industry and occupation.”
“Minnesota also used NIOCCS in 2021–2022 to code a subset of COVID-19 case data to see if we could use the sporadic job information in the data file to study how workers in various industries and occupations experienced COVID-19. We performed simple quality analysis on the coded data to identify an appropriate cutoff level in the confidence value for accepting or rejecting the codes. Free use of the automated coding system enabled the use of the very large COVID-19 data file. The size of this file would prevent manual coding for anything more than a small sample of the records. Using the entire data file helps us provide a fuller picture of COVID-19 in Minnesota workers.”
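Several of these accounts describe accepting autocoded results above a chosen confidence level and manually reviewing only the lowest-probability ones. A minimal sketch of that triage step follows. It assumes each coded record carries a confidence value between 0 and 1. The record fields, the function name, the placeholder codes, and the 0.70 cutoff are all hypothetical and do not reflect the actual NIOCCS output format or any jurisdiction's chosen threshold.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CodedRecord:
    """Hypothetical shape of one autocoded record (not the NIOCCS format)."""
    record_id: str
    occupation_code: str  # placeholder values are used in the example below
    confidence: float     # assumed to lie between 0 and 1


def split_by_confidence(
    records: List[CodedRecord], cutoff: float
) -> Tuple[List[CodedRecord], List[CodedRecord]]:
    """Split records into (accepted, needs_review) using a confidence cutoff."""
    if not 0.0 <= cutoff <= 1.0:
        raise ValueError("cutoff must be between 0 and 1")
    accepted = [r for r in records if r.confidence >= cutoff]
    needs_review = [r for r in records if r.confidence < cutoff]
    # Put the least certain records first so reviewers see them first.
    needs_review.sort(key=lambda r: r.confidence)
    return accepted, needs_review


if __name__ == "__main__":
    sample = [
        CodedRecord("a", "CODE-1", 0.95),
        CodedRecord("b", "CODE-2", 0.40),
        CodedRecord("c", "CODE-3", 0.70),
        CodedRecord("d", "CODE-4", 0.10),
    ]
    accepted, needs_review = split_by_confidence(sample, cutoff=0.70)
    print("accepted:", [r.record_id for r in accepted])
    print("needs review:", [r.record_id for r in needs_review])
```

Choosing the cutoff is a trade-off: a higher value sends more records to manual review, while a lower value accepts more autocoded results without a human check. A quality analysis on a sample of coded data, as described above, is one way to pick it.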
Learn more about the NIOCCS platform and the suite of options available to begin coding your data!
Contact the NIOCCS team if you have questions or need help coding industry and occupation data.
Check out other NIOSH resources for collecting and using industry and occupation data and learn how NIOCCS has been used to inform public health action:
- Collecting and Using Industry and Occupation Data | NIOSH | CDC
- Using BRFSS to Assess Workers’ Health | NIOSH | CDC
- CDC – NIOSH Industry and Occupation Computerized Coding System (NIOCCS) – About
- eNews: Volume 18, Number 10 (February 2021) | NIOSH | CDC
- Collecting Occupation and Industry Data in Public Health Surveillance Systems for COVID-19 | Blogs | CDC
- Making Industry and Occupation Information Useful for Public Health: A guide to coding industry and occupation text fields | Blogs | CDC
- 100 Million and Counting! | Blogs | CDC
- New Data Available! Assess Causes of Death by Industry and Occupation | Blogs | CDC
- How Collecting and Analyzing COVID-19 Case Job Information Can Make a Difference in Public Health | Blogs | CDC