Artificial intelligence exceeds humans in epidemiological job coding

BACKGROUND: Work circumstances can substantially negatively impact health. To explore this, large occupational cohorts of free-text job descriptions are manually coded and linked to exposure. Although several automatic coding tools have been developed, accurate exposure assessment is only feasible w...

Bibliographic Details
Main Authors: Langezaal, Mathijs A., van den Broek, Egon L., Peters, Susan, Goldberg, Marcel, Rey, Grégoire, Friesen, Melissa C., Locke, Sarah J., Rothman, Nathaniel, Lan, Qing, Vermeulen, Roel C. H.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10625577/
https://www.ncbi.nlm.nih.gov/pubmed/37925519
http://dx.doi.org/10.1038/s43856-023-00397-4
_version_ 1785131162099777536
author Langezaal, Mathijs A.
van den Broek, Egon L.
Peters, Susan
Goldberg, Marcel
Rey, Grégoire
Friesen, Melissa C.
Locke, Sarah J.
Rothman, Nathaniel
Lan, Qing
Vermeulen, Roel C. H.
author_sort Langezaal, Mathijs A.
collection PubMed
description BACKGROUND: Work circumstances can substantially negatively impact health. To explore this, large occupational cohorts of free-text job descriptions are manually coded and linked to exposure. Although several automatic coding tools have been developed, accurate exposure assessment is only feasible with human intervention. METHODS: We developed OPERAS, a customizable decision support system for epidemiological job coding. Using 812,522 entries, we developed and tested classification models for the Professions et Catégories Socioprofessionnelles (PCS)2003, Nomenclature d’Activités Française (NAF)2008, International Standard Classifications of Occupation (ISCO)-88, and ISCO-68. Each code comes with an estimated correctness measure to identify instances potentially requiring expert review. Here, OPERAS’ decision support enables an increase in efficiency and accuracy of the coding process through code suggestions. Using the Formaldehyde, Silica, ALOHA, and DOM job-exposure matrices, we assessed the classification models’ exposure assessment accuracy. RESULTS: We show that, using expert-coded job descriptions as gold standard, OPERAS realized a 0.66–0.84, 0.62–0.81, 0.60–0.79, and 0.57–0.78 inter-coder reliability (in Cohen’s Kappa) on the first, second, third, and fourth coding levels, respectively. These exceed the respective inter-coder reliability of expert coders ranging 0.59–0.76, 0.56–0.71, 0.46–0.63, 0.40–0.56 on the same levels, enabling a 75.0–98.4% exposure assessment accuracy and an estimated 19.7–55.7% minimum workload reduction. CONCLUSIONS: OPERAS secures a high degree of accuracy in occupational classification and exposure assessment of free-text job descriptions, substantially reducing workload. As such, OPERAS significantly outperforms both expert coders and other current coding tools. This enables large-scale, efficient, and effective exposure assessment securing healthy work conditions.
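The inter-coder reliability figures in the description are Cohen's Kappa values, which correct raw agreement for the agreement expected by chance. As a minimal sketch of how such a value can be computed for two coders, the Python below uses the standard formula kappa = (p_o - p_e) / (1 - p_e). The function name `cohens_kappa` and the six sample codes are hypothetical illustrations, not taken from OPERAS or from the study data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(coder_a)
    # Observed agreement: share of items given the same code by both coders.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement, from each coder's marginal code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders assigned one identical code throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical first-level codes for six job descriptions (illustration only).
expert = ["2", "7", "7", "5", "2", "9"]
model = ["2", "7", "5", "5", "2", "9"]
kappa = cohens_kappa(expert, model)
print(round(kappa, 3))
```

In this made-up sample the coders agree on five of six items, but the kappa is lower than the raw agreement because part of that agreement would be expected by chance alone.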
format Online
Article
Text
id pubmed-10625577
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10625577 2023-11-06 Artificial intelligence exceeds humans in epidemiological job coding Langezaal, Mathijs A. van den Broek, Egon L. Peters, Susan Goldberg, Marcel Rey, Grégoire Friesen, Melissa C. Locke, Sarah J. Rothman, Nathaniel Lan, Qing Vermeulen, Roel C. H. Commun Med (Lond) Article BACKGROUND: Work circumstances can substantially negatively impact health. To explore this, large occupational cohorts of free-text job descriptions are manually coded and linked to exposure. Although several automatic coding tools have been developed, accurate exposure assessment is only feasible with human intervention. METHODS: We developed OPERAS, a customizable decision support system for epidemiological job coding. Using 812,522 entries, we developed and tested classification models for the Professions et Catégories Socioprofessionnelles (PCS)2003, Nomenclature d’Activités Française (NAF)2008, International Standard Classifications of Occupation (ISCO)-88, and ISCO-68. Each code comes with an estimated correctness measure to identify instances potentially requiring expert review. Here, OPERAS’ decision support enables an increase in efficiency and accuracy of the coding process through code suggestions. Using the Formaldehyde, Silica, ALOHA, and DOM job-exposure matrices, we assessed the classification models’ exposure assessment accuracy. RESULTS: We show that, using expert-coded job descriptions as gold standard, OPERAS realized a 0.66–0.84, 0.62–0.81, 0.60–0.79, and 0.57–0.78 inter-coder reliability (in Cohen’s Kappa) on the first, second, third, and fourth coding levels, respectively. These exceed the respective inter-coder reliability of expert coders ranging 0.59–0.76, 0.56–0.71, 0.46–0.63, 0.40–0.56 on the same levels, enabling a 75.0–98.4% exposure assessment accuracy and an estimated 19.7–55.7% minimum workload reduction.
CONCLUSIONS: OPERAS secures a high degree of accuracy in occupational classification and exposure assessment of free-text job descriptions, substantially reducing workload. As such, OPERAS significantly outperforms both expert coders and other current coding tools. This enables large-scale, efficient, and effective exposure assessment securing healthy work conditions. Nature Publishing Group UK 2023-11-04 /pmc/articles/PMC10625577/ /pubmed/37925519 http://dx.doi.org/10.1038/s43856-023-00397-4 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Artificial intelligence exceeds humans in epidemiological job coding
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10625577/
https://www.ncbi.nlm.nih.gov/pubmed/37925519
http://dx.doi.org/10.1038/s43856-023-00397-4