Prioritizing cancer hazard assessments for IARC Monographs using an integrated approach of database fusion and text mining

BACKGROUND: Systematic evaluation of literature data on the cancer hazards of human exposures is an essential process underlying cancer prevention strategies. The scope and volume of evidence for suspected carcinogens can range from very few to thousands of publications, requiring a complex, systema...

Bibliographic Details
Main authors: Barupal, Dinesh Kumar; Schubauer-Berigan, Mary K.; Korenjak, Michael; Zavadil, Jiri; Guyton, Kathryn Z.
Format: Online Article Text
Language: English
Published: 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8380673/
https://www.ncbi.nlm.nih.gov/pubmed/33984576
http://dx.doi.org/10.1016/j.envint.2021.106624
author Barupal, Dinesh Kumar
Schubauer-Berigan, Mary K.
Korenjak, Michael
Zavadil, Jiri
Guyton, Kathryn Z.
collection PubMed
description BACKGROUND: Systematic evaluation of literature data on the cancer hazards of human exposures is an essential process underlying cancer prevention strategies. The scope and volume of evidence for suspected carcinogens can range from very few to thousands of publications, requiring a complex, systematically planned, and critical procedure to nominate, prioritize, and evaluate carcinogenic agents. To aid in this process, database fusion, cheminformatics, and text mining techniques can be combined into an integrated approach to inform agent prioritization, selection, and grouping. RESULTS: We have applied these techniques to agents recommended for the IARC Monographs evaluations during 2020–2024. An approach integrating PubMed filters to cover cancer epidemiology, key characteristics of carcinogens, chemical lists from 34 databases relevant for cancer research, chemical structure grouping, and literature-based clustering was applied to 119 agents recommended by an advisory group for future IARC Monographs evaluations. The approach also facilitated a rational grouping of these agents and aided in understanding the volume and complexity of relevant information, as well as important gaps in coverage of the available studies on cancer etiology and carcinogenesis. CONCLUSION: A new data-science approach has been applied to diverse agents recommended for cancer hazard assessments, and its applications for the IARC Monographs are demonstrated. The prioritization approach has been made available at the www.cancer.idsl.me site for ranking cancer agents.
format Online
Article
Text
id pubmed-8380673
institution National Center for Biotechnology Information
language English
publishDate 2021
record_format MEDLINE/PubMed
spelling pubmed-8380673 2021-11-01 Environ Int Article
2021-05-10 2021-11 /pmc/articles/PMC8380673/ /pubmed/33984576 http://dx.doi.org/10.1016/j.envint.2021.106624 Text en
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Prioritizing cancer hazard assessments for IARC Monographs using an integrated approach of database fusion and text mining
topic Article