
Predicting toxicity of chemicals: software beats animal testing

We created earlier a large machine‐readable database of 10,000 chemicals and 800,000 associated studies by natural language processing of the public parts of Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) registrations until December 2014. This database was used to asse...


Bibliographic Details
Main Author: Hartung, Thomas
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7015478/
https://www.ncbi.nlm.nih.gov/pubmed/32626447
http://dx.doi.org/10.2903/j.efsa.2019.e170710
_version_ 1783496802527346688
author Hartung, Thomas
author_facet Hartung, Thomas
author_sort Hartung, Thomas
collection PubMed
description We created earlier a large machine‐readable database of 10,000 chemicals and 800,000 associated studies by natural language processing of the public parts of Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) registrations until December 2014. This database was used to assess the reproducibility of the six most frequently used Organisation for Economic Co‐operation and Development (OECD) guideline tests. These tests consume 55% of all animals in safety testing in Europe, i.e. about 600,000 animals. With 350–750 chemicals with multiple results per test, reproducibility (balanced accuracy) was 81% and 69% of toxic substances were found again in a repeat experiment (sensitivity 69%). Inspired by the increasingly used read‐across approach, we created a new type of QSAR, which is based on similarity of chemicals and not on chemical descriptors. A landscape of the chemical universe using 10 million structures was calculated, when based on Tanimoto indices similar chemicals are close and dissimilar chemicals far from each other. This allows placing any chemical of interest into the map and evaluating the information available for surrounding chemicals. In a data fusion approach, in which 74 different properties were taken into consideration, machine learning (random forest) allowed a fivefold cross‐validation for 190,000 (non‐) hazard labels of chemicals for which nine hazards were predicted. The balanced accuracy of this approach was 87% with a sensitivity of 89%. Each prediction comes with a certainty measure based on the homogeneity of data and distance of neighbours. Ongoing developments and future opportunities are discussed.
format Online
Article
Text
id pubmed-7015478
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-70154782020-07-02 Predicting toxicity of chemicals: software beats animal testing Hartung, Thomas EFSA J Advancing Risk Assessment Science We created earlier a large machine‐readable database of 10,000 chemicals and 800,000 associated studies by natural language processing of the public parts of Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) registrations until December 2014. This database was used to assess the reproducibility of the six most frequently used Organisation for Economic Co‐operation and Development (OECD) guideline tests. These tests consume 55% of all animals in safety testing in Europe, i.e. about 600,000 animals. With 350–750 chemicals with multiple results per test, reproducibility (balanced accuracy) was 81% and 69% of toxic substances were found again in a repeat experiment (sensitivity 69%). Inspired by the increasingly used read‐across approach, we created a new type of QSAR, which is based on similarity of chemicals and not on chemical descriptors. A landscape of the chemical universe using 10 million structures was calculated, when based on Tanimoto indices similar chemicals are close and dissimilar chemicals far from each other. This allows placing any chemical of interest into the map and evaluating the information available for surrounding chemicals. In a data fusion approach, in which 74 different properties were taken into consideration, machine learning (random forest) allowed a fivefold cross‐validation for 190,000 (non‐) hazard labels of chemicals for which nine hazards were predicted. The balanced accuracy of this approach was 87% with a sensitivity of 89%. Each prediction comes with a certainty measure based on the homogeneity of data and distance of neighbours. Ongoing developments and future opportunities are discussed. John Wiley and Sons Inc. 
2019-07-08 /pmc/articles/PMC7015478/ /pubmed/32626447 http://dx.doi.org/10.2903/j.efsa.2019.e170710 Text en © 2019 European Food Safety Authority. EFSA Journal published by John Wiley and Sons Ltd on behalf of European Food Safety Authority. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited and no modifications or adaptations are made.
spellingShingle Advancing Risk Assessment Science
Hartung, Thomas
Predicting toxicity of chemicals: software beats animal testing
title Predicting toxicity of chemicals: software beats animal testing
title_full Predicting toxicity of chemicals: software beats animal testing
title_fullStr Predicting toxicity of chemicals: software beats animal testing
title_full_unstemmed Predicting toxicity of chemicals: software beats animal testing
title_short Predicting toxicity of chemicals: software beats animal testing
title_sort predicting toxicity of chemicals: software beats animal testing
topic Advancing Risk Assessment Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7015478/
https://www.ncbi.nlm.nih.gov/pubmed/32626447
http://dx.doi.org/10.2903/j.efsa.2019.e170710
work_keys_str_mv AT hartungthomas predictingtoxicityofchemicalssoftwarebeatsanimaltesting
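
The description field above reports results as balanced accuracy and sensitivity and places chemicals on a map by Tanimoto index. As a minimal sketch of how these quantities are usually defined (not the article's implementation), the Python below computes a Tanimoto index for two fingerprints represented as sets of "on" bit positions, and derives sensitivity, specificity and balanced accuracy from confusion-matrix counts. The function names, the set-based fingerprint representation, the convention for two empty fingerprints, and all example values are assumptions made for illustration; none of them come from the record.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) index of two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 1.0  # assumed convention: two empty fingerprints count as identical
    shared = len(a & b)
    # shared bits divided by the size of the union
    return shared / (len(a) + len(b) - shared)


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly toxic substances that are flagged as toxic."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly non-toxic substances that are flagged as non-toxic."""
    return tn / (tn + fp)


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


if __name__ == "__main__":
    # Hypothetical fingerprints: 2 shared bits out of 6 distinct bits.
    print(tanimoto({1, 2, 3, 4}, {3, 4, 5, 6}))
    # Hypothetical counts: 3 of 4 toxic found, 1 of 2 non-toxic found.
    print(balanced_accuracy(tp=3, fn=1, tn=1, fp=1))
```

Balanced accuracy averages the two class-wise rates, so it is not inflated when non-toxic labels greatly outnumber toxic ones, which is one reason it is a common choice for hazard classification.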