
An ensemble model of QSAR tools for regulatory risk assessment

Quantitative structure-activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used in chemical risk assessment to protect human and environmental health, which makes...


Bibliographic Details
Main authors: Pradeep, Prachi, Povinelli, Richard J., White, Shannon, Merrill, Stephen J.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5034616/
https://www.ncbi.nlm.nih.gov/pubmed/28316646
http://dx.doi.org/10.1186/s13321-016-0164-0
_version_ 1782455307161567232
author Pradeep, Prachi
Povinelli, Richard J.
White, Shannon
Merrill, Stephen J.
author_facet Pradeep, Prachi
Povinelli, Richard J.
White, Shannon
Merrill, Stephen J.
author_sort Pradeep, Prachi
collection PubMed
description Quantitative structure-activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used in chemical risk assessment to protect human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques from machine learning can be used to integrate predictions from multiple tools. Because the consensus prediction draws on several underlying QSAR algorithms and training datasets, it should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model exposes a cut-off parameter that can be varied to select the desired trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the Gold Carcinogenic Potency Database (480 chemicals).
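The ensemble idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each tool returns a binary call (1 = carcinogen, 0 = non-carcinogen), combines the calls with a naive Bayes rule that treats the tools as conditionally independent, and uses Laplace smoothing. The function names, the `alpha` parameter, and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Model = Tuple[Dict[int, float], Dict[int, List[float]]]


def fit_naive_bayes(X: Sequence[Sequence[int]], y: Sequence[int],
                    alpha: float = 1.0) -> Model:
    """Estimate P(class) and P(tool predicts 1 | class) with Laplace smoothing."""
    n_tools = len(X[0])
    prior: Dict[int, float] = {}
    likelihood: Dict[int, List[float]] = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior[c] = (len(rows) + alpha) / (len(X) + 2 * alpha)
        likelihood[c] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_tools)
        ]
    return prior, likelihood


def posterior_positive(x: Sequence[int], model: Model) -> float:
    """Posterior probability of the positive class given the tools' calls."""
    prior, likelihood = model
    joint = {}
    for c in (0, 1):
        p = prior[c]
        for j, call in enumerate(x):
            q = likelihood[c][j]
            p *= q if call == 1 else (1.0 - q)
        joint[c] = p
    return joint[1] / (joint[0] + joint[1])


def predict(x: Sequence[int], model: Model, cutoff: float = 0.5) -> int:
    """Label a chemical positive when the posterior reaches the cut-off."""
    return int(posterior_positive(x, model) >= cutoff)


# Toy data (invented for illustration): rows are chemicals, columns are four tools.
X_train = [(1, 1, 1, 0), (1, 1, 0, 1), (1, 0, 1, 1), (0, 1, 1, 1),
           (0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 0)]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]
model = fit_naive_bayes(X_train, y_train)

query = (1, 0, 0, 0)  # only one tool flags the chemical
print(round(posterior_positive(query, model), 3))
print(predict(query, model, cutoff=0.5), predict(query, model, cutoff=0.1))
```

Lowering `cutoff` labels more chemicals positive, which trades specificity for sensitivity; that is the kind of control over false negatives the cut-off parameter is meant to provide.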
Leave-one-out cross-validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8 % and 80.4 %, and balanced accuracy: 80.6 % and 80.8 %) and the highest inter-rater agreement [kappa (κ): 0.63 and 0.62] for both datasets. The ROC curves demonstrate the utility of the cut-off feature in the predictive ability of the ensemble model. This feature gives regulators additional control when grading a chemical based on the severity of the toxic endpoint under study. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13321-016-0164-0) contains supplementary material, which is available to authorized users.
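The performance measures named above (accuracy, balanced accuracy, and Cohen's kappa) can all be derived from a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name and the toy labels are hypothetical, and the numbers it produces are unrelated to the results reported in the article.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute confusion-matrix metrics for binary labels (1 = positive).

    Assumes both classes occur in y_true and that chance agreement is below 1;
    otherwise a division by zero would occur.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / n
    balanced_accuracy = (sensitivity + specificity) / 2

    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Toy labels (invented for illustration).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy averages sensitivity and specificity, so it is less affected by class imbalance than plain accuracy, which is one reason to report both.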
format Online
Article
Text
id pubmed-5034616
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-5034616 2017-03-17 An ensemble model of QSAR tools for regulatory risk assessment Pradeep, Prachi Povinelli, Richard J. White, Shannon Merrill, Stephen J. J Cheminform Research Article Springer International Publishing 2016-09-22 /pmc/articles/PMC5034616/ /pubmed/28316646 http://dx.doi.org/10.1186/s13321-016-0164-0 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Pradeep, Prachi
Povinelli, Richard J.
White, Shannon
Merrill, Stephen J.
An ensemble model of QSAR tools for regulatory risk assessment
title An ensemble model of QSAR tools for regulatory risk assessment
title_full An ensemble model of QSAR tools for regulatory risk assessment
title_fullStr An ensemble model of QSAR tools for regulatory risk assessment
title_full_unstemmed An ensemble model of QSAR tools for regulatory risk assessment
title_short An ensemble model of QSAR tools for regulatory risk assessment
title_sort ensemble model of qsar tools for regulatory risk assessment
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5034616/
https://www.ncbi.nlm.nih.gov/pubmed/28316646
http://dx.doi.org/10.1186/s13321-016-0164-0
work_keys_str_mv AT pradeepprachi anensemblemodelofqsartoolsforregulatoryriskassessment
AT povinellirichardj anensemblemodelofqsartoolsforregulatoryriskassessment
AT whiteshannon anensemblemodelofqsartoolsforregulatoryriskassessment
AT merrillstephenj anensemblemodelofqsartoolsforregulatoryriskassessment
AT pradeepprachi ensemblemodelofqsartoolsforregulatoryriskassessment
AT povinellirichardj ensemblemodelofqsartoolsforregulatoryriskassessment
AT whiteshannon ensemblemodelofqsartoolsforregulatoryriskassessment
AT merrillstephenj ensemblemodelofqsartoolsforregulatoryriskassessment