Methodology of aiQSAR: a group-specific approach to QSAR modelling
BACKGROUND: Several QSAR methodology developments have shown promise in recent years. These include the consensus approach to generate the final prediction of a model, utilizing new, advanced machine learning algorithms and streamlining, standardization and automation of various QSAR steps. One appr...
Main authors: | Vukovic, Kristijan; Gadaleta, Domenico; Benfenati, Emilio |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2019 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446381/ https://www.ncbi.nlm.nih.gov/pubmed/30945010 http://dx.doi.org/10.1186/s13321-019-0350-y |
_version_ | 1783408354033401856 |
---|---|
author | Vukovic, Kristijan Gadaleta, Domenico Benfenati, Emilio |
author_facet | Vukovic, Kristijan Gadaleta, Domenico Benfenati, Emilio |
author_sort | Vukovic, Kristijan |
collection | PubMed |
description | BACKGROUND: Several QSAR methodology developments have shown promise in recent years. These include the consensus approach to generate the final prediction of a model, utilizing new, advanced machine learning algorithms and streamlining, standardization and automation of various QSAR steps. One approach that seems under-explored is at-the-runtime generation of local models specific to individual compounds. This approach was quite likely limited by the computational requirements, but with current increases in processing power and the widespread availability of cluster-computing infrastructure, this limitation is no longer that severe. RESULTS: We propose a new QSAR methodology: aiQSAR, whose aim is to generate endpoint predictions directly from the input dataset by building an array of local models generated at-the-runtime and specific for each compound in the dataset. The local group of each compound is selected on the basis of fingerprint similarities and the final prediction is calculated by integrating the results of a number of autonomous mathematical models. The method is applicable to regression, binary classification and multi-class classification and was tested on one dataset for each endpoint type: bioconcentration factor (BCF) for regression, Ames test for binary classification and Environmental Protection Agency (EPA) acute rat oral toxicity ranking for multi-class classification. As part of this method, the applicability domain of each prediction is assessed through the applicability domain measure, calculated on the basis of the fingerprint similarities in each local group of compounds. CONCLUSIONS: We outline the methodology for a new QSAR-based predictive tool whose advantages are automation, group-specific approach to modelling and simplicity of execution. Our aim now will be to develop this method into a stand-alone software tool. We hope that eventual adoption of our tool would make QSAR modelling more accessible and transparent. Our methodology could be used as an initial modelling step, to predict new compounds by simply loading the training dataset as an input. Predictions could then be further evaluated and refined either by other tools or through optimization of aiQSAR parameters. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13321-019-0350-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6446381 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-6446381 2019-04-15 Methodology of aiQSAR: a group-specific approach to QSAR modelling Vukovic, Kristijan Gadaleta, Domenico Benfenati, Emilio J Cheminform Methodology BACKGROUND: Several QSAR methodology developments have shown promise in recent years. These include the consensus approach to generate the final prediction of a model, utilizing new, advanced machine learning algorithms and streamlining, standardization and automation of various QSAR steps. One approach that seems under-explored is at-the-runtime generation of local models specific to individual compounds. This approach was quite likely limited by the computational requirements, but with current increases in processing power and the widespread availability of cluster-computing infrastructure, this limitation is no longer that severe. RESULTS: We propose a new QSAR methodology: aiQSAR, whose aim is to generate endpoint predictions directly from the input dataset by building an array of local models generated at-the-runtime and specific for each compound in the dataset. The local group of each compound is selected on the basis of fingerprint similarities and the final prediction is calculated by integrating the results of a number of autonomous mathematical models. The method is applicable to regression, binary classification and multi-class classification and was tested on one dataset for each endpoint type: bioconcentration factor (BCF) for regression, Ames test for binary classification and Environmental Protection Agency (EPA) acute rat oral toxicity ranking for multi-class classification. As part of this method, the applicability domain of each prediction is assessed through the applicability domain measure, calculated on the basis of the fingerprint similarities in each local group of compounds. CONCLUSIONS: We outline the methodology for a new QSAR-based predictive tool whose advantages are automation, group-specific approach to modelling and simplicity of execution. Our aim now will be to develop this method into a stand-alone software tool. We hope that eventual adoption of our tool would make QSAR modelling more accessible and transparent. Our methodology could be used as an initial modelling step, to predict new compounds by simply loading the training dataset as an input. Predictions could then be further evaluated and refined either by other tools or through optimization of aiQSAR parameters. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13321-019-0350-y) contains supplementary material, which is available to authorized users. Springer International Publishing 2019-04-03 /pmc/articles/PMC6446381/ /pubmed/30945010 http://dx.doi.org/10.1186/s13321-019-0350-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Vukovic, Kristijan Gadaleta, Domenico Benfenati, Emilio Methodology of aiQSAR: a group-specific approach to QSAR modelling |
title | Methodology of aiQSAR: a group-specific approach to QSAR modelling |
title_full | Methodology of aiQSAR: a group-specific approach to QSAR modelling |
title_fullStr | Methodology of aiQSAR: a group-specific approach to QSAR modelling |
title_full_unstemmed | Methodology of aiQSAR: a group-specific approach to QSAR modelling |
title_short | Methodology of aiQSAR: a group-specific approach to QSAR modelling |
title_sort | methodology of aiqsar: a group-specific approach to qsar modelling |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446381/ https://www.ncbi.nlm.nih.gov/pubmed/30945010 http://dx.doi.org/10.1186/s13321-019-0350-y |
work_keys_str_mv | AT vukovickristijan methodologyofaiqsaragroupspecificapproachtoqsarmodelling AT gadaletadomenico methodologyofaiqsaragroupspecificapproachtoqsarmodelling AT benfenatiemilio methodologyofaiqsaragroupspecificapproachtoqsarmodelling |
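
Illustrative note: the description field above outlines three steps (selecting a local group for each compound by fingerprint similarity, integrating the results of several models into one prediction, and deriving an applicability domain measure from the similarities in the local group). The Python sketch below shows one way those steps could fit together for a regression endpoint. It is not the aiQSAR implementation. Everything in it is an assumption made for illustration: the use of Tanimoto similarity on sets of on-bit indices, the group size `k`, the two stand-in "models" (plain mean and similarity-weighted mean of neighbour values), the averaging used as consensus, and the mean similarity used as the applicability domain measure. The record does not specify any of these details.

```python
from statistics import mean


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def local_group(query_fp, training, k):
    """Return the k most similar training compounds as (similarity, value) pairs."""
    scored = [(tanimoto(query_fp, fp), y) for fp, y in training]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def predict(query_fp, training, k=3):
    """Build a local group for one query and return (consensus, adm).

    Assumption: the two "models" below are toy stand-ins for the autonomous
    mathematical models mentioned in the abstract, and the applicability
    domain measure (adm) is taken to be the mean similarity of the group.
    """
    group = local_group(query_fp, training, k)
    sims = [s for s, _ in group]
    values = [y for _, y in group]

    # Stand-in model 1: unweighted mean of the neighbours' endpoint values.
    unweighted = mean(values)

    # Stand-in model 2: similarity-weighted mean (falls back if all sims are 0).
    total = sum(sims)
    weighted = sum(s * y for s, y in group) / total if total > 0 else unweighted

    # Hypothetical consensus: simple average of the individual model outputs.
    consensus = mean([unweighted, weighted])

    # Hypothetical applicability domain measure: mean similarity in the group.
    adm = mean(sims)
    return consensus, adm


# Toy data: (fingerprint as a set of on-bit indices, endpoint value).
training = [
    ({1, 2, 3, 4}, 2.0),
    ({1, 2, 3, 5}, 3.0),
    ({1, 2, 6, 7}, 1.0),
    ({8, 9, 10}, 5.0),
]

prediction, adm = predict({1, 2, 3}, training, k=3)
print(f"prediction={prediction:.3f}, applicability domain measure={adm:.3f}")
```

A low value of the hypothetical `adm` would indicate that even the nearest training compounds are dissimilar to the query, so the prediction should be treated with more caution. In a real workflow the fingerprints would come from a cheminformatics toolkit and the stand-in models would be replaced by trained learners.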