
Methodology of aiQSAR: a group-specific approach to QSAR modelling

BACKGROUND: Several QSAR methodology developments have shown promise in recent years. These include the consensus approach to generate the final prediction of a model, utilizing new, advanced machine learning algorithms and streamlining, standardization and automation of various QSAR steps. One appr...

Bibliographic Details
Main Authors:	Vukovic, Kristijan; Gadaleta, Domenico; Benfenati, Emilio
Format:	Online Article Text
Language:	English
Published:	Springer International Publishing 2019
Subjects:	Methodology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446381/ https://www.ncbi.nlm.nih.gov/pubmed/30945010 http://dx.doi.org/10.1186/s13321-019-0350-y

author	Vukovic, Kristijan Gadaleta, Domenico Benfenati, Emilio
author_facet	Vukovic, Kristijan Gadaleta, Domenico Benfenati, Emilio
author_sort	Vukovic, Kristijan
collection	PubMed
description	BACKGROUND: Several QSAR methodology developments have shown promise in recent years. These include the consensus approach to generate the final prediction of a model, utilizing new, advanced machine learning algorithms and streamlining, standardization and automation of various QSAR steps. One approach that seems under-explored is at-the-runtime generation of local models specific to individual compounds. This approach was quite likely limited by the computational requirements, but with current increases in processing power and the widespread availability of cluster-computing infrastructure, this limitation is no longer that severe. RESULTS: We propose a new QSAR methodology: aiQSAR, whose aim is to generate endpoint predictions directly from the input dataset by building an array of local models generated at-the-runtime and specific for each compound in the dataset. The local group of each compound is selected on the basis of fingerprint similarities and the final prediction is calculated by integrating the results of a number of autonomous mathematical models. The method is applicable to regression, binary classification and multi-class classification and was tested on one dataset for each endpoint type: bioconcentration factor (BCF) for regression, Ames test for binary classification and Environmental Protection Agency (EPA) acute rat oral toxicity ranking for multi-class classification. As part of this method, the applicability domain of each prediction is assessed through the applicability domain measure, calculated on the basis of the fingerprint similarities in each local group of compounds. CONCLUSIONS: We outline the methodology for a new QSAR-based predictive tool whose advantages are automation, group-specific approach to modelling and simplicity of execution. Our aim now will be to develop this method into a stand-alone software tool. We hope that eventual adoption of our tool would make QSAR modelling more accessible and transparent. 
Our methodology could be used as an initial modelling step, to predict new compounds by simply loading the training dataset as an input. Predictions could then be further evaluated and refined either by other tools or through optimization of aiQSAR parameters. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13321-019-0350-y) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-6446381
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Springer International Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-6446381 2019-04-15 Methodology of aiQSAR: a group-specific approach to QSAR modelling Vukovic, Kristijan Gadaleta, Domenico Benfenati, Emilio J Cheminform Methodology Springer International Publishing 2019-04-03 /pmc/articles/PMC6446381/ /pubmed/30945010 http://dx.doi.org/10.1186/s13321-019-0350-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Methodology Vukovic, Kristijan Gadaleta, Domenico Benfenati, Emilio Methodology of aiQSAR: a group-specific approach to QSAR modelling
title	Methodology of aiQSAR: a group-specific approach to QSAR modelling
topic	Methodology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446381/ https://www.ncbi.nlm.nih.gov/pubmed/30945010 http://dx.doi.org/10.1186/s13321-019-0350-y
work_keys_str_mv	AT vukovickristijan methodologyofaiqsaragroupspecificapproachtoqsarmodelling AT gadaletadomenico methodologyofaiqsaragroupspecificapproachtoqsarmodelling AT benfenatiemilio methodologyofaiqsaragroupspecificapproachtoqsarmodelling
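
The workflow summarized in the description field (select a local group of similar compounds by fingerprint similarity, combine the outputs of several models into one prediction, and score the applicability domain from the similarities within the group) can be illustrated with a minimal Python sketch for the regression case. Everything specific below is an assumption made for illustration, because this record does not give those details: fingerprints are represented as sets of on-bits, similarity is Tanimoto, the two "models" are simple averages standing in for the paper's machine learning models, the consensus is an unweighted mean, and the applicability domain score is the mean similarity of the local group. All function names and the toy data are hypothetical.

```python
from statistics import mean


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints stored as sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def local_group(query_fp, training, k=3):
    """Return the k training compounds most similar to the query
    as (similarity, endpoint_value) pairs, most similar first."""
    scored = [(tanimoto(query_fp, fp), y) for fp, y in training]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Two deliberately simple stand-in "local models" (assumptions, not the paper's models).
def unweighted_mean_model(group):
    return mean(y for _, y in group)


def similarity_weighted_model(group):
    total = sum(s for s, _ in group)
    if total == 0:
        return unweighted_mean_model(group)
    return sum(s * y for s, y in group) / total


def predict(query_fp, training, k=3):
    """Build the local group at run time, then return (consensus, ad_score)."""
    group = local_group(query_fp, training, k)
    predictions = [model(group) for model in (unweighted_mean_model,
                                              similarity_weighted_model)]
    consensus = mean(predictions)          # assumed: plain average of model outputs
    ad_score = mean(s for s, _ in group)   # assumed: mean similarity of the local group
    return consensus, ad_score


# Hypothetical toy training set: (fingerprint, regression endpoint value).
TOY_TRAINING = [
    (frozenset({1, 2, 3, 4}), 2.0),
    (frozenset({1, 2, 3, 5}), 2.4),
    (frozenset({1, 2, 6, 7}), 3.0),
    (frozenset({8, 9, 10}), 0.5),
]

if __name__ == "__main__":
    value, ad = predict(frozenset({1, 2, 3}), TOY_TRAINING, k=3)
    print(f"consensus prediction: {value:.3f}, AD score: {ad:.3f}")
```

Because the local group is rebuilt for every query compound, the cost grows with the number of predictions, which is the computational limitation the abstract mentions. A low score from the placeholder applicability domain measure simply signals that the nearest training compounds are not very similar to the query.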
