How Precise Are Our Quantitative Structure–Activity Relationship Derived Predictions for New Query Chemicals?

Quantitative structure–activity relationship (QSAR) models have long been used for making predictions and data gap filling in diverse fields including medicinal chemistry, predictive toxicology, environmental fate modeling, materials science, agricultural science, nanoscience, food...

Bibliographic Details
Main Authors:	Roy, Kunal; Ambure, Pravin; Kar, Supratik
Format:	Online Article Text
Language:	English
Published:	American Chemical Society 2018
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6645132/ https://www.ncbi.nlm.nih.gov/pubmed/31459245 http://dx.doi.org/10.1021/acsomega.8b01647

author	Roy, Kunal; Ambure, Pravin; Kar, Supratik
collection	PubMed
description	Quantitative structure–activity relationship (QSAR) models have long been used for making predictions and data gap filling in diverse fields including medicinal chemistry, predictive toxicology, environmental fate modeling, materials science, agricultural science, nanoscience, food science, and so forth. Usually, a QSAR model is developed based on chemical information of a properly designed training set and corresponding experimental response data, while the model is validated using one or more test set(s) for which the experimental response data are available. However, it is interesting to estimate the reliability of predictions when the model is applied to a completely new data set (true external set), even when the new data points are within the applicability domain (AD) of the developed model. In the present study, we have categorized the quality of predictions for the test set or true external set into three groups (good, moderate, and bad) based on absolute prediction errors. Then, we have used three criteria [(a) mean absolute error of leave-one-out predictions for the 10 closest training compounds for each query molecule; (b) AD in terms of similarity based on the standardization approach; and (c) proximity of the predicted value of the query compound to the mean training response] in different weighting schemes for making a composite score of predictions. It was found that using the most frequently appearing weighting scheme 0.5–0–0.5, the composite score-based categorization showed concordance with absolute prediction error-based categorization for more than 80% of test data points while working with 5 different datasets with 15 models for each set derived in three different splitting techniques. These observations were also confirmed with true external sets for another four endpoints, suggesting applicability of the scheme to judge the reliability of predictions for new datasets. The scheme has been implemented in a tool “Prediction Reliability Indicator” available at http://dtclab.webs.com/software-tools and http://teqip.jdvu.ac.in/QSAR_Tools/DTCLab/, and the tool is presently valid for multiple linear regression models only.
format	Online Article Text
id	pubmed-6645132
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	American Chemical Society
record_format	MEDLINE/PubMed
spelling	pubmed-6645132 2019-08-27 How Precise Are Our Quantitative Structure–Activity Relationship Derived Predictions for New Query Chemicals? Roy, Kunal; Ambure, Pravin; Kar, Supratik. ACS Omega. American Chemical Society 2018-09-19 /pmc/articles/PMC6645132/ /pubmed/31459245 http://dx.doi.org/10.1021/acsomega.8b01647 Text en Copyright © 2018 American Chemical Society. This is an open access article published under a Creative Commons Attribution (CC-BY) License (http://pubs.acs.org/page/policy/authorchoice_ccby_termsofuse.html), which permits unrestricted use, distribution and reproduction in any medium, provided the author and source are cited.
title	How Precise Are Our Quantitative Structure–Activity Relationship Derived Predictions for New Query Chemicals?
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6645132/ https://www.ncbi.nlm.nih.gov/pubmed/31459245 http://dx.doi.org/10.1021/acsomega.8b01647
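
The abstract in this record describes combining three criteria under a weighting scheme (0.5–0–0.5) into a composite score, sorting predictions into good, moderate, and bad, and then checking concordance against the error-based categories. The record gives no formulas, so the following Python sketch only illustrates that flow under stated assumptions: the per-criterion scale (3 = good, 2 = moderate, 1 = bad), the cutoffs 2.5 and 1.5, and all function names are hypothetical and are not taken from the paper or from the Prediction Reliability Indicator tool.

```python
from typing import Sequence

# Weighting scheme named in the abstract, applied to criteria (a), (b), (c).
WEIGHTS = (0.5, 0.0, 0.5)


def composite_score(criterion_scores: Sequence[float],
                    weights: Sequence[float] = WEIGHTS) -> float:
    """Weighted sum of per-criterion scores.

    Assumption: each criterion has already been mapped to a numeric score
    (here 3 = good, 2 = moderate, 1 = bad). This scale is hypothetical.
    """
    if len(criterion_scores) != len(weights):
        raise ValueError("one score per criterion is required")
    return sum(w * s for w, s in zip(weights, criterion_scores))


def categorize(score: float,
               good_cutoff: float = 2.5,
               moderate_cutoff: float = 1.5) -> str:
    """Map a composite score to a category using hypothetical cutoffs."""
    if score >= good_cutoff:
        return "good"
    if score >= moderate_cutoff:
        return "moderate"
    return "bad"


def concordance(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of data points whose two category labels agree."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)


if __name__ == "__main__":
    # Made-up scores for three query compounds: (criterion a, b, c).
    queries = [(3, 1, 3), (3, 3, 1), (1, 3, 1)]
    score_based = [categorize(composite_score(q)) for q in queries]
    # Made-up categories that would come from absolute prediction errors.
    error_based = ["good", "moderate", "moderate"]
    print(score_based, concordance(score_based, error_based))
```

With the weight on criterion (b) set to zero, as in the 0.5–0–0.5 scheme, the similarity-based score does not affect the composite value in this sketch; other weighting schemes can be tried by passing a different `weights` tuple.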

Similar Items