
Intercorrelation Limits in Molecular Descriptor Preselection for QSAR/QSPR

QSAR/QSPR (quantitative structure‐activity/property relationship) modeling has been a prevalent approach in various, overlapping sub‐fields of computational, medicinal and environmental chemistry for decades. The generation and selection of molecular descriptors is an essential part of this process....


Bibliographic Details
Main authors: Rácz, Anita; Bajusz, Dávid; Héberger, Károly
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6767540/
https://www.ncbi.nlm.nih.gov/pubmed/30945814
http://dx.doi.org/10.1002/minf.201800154
author Rácz, Anita
Bajusz, Dávid
Héberger, Károly
collection PubMed
description QSAR/QSPR (quantitative structure‐activity/property relationship) modeling has been a prevalent approach in various, overlapping sub‐fields of computational, medicinal and environmental chemistry for decades. The generation and selection of molecular descriptors is an essential part of this process. In typical QSAR workflows, the starting pool of molecular descriptors is rationalized based on filtering out descriptors which are (i) constant throughout the whole dataset, or (ii) very strongly correlated to another descriptor. While the former is fairly straightforward, the latter involves a level of subjectivity when deciding what exactly is considered to be a strong correlation. Despite that, most QSAR modeling studies do not report on this step. In this study, we examine in detail the effect of various possible descriptor intercorrelation limits on the resulting QSAR models. Statistical comparisons are carried out based on four case studies from contemporary QSAR literature, using a combined methodology based on sum of ranking differences (SRD) and analysis of variance (ANOVA).
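The two preselection filters named in the description can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `preselect_descriptors`, the use of the absolute Pearson correlation, the greedy "keep the first descriptor of a correlated pair" rule, and the example limit of 0.95 are all assumptions made for illustration, and the small data matrix is invented.

```python
import numpy as np


def preselect_descriptors(X, names, corr_limit=0.95, tol=1e-12):
    """Return the names of descriptors that survive both filters.

    X          : 2-D array, rows = molecules, columns = descriptors
    names      : descriptor names, one per column of X
    corr_limit : hypothetical intercorrelation limit (absolute Pearson r)
    tol        : range below which a descriptor is treated as constant
    """
    X = np.asarray(X, dtype=float)

    # (i) Drop descriptors that are constant throughout the whole dataset.
    varying = [j for j in range(X.shape[1])
               if X[:, j].max() - X[:, j].min() > tol]
    if not varying:
        return []

    # (ii) Absolute correlation matrix of the remaining descriptors.
    corr = np.atleast_2d(np.abs(np.corrcoef(X[:, varying], rowvar=False)))

    # Greedy pass (an assumption): keep a descriptor only if it does not
    # exceed the limit against any descriptor that was already kept.
    kept = []
    for pos in range(len(varying)):
        if all(corr[pos, k] <= corr_limit for k in kept):
            kept.append(pos)

    return [names[varying[pos]] for pos in kept]


# Invented toy data: d2 is almost a multiple of d1, d_const never changes.
X_demo = np.array([
    [1.0, 2.0, 5.0, 0.3],
    [2.0, 4.1, 5.0, 0.9],
    [3.0, 6.0, 5.0, 0.1],
    [4.0, 8.2, 5.0, 0.7],
    [5.0, 9.9, 5.0, 0.4],
])
names_demo = ["d1", "d2", "d_const", "d3"]

strict = preselect_descriptors(X_demo, names_demo, corr_limit=0.95)
loose = preselect_descriptors(X_demo, names_demo, corr_limit=0.9999)
```

Running the same data with a stricter and a looser limit shows how the choice of limit changes the descriptor pool passed on to modeling, which is the subjectivity the description refers to.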
format Online
Article
Text
id pubmed-6767540
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6767540 2019-10-03 Intercorrelation Limits in Molecular Descriptor Preselection for QSAR/QSPR Rácz, Anita Bajusz, Dávid Héberger, Károly Mol Inform Full Papers John Wiley and Sons Inc. 2019-04-04 2019-08 /pmc/articles/PMC6767540/ /pubmed/30945814 http://dx.doi.org/10.1002/minf.201800154 Text en © 2019 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Intercorrelation Limits in Molecular Descriptor Preselection for QSAR/QSPR
topic Full Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6767540/
https://www.ncbi.nlm.nih.gov/pubmed/30945814
http://dx.doi.org/10.1002/minf.201800154