Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets

The screening of compounds for ADME-Tox targets plays an important role in drug design. QSPR models can increase the speed of these specific tasks, although the performance of the models highly depends on several factors, such as the applied molecular descriptors. In this study, a detailed compariso...

Bibliographic Details
Main authors: Orosz, Álmos, Héberger, Károly, Rácz, Anita
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9214226/
https://www.ncbi.nlm.nih.gov/pubmed/35755260
http://dx.doi.org/10.3389/fchem.2022.852893
author Orosz, Álmos
Héberger, Károly
Rácz, Anita
collection PubMed
description The screening of compounds for ADME-Tox targets plays an important role in drug design. QSPR models can increase the speed of these specific tasks, although the performance of the models highly depends on several factors, such as the applied molecular descriptors. In this study, a detailed comparison of the most popular descriptor groups has been carried out for six main ADME-Tox classification targets: Ames mutagenicity, P-glycoprotein inhibition, hERG inhibition, hepatotoxicity, blood–brain-barrier permeability, and cytochrome P450 2C9 inhibition. The literature-based, medium-sized binary classification datasets (all above 1,000 molecules) were used for the model building by two common algorithms, XGBoost and the RPropMLP neural network. Five molecular representation sets were compared along with their joint applications: Morgan, Atompairs, and MACCS fingerprints, and the traditional 1D and 2D molecular descriptors, as well as 3D molecular descriptors, separately. The statistical evaluation of the model performances was based on 18 different performance parameters. Although all the developed models were close to the usual performance of QSPR models for each specific ADME-Tox target, the results clearly showed the superiority of the traditional 1D, 2D, and 3D descriptors in the case of the XGBoost algorithm. It is worth trying the classical tools in single model building because the use of 2D descriptors can produce even better models for almost every dataset than the combination of all the examined descriptor sets.
format Online
Article
Text
id pubmed-9214226
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9214226 2022-06-23 Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets. Orosz, Álmos; Héberger, Károly; Rácz, Anita. Front Chem (Chemistry). Frontiers Media S.A. 2022-06-08 /pmc/articles/PMC9214226/ /pubmed/35755260 http://dx.doi.org/10.3389/fchem.2022.852893 Text en Copyright © 2022 Orosz, Héberger and Rácz.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets
topic Chemistry