Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets
The screening of compounds for ADME-Tox targets plays an important role in drug design. QSPR models can increase the speed of these specific tasks, although the performance of the models highly depends on several factors, such as the applied molecular descriptors. In this study, a detailed compariso...
Main authors: | Orosz, Álmos; Héberger, Károly; Rácz, Anita |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Chemistry |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9214226/ https://www.ncbi.nlm.nih.gov/pubmed/35755260 http://dx.doi.org/10.3389/fchem.2022.852893 |
_version_ | 1784730968282628096 |
---|---|
author | Orosz, Álmos Héberger, Károly Rácz, Anita |
author_facet | Orosz, Álmos Héberger, Károly Rácz, Anita |
author_sort | Orosz, Álmos |
collection | PubMed |
description | The screening of compounds for ADME-Tox targets plays an important role in drug design. QSPR models can increase the speed of these specific tasks, although the performance of the models highly depends on several factors, such as the applied molecular descriptors. In this study, a detailed comparison of the most popular descriptor groups has been carried out for six main ADME-Tox classification targets: Ames mutagenicity, P-glycoprotein inhibition, hERG inhibition, hepatotoxicity, blood–brain-barrier permeability, and cytochrome P450 2C9 inhibition. The literature-based, medium-sized binary classification datasets (all above 1,000 molecules) were used for the model building by two common algorithms, XGBoost and the RPropMLP neural network. Five molecular representation sets were compared along with their joint applications: Morgan, Atompairs, and MACCS fingerprints, and the traditional 1D and 2D molecular descriptors, as well as 3D molecular descriptors, separately. The statistical evaluation of the model performances was based on 18 different performance parameters. Although all the developed models were close to the usual performance of QSPR models for each specific ADME-Tox target, the results clearly showed the superiority of the traditional 1D, 2D, and 3D descriptors in the case of the XGBoost algorithm. It is worth trying the classical tools in single model building because the use of 2D descriptors can produce even better models for almost every dataset than the combination of all the examined descriptor sets. |
format | Online Article Text |
id | pubmed-9214226 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-92142262022-06-23 Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets Orosz, Álmos Héberger, Károly Rácz, Anita Front Chem Chemistry The screening of compounds for ADME-Tox targets plays an important role in drug design. QSPR models can increase the speed of these specific tasks, although the performance of the models highly depends on several factors, such as the applied molecular descriptors. In this study, a detailed comparison of the most popular descriptor groups has been carried out for six main ADME-Tox classification targets: Ames mutagenicity, P-glycoprotein inhibition, hERG inhibition, hepatotoxicity, blood–brain-barrier permeability, and cytochrome P450 2C9 inhibition. The literature-based, medium-sized binary classification datasets (all above 1,000 molecules) were used for the model building by two common algorithms, XGBoost and the RPropMLP neural network. Five molecular representation sets were compared along with their joint applications: Morgan, Atompairs, and MACCS fingerprints, and the traditional 1D and 2D molecular descriptors, as well as 3D molecular descriptors, separately. The statistical evaluation of the model performances was based on 18 different performance parameters. Although all the developed models were close to the usual performance of QSPR models for each specific ADME-Tox target, the results clearly showed the superiority of the traditional 1D, 2D, and 3D descriptors in the case of the XGBoost algorithm. It is worth trying the classical tools in single model building because the use of 2D descriptors can produce even better models for almost every dataset than the combination of all the examined descriptor sets. Frontiers Media S.A. 2022-06-08 /pmc/articles/PMC9214226/ /pubmed/35755260 http://dx.doi.org/10.3389/fchem.2022.852893 Text en Copyright © 2022 Orosz, Héberger and Rácz. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Chemistry Orosz, Álmos Héberger, Károly Rácz, Anita Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets |
title | Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets |
title_full | Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets |
title_fullStr | Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets |
title_full_unstemmed | Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets |
title_short | Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets |
title_sort | comparison of descriptor- and fingerprint sets in machine learning models for adme-tox targets |
topic | Chemistry |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9214226/ https://www.ncbi.nlm.nih.gov/pubmed/35755260 http://dx.doi.org/10.3389/fchem.2022.852893 |
work_keys_str_mv | AT oroszalmos comparisonofdescriptorandfingerprintsetsinmachinelearningmodelsforadmetoxtargets AT hebergerkaroly comparisonofdescriptorandfingerprintsetsinmachinelearningmodelsforadmetoxtargets AT raczanita comparisonofdescriptorandfingerprintsetsinmachinelearningmodelsforadmetoxtargets |