Random forest classification for predicting lifespan-extending chemical compounds
Ageing is a major risk factor for many conditions including cancer, cardiovascular and neurodegenerative diseases. Pharmaceutical interventions that slow down ageing and delay the onset of age-related diseases are a growing research area. The aim of this study was to build a machine learning model b...
Main Authors: | Kapsiani, Sofia; Howlin, Brendan J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8257600/ https://www.ncbi.nlm.nih.gov/pubmed/34226569 http://dx.doi.org/10.1038/s41598-021-93070-6 |
_version_ | 1783718346350395392 |
---|---|
author | Kapsiani, Sofia; Howlin, Brendan J. |
author_facet | Kapsiani, Sofia; Howlin, Brendan J. |
author_sort | Kapsiani, Sofia |
collection | PubMed |
description | Ageing is a major risk factor for many conditions, including cancer, cardiovascular and neurodegenerative diseases. Pharmaceutical interventions that slow down ageing and delay the onset of age-related diseases are a growing research area. The aim of this study was to build a machine learning model, based on data from the DrugAge database, to predict whether a chemical compound will extend the lifespan of Caenorhabditis elegans. Five predictive models were built using the random forest algorithm with molecular fingerprints and/or molecular descriptors as features. The best-performing classifier, built using molecular descriptors, achieved an area under the curve (AUC) score of 0.815 for classifying the compounds in the test set. The features of the model were ranked using the Gini importance measure of the random forest algorithm. The top 30 features included descriptors related to atom and bond counts, and to topological and partial charge properties. The model was applied to predict the class of compounds in an external database consisting of 1738 small molecules. The compounds in the screening database with a predicted probability of ≥ 0.80 for increasing the lifespan of Caenorhabditis elegans were broadly separated into (1) flavonoids, (2) fatty acids and conjugates, and (3) organooxygen compounds. |
format | Online Article Text |
id | pubmed-8257600 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8257600 2021-07-06 Random forest classification for predicting lifespan-extending chemical compounds Kapsiani, Sofia; Howlin, Brendan J. Sci Rep Article Nature Publishing Group UK 2021-07-05 /pmc/articles/PMC8257600/ /pubmed/34226569 http://dx.doi.org/10.1038/s41598-021-93070-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Kapsiani, Sofia Howlin, Brendan J. Random forest classification for predicting lifespan-extending chemical compounds |
title | Random forest classification for predicting lifespan-extending chemical compounds |
title_full | Random forest classification for predicting lifespan-extending chemical compounds |
title_fullStr | Random forest classification for predicting lifespan-extending chemical compounds |
title_full_unstemmed | Random forest classification for predicting lifespan-extending chemical compounds |
title_short | Random forest classification for predicting lifespan-extending chemical compounds |
title_sort | random forest classification for predicting lifespan-extending chemical compounds |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8257600/ https://www.ncbi.nlm.nih.gov/pubmed/34226569 http://dx.doi.org/10.1038/s41598-021-93070-6 |
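
The abstract in this record names three quantities: an AUC score on a test set, a Gini-based feature ranking, and a probability cutoff of ≥ 0.80 for screening. The sketch below illustrates each one in plain Python on made-up numbers. It is not the authors' pipeline: the function names, the toy labels and scores, and the compound names are all hypothetical, and `gini_decrease` covers only a single split, whereas a random forest's Gini importance aggregates such decreases over every split and tree that uses a feature.

```python
"""Toy illustration of quantities named in the abstract (all data hypothetical)."""


def roc_auc(labels, scores):
    """AUC as the chance that a random positive scores above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini_impurity(labels):
    """Gini impurity of a set of binary labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def gini_decrease(values, labels, threshold):
    """Impurity decrease from splitting one feature at `threshold`.

    This is the per-split building block of Gini importance, not the
    forest-wide importance itself.
    """
    if not labels:
        raise ValueError("cannot split an empty node")
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted_children


def screen(predicted, cutoff=0.80):
    """Keep compounds whose predicted probability meets the cutoff."""
    return [name for name, prob in predicted.items() if prob >= cutoff]


# Hypothetical test-set labels (1 = lifespan-extending) and model scores.
LABELS = [1, 1, 0, 1, 0, 0]
SCORES = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# Hypothetical predicted probabilities for a screening library.
PREDICTED = {"compound_a": 0.91, "compound_b": 0.80, "compound_c": 0.79}

if __name__ == "__main__":
    print("AUC:", roc_auc(LABELS, SCORES))
    print("Gini decrease:", gini_decrease([1, 2, 3, 4], [0, 0, 1, 1], 2))
    print("Screening hits:", screen(PREDICTED))
```

The pairwise definition of AUC is quadratic in the number of compounds, which is acceptable for a toy example; a library implementation would normally be used on real data.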