Identifying longevity associated genes by integrating gene expression and curated annotations
Aging is a complex process with poorly understood genetic mechanisms. Recent studies have sought to classify genes as pro-longevity or anti-longevity using a variety of machine learning algorithms. However, it is not clear which types of features are best for optimizing classification performance an...
Main Authors: | Townes, F. William; Carr, Kareem; Miller, Jeffrey W. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7728194/ https://www.ncbi.nlm.nih.gov/pubmed/33253142 http://dx.doi.org/10.1371/journal.pcbi.1008429 |
_version_ | 1783621221483544576 |
---|---|
author | Townes, F. William Carr, Kareem Miller, Jeffrey W. |
author_facet | Townes, F. William Carr, Kareem Miller, Jeffrey W. |
author_sort | Townes, F. William |
collection | PubMed |
description | Aging is a complex process with poorly understood genetic mechanisms. Recent studies have sought to classify genes as pro-longevity or anti-longevity using a variety of machine learning algorithms. However, it is not clear which types of features are best for optimizing classification performance and which algorithms are best suited to this task. Further, performance assessments based on held-out test data are lacking. We systematically compare five popular classification algorithms using gene ontology and gene expression datasets as features to predict the pro-longevity versus anti-longevity status of genes for two model organisms (C. elegans and S. cerevisiae) using the GenAge database as ground truth. We find that elastic net penalized logistic regression performs particularly well at this task. Using elastic net, we make novel predictions of pro- and anti-longevity genes that are not currently in the GenAge database. |
format | Online Article Text |
id | pubmed-7728194 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-77281942020-12-16 Identifying longevity associated genes by integrating gene expression and curated annotations Townes, F. William Carr, Kareem Miller, Jeffrey W. PLoS Comput Biol Research Article Aging is a complex process with poorly understood genetic mechanisms. Recent studies have sought to classify genes as pro-longevity or anti-longevity using a variety of machine learning algorithms. However, it is not clear which types of features are best for optimizing classification performance and which algorithms are best suited to this task. Further, performance assessments based on held-out test data are lacking. We systematically compare five popular classification algorithms using gene ontology and gene expression datasets as features to predict the pro-longevity versus anti-longevity status of genes for two model organisms (C. elegans and S. cerevisiae) using the GenAge database as ground truth. We find that elastic net penalized logistic regression performs particularly well at this task. Using elastic net, we make novel predictions of pro- and anti-longevity genes that are not currently in the GenAge database. Public Library of Science 2020-11-30 /pmc/articles/PMC7728194/ /pubmed/33253142 http://dx.doi.org/10.1371/journal.pcbi.1008429 Text en © 2020 Townes et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Townes, F. William Carr, Kareem Miller, Jeffrey W. Identifying longevity associated genes by integrating gene expression and curated annotations |
title | Identifying longevity associated genes by integrating gene expression and curated annotations |
title_full | Identifying longevity associated genes by integrating gene expression and curated annotations |
title_fullStr | Identifying longevity associated genes by integrating gene expression and curated annotations |
title_full_unstemmed | Identifying longevity associated genes by integrating gene expression and curated annotations |
title_short | Identifying longevity associated genes by integrating gene expression and curated annotations |
title_sort | identifying longevity associated genes by integrating gene expression and curated annotations |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7728194/ https://www.ncbi.nlm.nih.gov/pubmed/33253142 http://dx.doi.org/10.1371/journal.pcbi.1008429 |
work_keys_str_mv | AT townesfwilliam identifyinglongevityassociatedgenesbyintegratinggeneexpressionandcuratedannotations AT carrkareem identifyinglongevityassociatedgenesbyintegratinggeneexpressionandcuratedannotations AT millerjeffreyw identifyinglongevityassociatedgenesbyintegratinggeneexpressionandcuratedannotations |