
Identifying longevity associated genes by integrating gene expression and curated annotations

Aging is a complex process with poorly understood genetic mechanisms. Recent studies have sought to classify genes as pro-longevity or anti-longevity using a variety of machine learning algorithms. However, it is not clear which types of features are best for optimizing classification performance an...


Bibliographic Details
Main Authors: Townes, F. William; Carr, Kareem; Miller, Jeffrey W.
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7728194/
https://www.ncbi.nlm.nih.gov/pubmed/33253142
http://dx.doi.org/10.1371/journal.pcbi.1008429
author Townes, F. William
Carr, Kareem
Miller, Jeffrey W.
collection PubMed
description Aging is a complex process with poorly understood genetic mechanisms. Recent studies have sought to classify genes as pro-longevity or anti-longevity using a variety of machine learning algorithms. However, it is not clear which types of features are best for optimizing classification performance and which algorithms are best suited to this task. Further, performance assessments based on held-out test data are lacking. We systematically compare five popular classification algorithms using gene ontology and gene expression datasets as features to predict the pro-longevity versus anti-longevity status of genes for two model organisms (C. elegans and S. cerevisiae) using the GenAge database as ground truth. We find that elastic net penalized logistic regression performs particularly well at this task. Using elastic net, we make novel predictions of pro- and anti-longevity genes that are not currently in the GenAge database.
format Online
Article
Text
id pubmed-7728194
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7728194; 2020-12-16; PLoS Comput Biol; Research Article; Public Library of Science; 2020-11-30; /pmc/articles/PMC7728194/; /pubmed/33253142; http://dx.doi.org/10.1371/journal.pcbi.1008429; Text; en; © 2020 Townes et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Identifying longevity associated genes by integrating gene expression and curated annotations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7728194/
https://www.ncbi.nlm.nih.gov/pubmed/33253142
http://dx.doi.org/10.1371/journal.pcbi.1008429