
Novel feature selection methods for construction of accurate epigenetic clocks

Epigenetic clocks allow us to accurately predict the age and future health of individuals based on the methylation status of specific CpG sites in the genome and are a powerful tool to measure the effectiveness of longevity interventions. There is a growing need for methods to efficiently construct...

Bibliographic Details
Main authors: Li, Adam; Mueller, Amber; English, Brad; Arena, Anthony; Vera, Daniel; Kane, Alice E.; Sinclair, David A.
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9432708/
https://www.ncbi.nlm.nih.gov/pubmed/35984867
http://dx.doi.org/10.1371/journal.pcbi.1009938
_version_ 1784780445321265152
author Li, Adam
Mueller, Amber
English, Brad
Arena, Anthony
Vera, Daniel
Kane, Alice E.
Sinclair, David A.
author_sort Li, Adam
collection PubMed
description Epigenetic clocks allow us to accurately predict the age and future health of individuals based on the methylation status of specific CpG sites in the genome and are a powerful tool to measure the effectiveness of longevity interventions. There is a growing need for methods to efficiently construct epigenetic clocks. The most common approach is to create clocks using elastic net regression modelling of all measured CpG sites, without first identifying specific features or CpGs of interest. The addition of feature selection approaches provides the opportunity to optimise the identification of predictive CpG sites. Here, we apply novel feature selection methods and combinatorial approaches including newly adapted neural networks, genetic algorithms, and ‘chained’ combinations. Human whole blood methylation data of ~470,000 CpGs was used to develop clocks that predict age with R² correlation scores of greater than 0.73, the most predictive of which uses 35 CpG sites for an R² correlation score of 0.87. The five most frequent sites across all clocks were modelled to build a clock with an R² correlation score of 0.83. These two clocks are validated on two external datasets where they maintain excellent predictive accuracy. When compared with three published epigenetic clocks (Hannum, Horvath, Weidner) also applied to these validation datasets, our clocks outperformed all three models. We identified gene regulatory regions associated with selected CpGs as possible targets for future aging studies. Thus, our feature selection algorithms build accurate, generalizable clocks with a low number of CpG sites, providing important tools for the field.
format Online
Article
Text
id pubmed-9432708
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9432708 2022-09-01 Novel feature selection methods for construction of accurate epigenetic clocks Li, Adam Mueller, Amber English, Brad Arena, Anthony Vera, Daniel Kane, Alice E. Sinclair, David A. PLoS Comput Biol Research Article Epigenetic clocks allow us to accurately predict the age and future health of individuals based on the methylation status of specific CpG sites in the genome and are a powerful tool to measure the effectiveness of longevity interventions. There is a growing need for methods to efficiently construct epigenetic clocks. The most common approach is to create clocks using elastic net regression modelling of all measured CpG sites, without first identifying specific features or CpGs of interest. The addition of feature selection approaches provides the opportunity to optimise the identification of predictive CpG sites. Here, we apply novel feature selection methods and combinatorial approaches including newly adapted neural networks, genetic algorithms, and ‘chained’ combinations. Human whole blood methylation data of ~470,000 CpGs was used to develop clocks that predict age with R² correlation scores of greater than 0.73, the most predictive of which uses 35 CpG sites for an R² correlation score of 0.87. The five most frequent sites across all clocks were modelled to build a clock with an R² correlation score of 0.83. These two clocks are validated on two external datasets where they maintain excellent predictive accuracy. When compared with three published epigenetic clocks (Hannum, Horvath, Weidner) also applied to these validation datasets, our clocks outperformed all three models. We identified gene regulatory regions associated with selected CpGs as possible targets for future aging studies. Thus, our feature selection algorithms build accurate, generalizable clocks with a low number of CpG sites, providing important tools for the field.
Public Library of Science 2022-08-19 /pmc/articles/PMC9432708/ /pubmed/35984867 http://dx.doi.org/10.1371/journal.pcbi.1009938 Text en © 2022 Li et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Novel feature selection methods for construction of accurate epigenetic clocks
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9432708/
https://www.ncbi.nlm.nih.gov/pubmed/35984867
http://dx.doi.org/10.1371/journal.pcbi.1009938