
Chronological Age Prediction: Developmental Evaluation of DNA Methylation-Based Machine Learning Models

The epigenetic clock, a highly accurate age estimator based on DNA methylation (DNAm) level, is the basis for predicting mortality/morbidity and elucidating the molecular mechanism of aging, which is of great significance in forensics, justice, and social life. Herein, we integrated machine learning (ML...


Bibliographic Details
Main authors: Fan, Haoliang; Xie, Qiqian; Zhang, Zheng; Wang, Junhao; Chen, Xuncai; Qiu, Pingming
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Bioengineering and Biotechnology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8819006/
https://www.ncbi.nlm.nih.gov/pubmed/35141217
http://dx.doi.org/10.3389/fbioe.2021.819991
author Fan, Haoliang
Xie, Qiqian
Zhang, Zheng
Wang, Junhao
Chen, Xuncai
Qiu, Pingming
collection PubMed
description The epigenetic clock, a highly accurate age estimator based on DNA methylation (DNAm) levels, is the basis for predicting mortality/morbidity and elucidating the molecular mechanisms of aging, and is of great significance in forensics, justice, and social life. Herein, we integrated machine learning (ML) algorithms to construct a blood epigenetic clock for chronological age prediction in Southern Han Chinese (CHS). Correlation coefficient (r) meta-analyses of 7,084 individuals were first performed to select five genes (ELOVL2, C1orf132, TRIM59, FHL2, and KLF14) from a candidate set of nine age-associated DNAm biomarkers. DNAm profiles of the CHS cohort (240 blood samples from individuals aged 1 to 81 years) were generated by bisulfite targeted amplicon pyrosequencing (BTA-pseq) at 34 cytosine-phosphate-guanine sites (CpGs) in the five selected genes, revealing that methylation levels at different CpGs exhibit population specificity. Furthermore, we established and evaluated four chronological age prediction models using distinct ML algorithms: stepwise regression (SR), support vector regression (SVR-eps and SVR-nu), and random forest regression (RFR). The median absolute deviation (MAD) values increased with chronological age, especially in the 61–81 age category. No apparent gender effect was found in any of the ML models of the CHS cohort (all p > 0.05). The MAD values were 2.97, 2.22, 2.19, and 1.29 years for SR, SVR-eps, SVR-nu, and RFR in the CHS cohort, respectively. Finally, compared with the MAD range of the meta cohort (2.53–5.07 years), a promising RFR model (ntree = 500 and mtry = 8) was optimized, achieving an MAD of 1.15 years in the 1–60 age categories of the CHS cohort; it could be regarded as a robust epigenetic clock in blood for age-related issues.
format Online
Article
Text
id pubmed-8819006
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8819006 2022-02-08 Chronological Age Prediction: Developmental Evaluation of DNA Methylation-Based Machine Learning Models Fan, Haoliang Xie, Qiqian Zhang, Zheng Wang, Junhao Chen, Xuncai Qiu, Pingming Front Bioeng Biotechnol Bioengineering and Biotechnology Frontiers Media S.A. 2022-01-24 /pmc/articles/PMC8819006/ /pubmed/35141217 http://dx.doi.org/10.3389/fbioe.2021.819991 Text en Copyright © 2022 Fan, Xie, Zhang, Wang, Chen and Qiu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Chronological Age Prediction: Developmental Evaluation of DNA Methylation-Based Machine Learning Models
topic Bioengineering and Biotechnology
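The abstract in this record reports model error as a median absolute deviation (MAD), both overall and per age category. The following is a minimal sketch of how such a metric could be computed, assuming MAD means the median of the absolute differences between predicted and chronological age; the record does not give the authors' exact computation. The function names, the two age bins, and the toy ages are hypothetical and are not data from the study.

```python
from statistics import median

# Illustrative age categories (assumption): the record names only the
# 1-60 and 61-81 ranges, not the full binning used by the authors.
AGE_BINS = [(1, 60), (61, 81)]


def absolute_deviations(actual, predicted):
    """Return |predicted - actual| for each sample."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [abs(p - a) for a, p in zip(actual, predicted)]


def mad(actual, predicted):
    """Median of the absolute prediction errors, in years."""
    return median(absolute_deviations(actual, predicted))


def mad_by_age_bin(actual, predicted, bins=AGE_BINS):
    """Compute MAD separately for each inclusive (low, high) age bin."""
    result = {}
    for low, high in bins:
        pairs = [(a, p) for a, p in zip(actual, predicted) if low <= a <= high]
        if pairs:  # skip bins that contain no samples
            ages, preds = zip(*pairs)
            result[(low, high)] = mad(ages, preds)
    return result


# Made-up toy values, for illustration only.
actual = [5, 23, 41, 58, 66, 79]
predicted = [6.0, 21.5, 42.0, 60.5, 70.0, 73.0]

overall = mad(actual, predicted)
per_bin = mad_by_age_bin(actual, predicted)
print(overall, per_bin)
```

Reporting the metric per bin as well as overall makes an age-dependent increase in error visible, which a single pooled value would hide.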