
DNA Methylation Biomarkers-Based Human Age Prediction Using Machine Learning

PURPOSE: Age can be an important clue in uncovering the identity of persons who left biological evidence at crime scenes. With the availability of DNA methylation data, several age prediction models have been developed using statistical and machine learning methods. From epigenetic studies, it has bee...


Bibliographic Details
Main Authors: Zaguia, Atef; Pandey, Deepak; Painuly, Sandeep; Pal, Saurabh Kumar; Garg, Vivek Kumar; Goel, Neelam
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8803417/
https://www.ncbi.nlm.nih.gov/pubmed/35111213
http://dx.doi.org/10.1155/2022/8393498
author Zaguia, Atef
Pandey, Deepak
Painuly, Sandeep
Pal, Saurabh Kumar
Garg, Vivek Kumar
Goel, Neelam
collection PubMed
description PURPOSE: Age can be an important clue in uncovering the identity of persons who left biological evidence at crime scenes. With the availability of DNA methylation data, several age prediction models have been developed using statistical and machine learning methods. Epigenetic studies have demonstrated a close association between aging and DNA methylation. Most existing studies focused on healthy samples, whereas diseases may have a significant impact on human age. Therefore, in this article, an age prediction model is proposed using DNA methylation biomarkers for healthy and diseased samples. METHODS: The dataset contains 454 healthy samples and 400 diseased samples from publicly available sources, with ages from 1 to 89 years. Six CpG sites that are highly correlated with age are identified from these data using Pearson's correlation coefficient. The age prediction model is developed using four machine learning techniques: Multiple Linear Regression, Support Vector Regression, Gradient Boosting Regression, and Random Forest Regression. Separate models are designed for healthy and diseased data. The data are split randomly in an 80:20 ratio for training and testing, respectively. RESULTS: Among all the techniques, the model designed using Random Forest Regression shows the best performance, and Gradient Boosting Regression is the second-best model. For healthy samples, the model achieved a MAD of 2.51 years on training data and 4.85 years on testing data. For diseased samples, a MAD of 3.83 years is obtained for training and 9.53 years for testing. CONCLUSION: These results show that the proposed model can predict age for healthy and diseased samples.
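The workflow summarized in the METHODS section (Pearson-based selection of CpG sites, a random 80:20 split, and scoring by MAD) can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes that MAD denotes the mean absolute deviation between predicted and actual age, and every function name, site name, and data value here is hypothetical. The four regression models are omitted; in practice they would typically come from a library such as scikit-learn.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def top_correlated_sites(beta, ages, k=6):
    """Return the k site names whose methylation values have the largest |r| with age.

    beta maps a (hypothetical) CpG site name to its per-sample methylation values.
    """
    ranked = sorted(
        beta,
        key=lambda site: abs(pearson_r(beta[site], ages)),
        reverse=True,
    )
    return ranked[:k]


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Randomly split sample indices into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def mean_absolute_deviation(predicted, actual):
    """Assumed meaning of MAD: mean of |predicted age - actual age|."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    # Synthetic toy data, invented for illustration only.
    ages = [10, 20, 30, 40]
    beta = {
        "site_A": [0.1, 0.2, 0.3, 0.4],  # rises with age
        "site_B": [0.5, 0.5, 0.5, 0.5],  # constant
        "site_C": [0.4, 0.1, 0.3, 0.2],  # weakly related
    }
    print(top_correlated_sites(beta, ages, k=2))
    train_idx, test_idx = train_test_split_indices(454)
    print(len(train_idx), len(test_idx))
```

Ranking by the absolute value of r keeps sites whose methylation decreases with age as well as those that increase. The fixed seed only makes the random split reproducible.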
format Online
Article
Text
id pubmed-8803417
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8803417 2022-02-01 DNA Methylation Biomarkers-Based Human Age Prediction Using Machine Learning Zaguia, Atef Pandey, Deepak Painuly, Sandeep Pal, Saurabh Kumar Garg, Vivek Kumar Goel, Neelam Comput Intell Neurosci Research Article Hindawi 2022-01-24 /pmc/articles/PMC8803417/ /pubmed/35111213 http://dx.doi.org/10.1155/2022/8393498 Text en Copyright © 2022 Atef Zaguia et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title DNA Methylation Biomarkers-Based Human Age Prediction Using Machine Learning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8803417/
https://www.ncbi.nlm.nih.gov/pubmed/35111213
http://dx.doi.org/10.1155/2022/8393498