DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing

The ability to estimate the age of the donor from recovered biological material at a crime scene can be of substantial value in forensic investigations. Aging can be complex and is associated with various molecular modifications in cells that accumulate over a person’s lifetime including epigenetic...

Bibliographic Details
Main authors: Vidaki, Athina, Ballard, David, Aliferi, Anastasia, Miller, Thomas H., Barron, Leon P., Syndercombe Court, Denise
Format: Online Article Text
Language: English
Published: Elsevier 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5392537/
https://www.ncbi.nlm.nih.gov/pubmed/28254385
http://dx.doi.org/10.1016/j.fsigen.2017.02.009
_version_ 1783229467712290816
author Vidaki, Athina
Ballard, David
Aliferi, Anastasia
Miller, Thomas H.
Barron, Leon P.
Syndercombe Court, Denise
author_facet Vidaki, Athina
Ballard, David
Aliferi, Anastasia
Miller, Thomas H.
Barron, Leon P.
Syndercombe Court, Denise
author_sort Vidaki, Athina
collection PubMed
description The ability to estimate the age of the donor from recovered biological material at a crime scene can be of substantial value in forensic investigations. Aging can be complex and is associated with various molecular modifications in cells that accumulate over a person’s lifetime, including epigenetic patterns. The aim of this study was to use age-specific DNA methylation patterns to generate an accurate model for the prediction of chronological age using data from whole blood. In total, 45 age-associated CpG sites were selected based on their reported age coefficients in a previous extensive study and investigated using publicly available methylation data obtained from 1156 whole blood samples (aged 2–90 years) analysed with Illumina’s genome-wide methylation platforms (27K/450K). Applying stepwise regression for variable selection, 23 of these CpG sites were identified that could significantly contribute to age prediction modelling, and multiple regression analysis carried out with these markers provided an accurate prediction of age (R² = 0.92, mean absolute error (MAE) = 4.6 years). However, applying machine learning, and more specifically a generalised regression neural network model, the age prediction significantly improved (R² = 0.96), with an MAE of 3.3 years for the training set and 4.4 years for a blind test set of 231 cases. The machine learning approach used 16 CpG sites, located in 16 different genomic regions, with the top 3 predictors of age belonging to the genes NHLRC1, SCGN and CSNK1D. The proposed model was further tested using independent cohorts of 53 monozygotic twins (MAE = 7.1 years) and a cohort of 1011 disease state individuals (MAE = 7.2 years). Furthermore, we highlighted the age markers’ potential applicability in samples other than blood by predicting age with similar accuracy in 265 saliva samples (R² = 0.96), with an MAE of 3.2 years (training set) and 4.0 years (blind test).
In an attempt to create a sensitive and accurate age prediction test, a next generation sequencing (NGS)-based method able to quantify the methylation status of the selected 16 CpG sites was developed using the Illumina MiSeq® platform. The method was validated using DNA standards of known methylation levels, and the age prediction accuracy was initially assessed in a set of 46 whole blood samples. Although the resulting prediction accuracy using the NGS data was lower than that of the original model (MAE = 7.5 years), it is expected that future optimization of our strategy to account for technical variation, as well as increasing the sample size, will improve both the prediction accuracy and reproducibility.
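The description above names a generalised regression neural network (GRNN) and reports accuracy as mean absolute error (MAE). As a rough illustration only, a GRNN in its textbook form predicts a kernel-weighted average of the training targets. The sketch below is not the authors' implementation: the function names, the `sigma` smoothing parameter and its default value are assumptions made here, and the inputs stand in for per-sample methylation values at the selected CpG sites.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.1):
    """Predict one target value with a textbook GRNN.

    train_x: list of feature vectors (e.g. methylation levels per CpG site)
    train_y: list of known targets (e.g. chronological ages)
    query:   feature vector of the sample to predict
    sigma:   Gaussian kernel width (hypothetical default, would need tuning)
    """
    # Squared Euclidean distance from the query to every training sample.
    d2 = [sum((a - b) ** 2 for a, b in zip(row, query)) for row in train_x]
    # Shift by the smallest distance so at least one weight equals 1,
    # which avoids a zero denominator when every kernel value underflows.
    d2_min = min(d2)
    weights = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    # Prediction is the kernel-weighted mean of the training targets.
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

With a small `sigma` the prediction is dominated by the nearest training samples, and with a large one it approaches the overall mean of the training ages, so the kernel width would normally be chosen on held-out data.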
format Online
Article
Text
id pubmed-5392537
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-5392537 2017-05-01 DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing Vidaki, Athina Ballard, David Aliferi, Anastasia Miller, Thomas H. Barron, Leon P. Syndercombe Court, Denise Forensic Sci Int Genet Research Paper The ability to estimate the age of the donor from recovered biological material at a crime scene can be of substantial value in forensic investigations. Aging can be complex and is associated with various molecular modifications in cells that accumulate over a person’s lifetime, including epigenetic patterns. The aim of this study was to use age-specific DNA methylation patterns to generate an accurate model for the prediction of chronological age using data from whole blood. In total, 45 age-associated CpG sites were selected based on their reported age coefficients in a previous extensive study and investigated using publicly available methylation data obtained from 1156 whole blood samples (aged 2–90 years) analysed with Illumina’s genome-wide methylation platforms (27K/450K). Applying stepwise regression for variable selection, 23 of these CpG sites were identified that could significantly contribute to age prediction modelling, and multiple regression analysis carried out with these markers provided an accurate prediction of age (R² = 0.92, mean absolute error (MAE) = 4.6 years). However, applying machine learning, and more specifically a generalised regression neural network model, the age prediction significantly improved (R² = 0.96), with an MAE of 3.3 years for the training set and 4.4 years for a blind test set of 231 cases. The machine learning approach used 16 CpG sites, located in 16 different genomic regions, with the top 3 predictors of age belonging to the genes NHLRC1, SCGN and CSNK1D. The proposed model was further tested using independent cohorts of 53 monozygotic twins (MAE = 7.1 years) and a cohort of 1011 disease state individuals (MAE = 7.2 years).
Furthermore, we highlighted the age markers’ potential applicability in samples other than blood by predicting age with similar accuracy in 265 saliva samples (R² = 0.96), with an MAE of 3.2 years (training set) and 4.0 years (blind test). In an attempt to create a sensitive and accurate age prediction test, a next generation sequencing (NGS)-based method able to quantify the methylation status of the selected 16 CpG sites was developed using the Illumina MiSeq® platform. The method was validated using DNA standards of known methylation levels, and the age prediction accuracy was initially assessed in a set of 46 whole blood samples. Although the resulting prediction accuracy using the NGS data was lower than that of the original model (MAE = 7.5 years), it is expected that future optimization of our strategy to account for technical variation, as well as increasing the sample size, will improve both the prediction accuracy and reproducibility. Elsevier 2017-05 /pmc/articles/PMC5392537/ /pubmed/28254385 http://dx.doi.org/10.1016/j.fsigen.2017.02.009 Text en © 2017 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Research Paper
Vidaki, Athina
Ballard, David
Aliferi, Anastasia
Miller, Thomas H.
Barron, Leon P.
Syndercombe Court, Denise
DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing
title DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing
title_full DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing
title_fullStr DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing
title_full_unstemmed DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing
title_short DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing
title_sort dna methylation-based forensic age prediction using artificial neural networks and next generation sequencing
topic Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5392537/
https://www.ncbi.nlm.nih.gov/pubmed/28254385
http://dx.doi.org/10.1016/j.fsigen.2017.02.009
work_keys_str_mv AT vidakiathina dnamethylationbasedforensicagepredictionusingartificialneuralnetworksandnextgenerationsequencing
AT ballarddavid dnamethylationbasedforensicagepredictionusingartificialneuralnetworksandnextgenerationsequencing
AT aliferianastasia dnamethylationbasedforensicagepredictionusingartificialneuralnetworksandnextgenerationsequencing
AT millerthomash dnamethylationbasedforensicagepredictionusingartificialneuralnetworksandnextgenerationsequencing
AT barronleonp dnamethylationbasedforensicagepredictionusingartificialneuralnetworksandnextgenerationsequencing
AT syndercombecourtdenise dnamethylationbasedforensicagepredictionusingartificialneuralnetworksandnextgenerationsequencing