Calibrating epigenetic clocks with training data error

Bibliographic Details
Main authors: Mayne, Benjamin; Berry, Oliver; Jarman, Simon
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10445086/
https://www.ncbi.nlm.nih.gov/pubmed/37622096
http://dx.doi.org/10.1111/eva.13582
author Mayne, Benjamin
Berry, Oliver
Jarman, Simon
collection PubMed
description Animal age data are valuable for management of wildlife populations. Yet, for most species, there is no practical method for determining the age of unknown individuals. However, epigenetic clocks, a molecular‐based method, are capable of age prediction by sampling specific tissue types and measuring DNA methylation levels at specific loci. Developing an epigenetic clock requires a large number of samples from animals of known ages. For most species, there are no individuals whose exact ages are known, making epigenetic clock calibration inaccurate or impossible. For many epigenetic clocks, calibration samples with inaccurate age estimates introduce a degree of error to epigenetic clock calibration. In this study, we investigated how much error in the training data set of an epigenetic clock can be tolerated before it results in an unacceptable increase in error for age prediction. Using four publicly available data sets, we artificially increased the training data age error in increments of 1% and then tested the model against an independent set of known ages. A small effect size increase (Cohen's d > 0.2) was detected when the error in age was higher than 22%. The effect size increased linearly with age error. This threshold was independent of sample size. Downstream applications for age data may have a more important role in deciding how much error can be tolerated for age prediction. If highly precise age estimates are required, then it may be futile to embark on the development of an epigenetic clock when there is no accurately aged calibration population to work with. However, for other problems, such as determining the relative age order of pairs of individuals, a lower‐quality calibration data set may be adequate.
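The procedure summarized in the description (add a controlled percentage of error to the training ages, refit, test against known ages, and report Cohen's d) can be illustrated with a small sketch. This is not the authors' code: the record does not say how the age error was generated, which model was fitted, or which two groups the effect size compares. The sketch assumes uniform relative noise, a one-predictor least-squares line on synthetic methylation values, and a comparison of absolute test errors between a model trained on perturbed ages and one trained on the original ages. All function names and simulation parameters are hypothetical.

```python
import random
import statistics


def perturb_ages(ages, error_pct, rng):
    """Add uniform relative error of up to +/- error_pct percent (assumed noise model)."""
    frac = error_pct / 100.0
    return [age * (1.0 + rng.uniform(-frac, frac)) for age in ages]


def cohens_d(sample_a, sample_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def abs_errors(model, x, true_ages):
    """Absolute age-prediction error for each test sample."""
    slope, intercept = model
    return [abs(slope * xi + intercept - age) for xi, age in zip(x, true_ages)]


def effect_of_training_error(error_pct, seed=0, n_train=200, n_test=100):
    """Effect size of training-age error on test error, using synthetic data."""
    rng = random.Random(seed)

    def simulate(n):
        # Hypothetical relationship: methylation rises linearly with age plus noise.
        ages = [rng.uniform(1, 30) for _ in range(n)]
        meth = [0.02 * age + rng.gauss(0, 0.03) for age in ages]
        return meth, ages

    train_x, train_age = simulate(n_train)
    test_x, test_age = simulate(n_test)
    baseline = fit_line(train_x, train_age)
    noisy = fit_line(train_x, perturb_ages(train_age, error_pct, rng))
    return cohens_d(abs_errors(noisy, test_x, test_age),
                    abs_errors(baseline, test_x, test_age))
```

Calling `effect_of_training_error(pct)` for `pct` in `range(1, 51)` mimics the 1% steps described above. The values from this toy simulation are not expected to reproduce the 22% threshold reported in the study.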
format Online
Article
Text
id pubmed-10445086
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-10445086 2023-08-24 Calibrating epigenetic clocks with training data error. Mayne, Benjamin; Berry, Oliver; Jarman, Simon. Evol Appl. Original Articles. John Wiley and Sons Inc. 2023-07-26 /pmc/articles/PMC10445086/ /pubmed/37622096 http://dx.doi.org/10.1111/eva.13582 Text en
© 2023 The Authors. Evolutionary Applications published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Calibrating epigenetic clocks with training data error
topic Original Articles