Cargando…

Epigenetic pacemaker: closed form algebraic solutions

BACKGROUND: DNA methylation is widely used as a biomarker in crucial medical applications as well as for human age prediction of very high accuracy. This biomarker is based on the methylation status of several hundred CpG sites. In a recent line of publications we have adapted a versatile concept fr...

Descripción completa

Detalles Bibliográficos
Autor principal: Snir, Sagi
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7161103/
https://www.ncbi.nlm.nih.gov/pubmed/32299339
http://dx.doi.org/10.1186/s12864-020-6606-0
_version_ 1783522889464545280
author Snir, Sagi
author_facet Snir, Sagi
author_sort Snir, Sagi
collection PubMed
description BACKGROUND: DNA methylation is widely used as a biomarker in crucial medical applications as well as for human age prediction of very high accuracy. This biomarker is based on the methylation status of several hundred CpG sites. In a recent line of publications we have adapted a versatile concept from evolutionary biology - the Universal Pacemaker (UPM) - to the setting of epigenetic aging and denoted it the Epigenetic PaceMaker (EPM). The EPM, as opposed to other epigenetic clocks, is not confined to specific pattern of aging, and the epigenetic age of the individual is inferred independently of other individuals. This allows an explicit modeling of aging trends, in particular non linear relationship between chronological and epigenetic age. In one of these recent works, we have presented an algorithmic improvement based on a two-step conditional expectation maximization (CEM) algorithm to arrive at a critical point on the likelihood surface. The algorithm alternates between a time step and a site step while advancing on the likelihood surface. RESULTS: Here we introduce non trivial improvements to these steps that are essential for analyzing data sets of realistic magnitude in a manageable time and space. These structural improvements are based on insights from linear algebra and symbolic algebra tools, providing us greater understanding of the degeneracy of the complex problem space. This understanding in turn, leads to the complete elimination of the bottleneck of cumbersome matrix multiplication and inversion, yielding a fast closed form solution in both steps of the CEM.In the experimental results part, we compare the CEM algorithm over several data sets and demonstrate the speedup obtained by the closed form solutions. Our results support the theoretical analysis of this improvement. CONCLUSIONS: These improvements enable us to increase substantially the scale of inputs analyzed by the method, allowing us to apply the new approach to data sets that could not be analyzed before.
format Online
Article
Text
id pubmed-7161103
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-71611032020-04-22 Epigenetic pacemaker: closed form algebraic solutions Snir, Sagi BMC Genomics Research BACKGROUND: DNA methylation is widely used as a biomarker in crucial medical applications as well as for human age prediction of very high accuracy. This biomarker is based on the methylation status of several hundred CpG sites. In a recent line of publications we have adapted a versatile concept from evolutionary biology - the Universal Pacemaker (UPM) - to the setting of epigenetic aging and denoted it the Epigenetic PaceMaker (EPM). The EPM, as opposed to other epigenetic clocks, is not confined to specific pattern of aging, and the epigenetic age of the individual is inferred independently of other individuals. This allows an explicit modeling of aging trends, in particular non linear relationship between chronological and epigenetic age. In one of these recent works, we have presented an algorithmic improvement based on a two-step conditional expectation maximization (CEM) algorithm to arrive at a critical point on the likelihood surface. The algorithm alternates between a time step and a site step while advancing on the likelihood surface. RESULTS: Here we introduce non trivial improvements to these steps that are essential for analyzing data sets of realistic magnitude in a manageable time and space. These structural improvements are based on insights from linear algebra and symbolic algebra tools, providing us greater understanding of the degeneracy of the complex problem space. This understanding in turn, leads to the complete elimination of the bottleneck of cumbersome matrix multiplication and inversion, yielding a fast closed form solution in both steps of the CEM.In the experimental results part, we compare the CEM algorithm over several data sets and demonstrate the speedup obtained by the closed form solutions. Our results support the theoretical analysis of this improvement. CONCLUSIONS: These improvements enable us to increase substantially the scale of inputs analyzed by the method, allowing us to apply the new approach to data sets that could not be analyzed before. BioMed Central 2020-04-16 /pmc/articles/PMC7161103/ /pubmed/32299339 http://dx.doi.org/10.1186/s12864-020-6606-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Snir, Sagi
Epigenetic pacemaker: closed form algebraic solutions
title Epigenetic pacemaker: closed form algebraic solutions
title_full Epigenetic pacemaker: closed form algebraic solutions
title_fullStr Epigenetic pacemaker: closed form algebraic solutions
title_full_unstemmed Epigenetic pacemaker: closed form algebraic solutions
title_short Epigenetic pacemaker: closed form algebraic solutions
title_sort epigenetic pacemaker: closed form algebraic solutions
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7161103/
https://www.ncbi.nlm.nih.gov/pubmed/32299339
http://dx.doi.org/10.1186/s12864-020-6606-0
work_keys_str_mv AT snirsagi epigeneticpacemakerclosedformalgebraicsolutions