Cargando…
Epigenetic pacemaker: closed form algebraic solutions
BACKGROUND: DNA methylation is widely used as a biomarker in crucial medical applications as well as for human age prediction of very high accuracy. This biomarker is based on the methylation status of several hundred CpG sites. In a recent line of publications we have adapted a versatile concept fr...
Autor principal: | |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7161103/ https://www.ncbi.nlm.nih.gov/pubmed/32299339 http://dx.doi.org/10.1186/s12864-020-6606-0 |
_version_ | 1783522889464545280 |
---|---|
author | Snir, Sagi |
author_facet | Snir, Sagi |
author_sort | Snir, Sagi |
collection | PubMed |
description | BACKGROUND: DNA methylation is widely used as a biomarker in crucial medical applications as well as for human age prediction of very high accuracy. This biomarker is based on the methylation status of several hundred CpG sites. In a recent line of publications we have adapted a versatile concept from evolutionary biology - the Universal Pacemaker (UPM) - to the setting of epigenetic aging and denoted it the Epigenetic PaceMaker (EPM). The EPM, as opposed to other epigenetic clocks, is not confined to specific pattern of aging, and the epigenetic age of the individual is inferred independently of other individuals. This allows an explicit modeling of aging trends, in particular non linear relationship between chronological and epigenetic age. In one of these recent works, we have presented an algorithmic improvement based on a two-step conditional expectation maximization (CEM) algorithm to arrive at a critical point on the likelihood surface. The algorithm alternates between a time step and a site step while advancing on the likelihood surface. RESULTS: Here we introduce non trivial improvements to these steps that are essential for analyzing data sets of realistic magnitude in a manageable time and space. These structural improvements are based on insights from linear algebra and symbolic algebra tools, providing us greater understanding of the degeneracy of the complex problem space. This understanding in turn, leads to the complete elimination of the bottleneck of cumbersome matrix multiplication and inversion, yielding a fast closed form solution in both steps of the CEM.In the experimental results part, we compare the CEM algorithm over several data sets and demonstrate the speedup obtained by the closed form solutions. Our results support the theoretical analysis of this improvement. CONCLUSIONS: These improvements enable us to increase substantially the scale of inputs analyzed by the method, allowing us to apply the new approach to data sets that could not be analyzed before. |
format | Online Article Text |
id | pubmed-7161103 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-71611032020-04-22 Epigenetic pacemaker: closed form algebraic solutions Snir, Sagi BMC Genomics Research BACKGROUND: DNA methylation is widely used as a biomarker in crucial medical applications as well as for human age prediction of very high accuracy. This biomarker is based on the methylation status of several hundred CpG sites. In a recent line of publications we have adapted a versatile concept from evolutionary biology - the Universal Pacemaker (UPM) - to the setting of epigenetic aging and denoted it the Epigenetic PaceMaker (EPM). The EPM, as opposed to other epigenetic clocks, is not confined to specific pattern of aging, and the epigenetic age of the individual is inferred independently of other individuals. This allows an explicit modeling of aging trends, in particular non linear relationship between chronological and epigenetic age. In one of these recent works, we have presented an algorithmic improvement based on a two-step conditional expectation maximization (CEM) algorithm to arrive at a critical point on the likelihood surface. The algorithm alternates between a time step and a site step while advancing on the likelihood surface. RESULTS: Here we introduce non trivial improvements to these steps that are essential for analyzing data sets of realistic magnitude in a manageable time and space. These structural improvements are based on insights from linear algebra and symbolic algebra tools, providing us greater understanding of the degeneracy of the complex problem space. This understanding in turn, leads to the complete elimination of the bottleneck of cumbersome matrix multiplication and inversion, yielding a fast closed form solution in both steps of the CEM.In the experimental results part, we compare the CEM algorithm over several data sets and demonstrate the speedup obtained by the closed form solutions. Our results support the theoretical analysis of this improvement. CONCLUSIONS: These improvements enable us to increase substantially the scale of inputs analyzed by the method, allowing us to apply the new approach to data sets that could not be analyzed before. BioMed Central 2020-04-16 /pmc/articles/PMC7161103/ /pubmed/32299339 http://dx.doi.org/10.1186/s12864-020-6606-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Snir, Sagi Epigenetic pacemaker: closed form algebraic solutions |
title | Epigenetic pacemaker: closed form algebraic solutions |
title_full | Epigenetic pacemaker: closed form algebraic solutions |
title_fullStr | Epigenetic pacemaker: closed form algebraic solutions |
title_full_unstemmed | Epigenetic pacemaker: closed form algebraic solutions |
title_short | Epigenetic pacemaker: closed form algebraic solutions |
title_sort | epigenetic pacemaker: closed form algebraic solutions |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7161103/ https://www.ncbi.nlm.nih.gov/pubmed/32299339 http://dx.doi.org/10.1186/s12864-020-6606-0 |
work_keys_str_mv | AT snirsagi epigeneticpacemakerclosedformalgebraicsolutions |