BSImp: Imputing Partially Observed Methylation Patterns for Evaluating Methylation Heterogeneity

Bibliographic Details
Main Authors: Chang, Ya-Ting Sabrina; Yen, Ming-Ren; Chen, Pao-Yang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580945/
https://www.ncbi.nlm.nih.gov/pubmed/36304331
http://dx.doi.org/10.3389/fbinf.2022.815289
author Chang, Ya-Ting Sabrina
Yen, Ming-Ren
Chen, Pao-Yang
collection PubMed
description DNA methylation is one of the most studied epigenetic modifications, with applications ranging from transcriptional regulation to aging, and can be assessed at single base-pair resolution by bisulfite sequencing (BS-seq) or enzymatic methyl sequencing (EM-seq). The permutations of methylation statuses given by aligned reads reflect the methylation patterns of individual cells. These patterns at specific genomic locations are thought to be indicative of cellular heterogeneity within a cellular population, which is predictive of development and disease; methylation heterogeneity therefore has potential for the early detection of these changes. Computational methods have been developed to assess methylation heterogeneity using methylation patterns formed by four consecutive CpGs, but shotgun sequencing often yields only partially observed patterns, which leaves very limited data available for downstream analysis. While many programs have been developed to impute genome-wide methylation levels, there is currently only one method for recovering partially observed methylation patterns, and that program needs a large amount of training data and cannot be used directly. We therefore developed a probabilistic imputation method that uses information from neighbouring sites to recover partially observed methylation patterns quickly. For data with moderate depth, it is demonstrated to allow the evaluation of methylation heterogeneity at 15% more regions genome-wide with high accuracy. To make it more user-friendly, we also provide a computational pipeline for genome screening, the first of its kind, which can be used both to evaluate methylation levels and to profile methylation patterns genome-wide for all cytosine contexts. Our method allows for accurate estimation of methylation levels and makes the evaluation of methylation heterogeneity available for much more data with reasonable coverage. This has important implications for using methylation heterogeneity to monitor changes within cellular populations that were previously impossible to detect, in the assessment of development and disease.
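To make the notion of a partially observed pattern concrete, the following is a minimal Python sketch, not the BSImp algorithm. It assumes each read over a window of four CpGs is encoded as a tuple of 1 (methylated), 0 (unmethylated), or None (CpG not covered). The function names and the majority-vote fill rule are hypothetical placeholders for the probabilistic model that the abstract mentions but does not specify.

```python
from collections import Counter
from typing import List, Optional, Tuple

# Assumed encoding: 1 = methylated, 0 = unmethylated, None = not covered.
Read = Tuple[Optional[int], ...]


def site_levels(reads: List[Read], width: int = 4) -> List[Optional[float]]:
    """Per-CpG methylation level from the reads that cover each site."""
    levels: List[Optional[float]] = []
    for i in range(width):
        calls = [r[i] for r in reads if r[i] is not None]
        levels.append(sum(calls) / len(calls) if calls else None)
    return levels


def impute(read: Read, levels: List[Optional[float]]) -> Optional[Tuple[int, ...]]:
    """Fill missing calls with the majority state at that site.

    Placeholder rule for illustration only; it is NOT the published model.
    """
    filled = []
    for call, level in zip(read, levels):
        if call is not None:
            filled.append(call)
        elif level is None:
            return None  # no read covers this site, so nothing to borrow
        else:
            filled.append(1 if level >= 0.5 else 0)
    return tuple(filled)


def pattern_counts(reads: List[Read], use_imputation: bool) -> Counter:
    """Count four-CpG patterns, optionally recovering partial reads."""
    levels = site_levels(reads)
    counts: Counter = Counter()
    for read in reads:
        if all(call is not None for call in read):
            counts[tuple(read)] += 1
        elif use_imputation:
            pattern = impute(read, levels)
            if pattern is not None:
                counts[pattern] += 1
    return counts


reads: List[Read] = [
    (1, 1, 1, 1),
    (1, 1, 0, 0),
    (1, None, 1, 1),   # partially observed
    (None, 1, 0, 0),   # partially observed
    (0, 0, 0, None),   # partially observed
]
observed = pattern_counts(reads, use_imputation=False)
imputed = pattern_counts(reads, use_imputation=True)
```

In this toy window only two of the five reads contribute a pattern without imputation, while all five contribute once the gaps are filled, which is the kind of data recovery the abstract describes.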
format Online
Article
Text
id pubmed-9580945
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9580945 2022-10-26 Front Bioinform Bioinformatics Frontiers Media S.A. 2022-02-10 /pmc/articles/PMC9580945/ /pubmed/36304331 http://dx.doi.org/10.3389/fbinf.2022.815289 Text en Copyright © 2022 Chang, Yen and Chen. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title BSImp: Imputing Partially Observed Methylation Patterns for Evaluating Methylation Heterogeneity
topic Bioinformatics