Identification of cell type-specific methylation signals in bulk whole genome bisulfite sequencing data

Bibliographic Details
Main Authors: Scott, C. Anthony, Duryea, Jack D., MacKay, Harry, Baker, Maria S., Laritsky, Eleonora, Gunasekara, Chathura J., Coarfa, Cristian, Waterland, Robert A.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7329512/
https://www.ncbi.nlm.nih.gov/pubmed/32605651
http://dx.doi.org/10.1186/s13059-020-02065-5
author Scott, C. Anthony
Duryea, Jack D.
MacKay, Harry
Baker, Maria S.
Laritsky, Eleonora
Gunasekara, Chathura J.
Coarfa, Cristian
Waterland, Robert A.
author_sort Scott, C. Anthony
collection PubMed
description BACKGROUND: The traditional approach to studying the epigenetic mechanism CpG methylation in tissue samples is to identify regions of concordant differential methylation spanning multiple CpG sites (differentially methylated regions). Variation limited to single or small numbers of CpGs has been assumed to reflect stochastic processes. To test this, we developed software, Cluster-Based analysis of CpG methylation (CluBCpG), and explored variation in read-level CpG methylation patterns in whole genome bisulfite sequencing data. RESULTS: Analysis of both human and mouse whole genome bisulfite sequencing datasets reveals read-level signatures associated with cell type and cell type-specific biological processes. These signatures, which are mostly orthogonal to classical differentially methylated regions, are enriched at cell type-specific enhancers and allow estimation of proportional cell composition in synthetic mixtures and improved prediction of gene expression. In tandem, we developed a machine learning algorithm, Precise Read-Level Imputation of Methylation (PReLIM), to increase coverage of existing whole genome bisulfite sequencing datasets by imputing CpG methylation states on individual sequencing reads. PReLIM both improves CluBCpG coverage and performance and enables identification of novel differentially methylated regions, which we independently validate. CONCLUSIONS: Our data indicate that, rather than stochastic variation, read-level CpG methylation patterns in tissue whole genome bisulfite sequencing libraries reflect cell type. Accordingly, these new computational tools should lead to an improved understanding of epigenetic regulation by DNA methylation.
format Online
Article
Text
id pubmed-7329512
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7329512 2020-07-02 Identification of cell type-specific methylation signals in bulk whole genome bisulfite sequencing data Scott, C. Anthony Duryea, Jack D. MacKay, Harry Baker, Maria S. Laritsky, Eleonora Gunasekara, Chathura J. Coarfa, Cristian Waterland, Robert A. Genome Biol Research BACKGROUND: The traditional approach to studying the epigenetic mechanism CpG methylation in tissue samples is to identify regions of concordant differential methylation spanning multiple CpG sites (differentially methylated regions). Variation limited to single or small numbers of CpGs has been assumed to reflect stochastic processes. To test this, we developed software, Cluster-Based analysis of CpG methylation (CluBCpG), and explored variation in read-level CpG methylation patterns in whole genome bisulfite sequencing data. RESULTS: Analysis of both human and mouse whole genome bisulfite sequencing datasets reveals read-level signatures associated with cell type and cell type-specific biological processes. These signatures, which are mostly orthogonal to classical differentially methylated regions, are enriched at cell type-specific enhancers and allow estimation of proportional cell composition in synthetic mixtures and improved prediction of gene expression. In tandem, we developed a machine learning algorithm, Precise Read-Level Imputation of Methylation (PReLIM), to increase coverage of existing whole genome bisulfite sequencing datasets by imputing CpG methylation states on individual sequencing reads. PReLIM both improves CluBCpG coverage and performance and enables identification of novel differentially methylated regions, which we independently validate. CONCLUSIONS: Our data indicate that, rather than stochastic variation, read-level CpG methylation patterns in tissue whole genome bisulfite sequencing libraries reflect cell type. Accordingly, these new computational tools should lead to an improved understanding of epigenetic regulation by DNA methylation. BioMed Central 2020-07-01 /pmc/articles/PMC7329512/ /pubmed/32605651 http://dx.doi.org/10.1186/s13059-020-02065-5 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Identification of cell type-specific methylation signals in bulk whole genome bisulfite sequencing data
topic Research