wg-blimp: an end-to-end analysis pipeline for whole genome bisulfite sequencing data

Bibliographic Details
Main authors: Wöste, Marius; Leitão, Elsa; Laurentino, Sandra; Horsthemke, Bernhard; Rahmann, Sven; Schröder, Christopher
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7195798/
https://www.ncbi.nlm.nih.gov/pubmed/32357829
http://dx.doi.org/10.1186/s12859-020-3470-5
author Wöste, Marius
Leitão, Elsa
Laurentino, Sandra
Horsthemke, Bernhard
Rahmann, Sven
Schröder, Christopher
author_facet Wöste, Marius
Leitão, Elsa
Laurentino, Sandra
Horsthemke, Bernhard
Rahmann, Sven
Schröder, Christopher
author_sort Wöste, Marius
collection PubMed
description BACKGROUND: Analysing whole genome bisulfite sequencing datasets is a data-intensive task that requires comprehensive and reproducible workflows to generate valid results. While many algorithms have been developed for tasks such as alignment, comprehensive end-to-end pipelines are still sparse. Furthermore, previous pipelines lack features or show technical deficiencies, thus impeding analyses. RESULTS: We developed wg-blimp (whole genome bisulfite sequencing methylation analysis pipeline) as an end-to-end pipeline to ease whole genome bisulfite sequencing data analysis. It integrates established algorithms for alignment, quality control, methylation calling, detection of differentially methylated regions, and methylome segmentation, requiring only a reference genome and raw sequencing data as input. Comparing wg-blimp to previous end-to-end pipelines reveals similar setups for common sequence processing tasks, but shows differences for post-alignment analyses. We improve on previous pipelines by providing a more comprehensive analysis workflow as well as an interactive user interface. To demonstrate wg-blimp’s ability to produce correct results we used it to call differentially methylated regions for two publicly available datasets. We were able to replicate 112 of 114 previously published regions, and found results to be consistent with previous findings. We further applied wg-blimp to a publicly available sample of embryonic stem cells to showcase methylome segmentation. As expected, unmethylated regions were in close proximity of transcription start sites. Segmentation results were consistent with previous analyses, despite different reference genomes and sequencing techniques. CONCLUSIONS: wg-blimp provides a comprehensive analysis pipeline for whole genome bisulfite sequencing data as well as a user interface for simplified result inspection. We demonstrated its applicability by analysing multiple publicly available datasets. 
Thus, wg-blimp is a relevant alternative to previous analysis pipelines and may facilitate future epigenetic research.
format Online
Article
Text
id pubmed-7195798
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-71957982020-05-06 wg-blimp: an end-to-end analysis pipeline for whole genome bisulfite sequencing data Wöste, Marius Leitão, Elsa Laurentino, Sandra Horsthemke, Bernhard Rahmann, Sven Schröder, Christopher BMC Bioinformatics Software BACKGROUND: Analysing whole genome bisulfite sequencing datasets is a data-intensive task that requires comprehensive and reproducible workflows to generate valid results. While many algorithms have been developed for tasks such as alignment, comprehensive end-to-end pipelines are still sparse. Furthermore, previous pipelines lack features or show technical deficiencies, thus impeding analyses. RESULTS: We developed wg-blimp (whole genome bisulfite sequencing methylation analysis pipeline) as an end-to-end pipeline to ease whole genome bisulfite sequencing data analysis. It integrates established algorithms for alignment, quality control, methylation calling, detection of differentially methylated regions, and methylome segmentation, requiring only a reference genome and raw sequencing data as input. Comparing wg-blimp to previous end-to-end pipelines reveals similar setups for common sequence processing tasks, but shows differences for post-alignment analyses. We improve on previous pipelines by providing a more comprehensive analysis workflow as well as an interactive user interface. To demonstrate wg-blimp’s ability to produce correct results we used it to call differentially methylated regions for two publicly available datasets. We were able to replicate 112 of 114 previously published regions, and found results to be consistent with previous findings. We further applied wg-blimp to a publicly available sample of embryonic stem cells to showcase methylome segmentation. As expected, unmethylated regions were in close proximity of transcription start sites. Segmentation results were consistent with previous analyses, despite different reference genomes and sequencing techniques. 
CONCLUSIONS: wg-blimp provides a comprehensive analysis pipeline for whole genome bisulfite sequencing data as well as a user interface for simplified result inspection. We demonstrated its applicability by analysing multiple publicly available datasets. Thus, wg-blimp is a relevant alternative to previous analysis pipelines and may facilitate future epigenetic research. BioMed Central 2020-05-01 /pmc/articles/PMC7195798/ /pubmed/32357829 http://dx.doi.org/10.1186/s12859-020-3470-5 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Software
Wöste, Marius
Leitão, Elsa
Laurentino, Sandra
Horsthemke, Bernhard
Rahmann, Sven
Schröder, Christopher
wg-blimp: an end-to-end analysis pipeline for whole genome bisulfite sequencing data
title wg-blimp: an end-to-end analysis pipeline for whole genome bisulfite sequencing data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7195798/
https://www.ncbi.nlm.nih.gov/pubmed/32357829
http://dx.doi.org/10.1186/s12859-020-3470-5