BAMscale: quantification of next-generation sequencing peaks and generation of scaled coverage tracks

BACKGROUND: Next-generation sequencing allows genome-wide analysis of changes in chromatin states and gene expression. Data analysis of these increasingly used methods either requires multiple analysis steps, or extensive computational time. We sought to develop a tool for rapid quantification of se...

Bibliographic Details
Main Authors: Pongor, Lorinc S., Gross, Jacob M., Vera Alvarez, Roberto, Murai, Junko, Jang, Sang-Min, Zhang, Hongliang, Redon, Christophe, Fu, Haiqing, Huang, Shar-Yin, Thakur, Bhushan, Baris, Adrian, Marino-Ramirez, Leonardo, Landsman, David, Aladjem, Mirit I., Pommier, Yves
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7175505/
https://www.ncbi.nlm.nih.gov/pubmed/32321568
http://dx.doi.org/10.1186/s13072-020-00343-x
author Pongor, Lorinc S.
Gross, Jacob M.
Vera Alvarez, Roberto
Murai, Junko
Jang, Sang-Min
Zhang, Hongliang
Redon, Christophe
Fu, Haiqing
Huang, Shar-Yin
Thakur, Bhushan
Baris, Adrian
Marino-Ramirez, Leonardo
Landsman, David
Aladjem, Mirit I.
Pommier, Yves
author_sort Pongor, Lorinc S.
collection PubMed
description BACKGROUND: Next-generation sequencing allows genome-wide analysis of changes in chromatin states and gene expression. Data analysis for these increasingly used methods requires either multiple analysis steps or extensive computational time. We sought to develop a tool for rapid quantification of sequencing peaks from diverse experimental sources and an efficient method to produce coverage tracks for accurate visualization that can be intuitively displayed and interpreted by experimentalists with minimal bioinformatics background. We demonstrate its strength and usability by integrating data from several types of sequencing approaches. RESULTS: We have developed BAMscale, a one-step tool that processes a wide set of sequencing datasets. To demonstrate the usefulness of BAMscale, we analyzed multiple sequencing datasets from chromatin immunoprecipitation sequencing data (ChIP-seq), chromatin state change data (assay for transposase-accessible chromatin using sequencing: ATAC-seq, DNA double-strand break mapping sequencing: END-seq), DNA replication data (Okazaki fragments sequencing: OK-seq, nascent-strand sequencing: NS-seq, single-cell replication timing sequencing: scRepli-seq) and RNA-seq data. The outputs consist of raw and normalized peak scores (multiple normalizations) in text format and scaled bigWig coverage tracks that are directly accessible to data visualization programs. BAMscale also includes a visualization module facilitating direct, on-demand quantitative peak comparisons that can be used by experimentalists. Our tool can effectively analyze large sequencing datasets (~100 Gb in size) in minutes, outperforming currently available tools. CONCLUSIONS: BAMscale accurately quantifies and normalizes identified peaks directly from BAM files, and creates coverage tracks for visualization in genome browsers.
BAMscale can be applied to a wide range of methods for calculating coverage tracks, including ChIP-seq and ATAC-seq, as well as methods that currently require specialized, separate tools for analyses, such as splice-aware RNA-seq, END-seq and OK-seq for which no dedicated software is available. BAMscale is freely available on GitHub (https://github.com/ncbi/BAMscale).
format Online
Article
Text
id pubmed-7175505
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7175505 2020-04-24 BAMscale: quantification of next-generation sequencing peaks and generation of scaled coverage tracks Pongor, Lorinc S. Gross, Jacob M. Vera Alvarez, Roberto Murai, Junko Jang, Sang-Min Zhang, Hongliang Redon, Christophe Fu, Haiqing Huang, Shar-Yin Thakur, Bhushan Baris, Adrian Marino-Ramirez, Leonardo Landsman, David Aladjem, Mirit I. Pommier, Yves Epigenetics Chromatin Methodology BACKGROUND: Next-generation sequencing allows genome-wide analysis of changes in chromatin states and gene expression. Data analysis for these increasingly used methods requires either multiple analysis steps or extensive computational time. We sought to develop a tool for rapid quantification of sequencing peaks from diverse experimental sources and an efficient method to produce coverage tracks for accurate visualization that can be intuitively displayed and interpreted by experimentalists with minimal bioinformatics background. We demonstrate its strength and usability by integrating data from several types of sequencing approaches. RESULTS: We have developed BAMscale, a one-step tool that processes a wide set of sequencing datasets. To demonstrate the usefulness of BAMscale, we analyzed multiple sequencing datasets from chromatin immunoprecipitation sequencing data (ChIP-seq), chromatin state change data (assay for transposase-accessible chromatin using sequencing: ATAC-seq, DNA double-strand break mapping sequencing: END-seq), DNA replication data (Okazaki fragments sequencing: OK-seq, nascent-strand sequencing: NS-seq, single-cell replication timing sequencing: scRepli-seq) and RNA-seq data. The outputs consist of raw and normalized peak scores (multiple normalizations) in text format and scaled bigWig coverage tracks that are directly accessible to data visualization programs. BAMscale also includes a visualization module facilitating direct, on-demand quantitative peak comparisons that can be used by experimentalists.
Our tool can effectively analyze large sequencing datasets (~100 Gb in size) in minutes, outperforming currently available tools. CONCLUSIONS: BAMscale accurately quantifies and normalizes identified peaks directly from BAM files, and creates coverage tracks for visualization in genome browsers. BAMscale can be applied to a wide range of methods for calculating coverage tracks, including ChIP-seq and ATAC-seq, as well as methods that currently require specialized, separate tools for analyses, such as splice-aware RNA-seq, END-seq and OK-seq for which no dedicated software is available. BAMscale is freely available on GitHub (https://github.com/ncbi/BAMscale). BioMed Central 2020-04-22 /pmc/articles/PMC7175505/ /pubmed/32321568 http://dx.doi.org/10.1186/s13072-020-00343-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title BAMscale: quantification of next-generation sequencing peaks and generation of scaled coverage tracks
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7175505/
https://www.ncbi.nlm.nih.gov/pubmed/32321568
http://dx.doi.org/10.1186/s13072-020-00343-x