PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments

As chromatin accessibility data from ATAC-seq experiments continues to expand, there is continuing need for standardized analysis pipelines. Here, we present PEPATAC, an ATAC-seq pipeline that is easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing proj...

Bibliographic Details
Main authors: Smith, Jason P, Corces, M Ryan, Xu, Jin, Reuter, Vincent P, Chang, Howard Y, Sheffield, Nathan C
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8632735/
https://www.ncbi.nlm.nih.gov/pubmed/34859208
http://dx.doi.org/10.1093/nargab/lqab101
author Smith, Jason P
Corces, M Ryan
Xu, Jin
Reuter, Vincent P
Chang, Howard Y
Sheffield, Nathan C
author_sort Smith, Jason P
collection PubMed
description As chromatin accessibility data from ATAC-seq experiments continues to expand, there is continuing need for standardized analysis pipelines. Here, we present PEPATAC, an ATAC-seq pipeline that is easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing projects. PEPATAC leverages unique features of ATAC-seq data to optimize for speed and accuracy, and it provides several unique analytical approaches. Output includes convenient quality control plots, summary statistics, and a variety of generally useful data formats to set the groundwork for subsequent project-specific data analysis. Downstream analysis is simplified by a standard definition format, modularity of components, and metadata APIs in R and Python. It is restartable, fault-tolerant, and can be run on local hardware, using any cluster resource manager, or in provided Linux containers. We also demonstrate the advantage of aligning to the mitochondrial genome serially, which improves the accuracy of alignment statistics and quality control metrics. PEPATAC is a robust and portable first step for any ATAC-seq project. BSD2-licensed code and documentation are available at https://pepatac.databio.org.
format Online
Article
Text
id pubmed-8632735
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8632735 2021-12-01 NAR Genom Bioinform APP Notes Oxford University Press 2021-11-23 /pmc/articles/PMC8632735/ /pubmed/34859208 http://dx.doi.org/10.1093/nargab/lqab101 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments
topic APP Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8632735/
https://www.ncbi.nlm.nih.gov/pubmed/34859208
http://dx.doi.org/10.1093/nargab/lqab101