PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments
As chromatin accessibility data from ATAC-seq experiments continues to expand, there is continuing need for standardized analysis pipelines. Here, we present PEPATAC, an ATAC-seq pipeline that is easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing proj...
Main authors: | Smith, Jason P; Corces, M Ryan; Xu, Jin; Reuter, Vincent P; Chang, Howard Y; Sheffield, Nathan C |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | APP Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8632735/ https://www.ncbi.nlm.nih.gov/pubmed/34859208 http://dx.doi.org/10.1093/nargab/lqab101 |
_version_ | 1784607808713392128 |
---|---|
author | Smith, Jason P Corces, M Ryan Xu, Jin Reuter, Vincent P Chang, Howard Y Sheffield, Nathan C |
author_facet | Smith, Jason P Corces, M Ryan Xu, Jin Reuter, Vincent P Chang, Howard Y Sheffield, Nathan C |
author_sort | Smith, Jason P |
collection | PubMed |
description | As chromatin accessibility data from ATAC-seq experiments continues to expand, there is continuing need for standardized analysis pipelines. Here, we present PEPATAC, an ATAC-seq pipeline that is easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing projects. PEPATAC leverages unique features of ATAC-seq data to optimize for speed and accuracy, and it provides several unique analytical approaches. Output includes convenient quality control plots, summary statistics, and a variety of generally useful data formats to set the groundwork for subsequent project-specific data analysis. Downstream analysis is simplified by a standard definition format, modularity of components, and metadata APIs in R and Python. It is restartable, fault-tolerant, and can be run on local hardware, using any cluster resource manager, or in provided Linux containers. We also demonstrate the advantage of aligning to the mitochondrial genome serially, which improves the accuracy of alignment statistics and quality control metrics. PEPATAC is a robust and portable first step for any ATAC-seq project. BSD2-licensed code and documentation are available at https://pepatac.databio.org. |
format | Online Article Text |
id | pubmed-8632735 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-86327352021-12-01 PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments Smith, Jason P Corces, M Ryan Xu, Jin Reuter, Vincent P Chang, Howard Y Sheffield, Nathan C NAR Genom Bioinform APP Notes As chromatin accessibility data from ATAC-seq experiments continues to expand, there is continuing need for standardized analysis pipelines. Here, we present PEPATAC, an ATAC-seq pipeline that is easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing projects. PEPATAC leverages unique features of ATAC-seq data to optimize for speed and accuracy, and it provides several unique analytical approaches. Output includes convenient quality control plots, summary statistics, and a variety of generally useful data formats to set the groundwork for subsequent project-specific data analysis. Downstream analysis is simplified by a standard definition format, modularity of components, and metadata APIs in R and Python. It is restartable, fault-tolerant, and can be run on local hardware, using any cluster resource manager, or in provided Linux containers. We also demonstrate the advantage of aligning to the mitochondrial genome serially, which improves the accuracy of alignment statistics and quality control metrics. PEPATAC is a robust and portable first step for any ATAC-seq project. BSD2-licensed code and documentation are available at https://pepatac.databio.org. Oxford University Press 2021-11-23 /pmc/articles/PMC8632735/ /pubmed/34859208 http://dx.doi.org/10.1093/nargab/lqab101 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. 
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | APP Notes Smith, Jason P Corces, M Ryan Xu, Jin Reuter, Vincent P Chang, Howard Y Sheffield, Nathan C PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments |
title | PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments |
title_full | PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments |
title_fullStr | PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments |
title_full_unstemmed | PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments |
title_short | PEPATAC: an optimized pipeline for ATAC-seq data analysis with serial alignments |
title_sort | pepatac: an optimized pipeline for atac-seq data analysis with serial alignments |
topic | APP Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8632735/ https://www.ncbi.nlm.nih.gov/pubmed/34859208 http://dx.doi.org/10.1093/nargab/lqab101 |
work_keys_str_mv | AT smithjasonp pepatacanoptimizedpipelineforatacseqdataanalysiswithserialalignments AT corcesmryan pepatacanoptimizedpipelineforatacseqdataanalysiswithserialalignments AT xujin pepatacanoptimizedpipelineforatacseqdataanalysiswithserialalignments AT reutervincentp pepatacanoptimizedpipelineforatacseqdataanalysiswithserialalignments AT changhowardy pepatacanoptimizedpipelineforatacseqdataanalysiswithserialalignments AT sheffieldnathanc pepatacanoptimizedpipelineforatacseqdataanalysiswithserialalignments |