A workflow for simplified analysis of ATAC-cap-seq data in R

BACKGROUND: Assay for Transposase-Accessible Chromatin (ATAC)-cap-seq is a high-throughput sequencing method that combines ATAC-seq with targeted nucleic acid enrichment of precipitated DNA fragments. There are increased analytical difficulties arising from working with a set of regions of interest...

Bibliographic Details
Main Authors: Shrestha, Ram Krishna, Ding, Pingtao, Jones, Jonathan D G, MacLean, Dan
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6047409/
https://www.ncbi.nlm.nih.gov/pubmed/29961827
http://dx.doi.org/10.1093/gigascience/giy080
_version_ 1783339940033069056
author Shrestha, Ram Krishna
Ding, Pingtao
Jones, Jonathan D G
MacLean, Dan
collection PubMed
description BACKGROUND: Assay for Transposase-Accessible Chromatin (ATAC)-cap-seq is a high-throughput sequencing method that combines ATAC-seq with targeted nucleic acid enrichment of precipitated DNA fragments. There are increased analytical difficulties arising from working with a set of regions of interest that may be small in number and biologically dependent. Common statistical pipelines for RNA sequencing might be assumed to apply but can give misleading results on ATAC-cap-seq data. A tool is needed to allow a nonspecialist user to quickly and easily summarize data and apply sensible and effective normalization and analysis. RESULTS: We developed atacR to allow a user to easily analyze their ATAC enrichment experiment. It provides comprehensive summary functions and diagnostic plots for studying enriched tag abundance. Application of between-sample normalization is made straightforward. Functions for normalizing based on user-defined control regions, whole library size, and regions selected from the least variable regions in a dataset are provided. Three methods for detecting differential abundance of tags from enriched methods are provided, including bootstrap t, Bayes factor, and a wrapped version of the standard exact test in the edgeR package. We compared the precision, recall, and F-score of each detection method on resampled datasets at varying replicate, significance threshold, and genes changed and found that the Bayes factor method had the greatest overall detection power, though edgeR was slightly stronger in simulations with lower numbers of genes changed. CONCLUSIONS: Our package allows a nonspecialist user to easily and effectively apply methods appropriate to the analysis of ATAC-cap-seq in a reproducible manner. The package is implemented in pure R and is fully interoperable with common workflows in Bioconductor.
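The precision, recall, and F-score comparison mentioned in the description can be illustrated with a minimal sketch. This is written in Python for illustration only and is not part of atacR or its API; the function name `precision_recall_f` and the region identifiers are hypothetical, and the balanced F-score (F1) is assumed because the abstract does not say which variant was used.

```python
def precision_recall_f(called, truly_changed):
    """Score a set of regions called as changed against the known truth.

    Both arguments are iterables of region identifiers (hypothetical here).
    Returns (precision, recall, F1), using 0.0 where a ratio is undefined.
    """
    called = set(called)
    truly_changed = set(truly_changed)

    # True positives: regions that were called and really did change.
    true_positives = len(called & truly_changed)

    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truly_changed) if truly_changed else 0.0

    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical resampled dataset: four regions truly changed, three were called.
truly_changed = {"region_01", "region_02", "region_03", "region_04"}
called = {"region_02", "region_03", "region_09"}

precision, recall, f_score = precision_recall_f(called, truly_changed)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_score:.3f}")
```

In this made-up case two of the three called regions are correct and two of the four changed regions are recovered, so precision is 2/3, recall is 1/2, and the F-score is 4/7.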
format Online
Article
Text
id pubmed-6047409
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6047409 2018-07-19 A workflow for simplified analysis of ATAC-cap-seq data in R Shrestha, Ram Krishna Ding, Pingtao Jones, Jonathan D G MacLean, Dan Gigascience Technical Note Oxford University Press 2018-06-28 /pmc/articles/PMC6047409/ /pubmed/29961827 http://dx.doi.org/10.1093/gigascience/giy080 Text en © The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title A workflow for simplified analysis of ATAC-cap-seq data in R
topic Technical Note