ChiLin: a comprehensive ChIP-seq and DNase-seq quality control and analysis pipeline

BACKGROUND: Transcription factor binding, histone modification, and chromatin accessibility studies are important approaches to understanding the biology of gene regulation. ChIP-seq and DNase-seq have become the standard techniques for studying protein-DNA interactions and chromatin accessibility r...

Bibliographic Details
Main Authors: Qin, Qian, Mei, Shenglin, Wu, Qiu, Sun, Hanfei, Li, Lewyn, Taing, Len, Chen, Sujun, Li, Fugen, Liu, Tao, Zang, Chongzhi, Xu, Han, Chen, Yiwen, Meyer, Clifford A., Zhang, Yong, Brown, Myles, Long, Henry W., Liu, X. Shirley
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5048594/
https://www.ncbi.nlm.nih.gov/pubmed/27716038
http://dx.doi.org/10.1186/s12859-016-1274-4
_version_ 1782457600181272576
author Qin, Qian
Mei, Shenglin
Wu, Qiu
Sun, Hanfei
Li, Lewyn
Taing, Len
Chen, Sujun
Li, Fugen
Liu, Tao
Zang, Chongzhi
Xu, Han
Chen, Yiwen
Meyer, Clifford A.
Zhang, Yong
Brown, Myles
Long, Henry W.
Liu, X. Shirley
author_sort Qin, Qian
collection PubMed
description BACKGROUND: Transcription factor binding, histone modification, and chromatin accessibility studies are important approaches to understanding the biology of gene regulation. ChIP-seq and DNase-seq have become the standard techniques for studying protein-DNA interactions and chromatin accessibility, respectively, and comprehensive quality control (QC) and analysis tools are critical to extracting the most value from these assay types. Although many analysis and QC tools have been reported, few combine ChIP-seq and DNase-seq data analysis and quality control in a unified framework with a comprehensive and unbiased reference of data quality metrics. RESULTS: ChiLin is a computational pipeline that automates the quality control and data analyses of ChIP-seq and DNase-seq data. It is developed using a flexible and modular software framework that can be easily extended and modified. ChiLin is ideal for batch processing of many datasets and is well suited for large collaborative projects involving ChIP-seq and DNase-seq from different designs. ChiLin generates comprehensive quality control reports that include comparisons with historical data derived from over 23,677 public ChIP-seq and DNase-seq samples (11,265 datasets) from eight literature-based classified categories. To the best of our knowledge, this atlas represents the most comprehensive ChIP-seq and DNase-seq related quality metric resource currently available. These historical metrics provide useful heuristic quality references for experiments across all commonly used assay types. Using representative datasets, we demonstrate the versatility of the pipeline by applying it to different assay types of ChIP-seq data. The pipeline software is available open source at https://github.com/cfce/chilin. CONCLUSION: ChiLin is a scalable and powerful tool to process large batches of ChIP-seq and DNase-seq datasets. The analysis output and quality metrics have been structured into user-friendly directories and reports. We have successfully compiled 23,677 profiles into a comprehensive quality atlas with fine classification for users. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1274-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5048594
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5048594 2016-10-11 ChiLin: a comprehensive ChIP-seq and DNase-seq quality control and analysis pipeline Qin, Qian Mei, Shenglin Wu, Qiu Sun, Hanfei Li, Lewyn Taing, Len Chen, Sujun Li, Fugen Liu, Tao Zang, Chongzhi Xu, Han Chen, Yiwen Meyer, Clifford A. Zhang, Yong Brown, Myles Long, Henry W. Liu, X. Shirley BMC Bioinformatics Software BACKGROUND: Transcription factor binding, histone modification, and chromatin accessibility studies are important approaches to understanding the biology of gene regulation. ChIP-seq and DNase-seq have become the standard techniques for studying protein-DNA interactions and chromatin accessibility, respectively, and comprehensive quality control (QC) and analysis tools are critical to extracting the most value from these assay types. Although many analysis and QC tools have been reported, few combine ChIP-seq and DNase-seq data analysis and quality control in a unified framework with a comprehensive and unbiased reference of data quality metrics. RESULTS: ChiLin is a computational pipeline that automates the quality control and data analyses of ChIP-seq and DNase-seq data. It is developed using a flexible and modular software framework that can be easily extended and modified. ChiLin is ideal for batch processing of many datasets and is well suited for large collaborative projects involving ChIP-seq and DNase-seq from different designs. ChiLin generates comprehensive quality control reports that include comparisons with historical data derived from over 23,677 public ChIP-seq and DNase-seq samples (11,265 datasets) from eight literature-based classified categories. To the best of our knowledge, this atlas represents the most comprehensive ChIP-seq and DNase-seq related quality metric resource currently available. These historical metrics provide useful heuristic quality references for experiments across all commonly used assay types. Using representative datasets, we demonstrate the versatility of the pipeline by applying it to different assay types of ChIP-seq data. The pipeline software is available open source at https://github.com/cfce/chilin. CONCLUSION: ChiLin is a scalable and powerful tool to process large batches of ChIP-seq and DNase-seq datasets. The analysis output and quality metrics have been structured into user-friendly directories and reports. We have successfully compiled 23,677 profiles into a comprehensive quality atlas with fine classification for users. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1274-4) contains supplementary material, which is available to authorized users. BioMed Central 2016-10-03 /pmc/articles/PMC5048594/ /pubmed/27716038 http://dx.doi.org/10.1186/s12859-016-1274-4 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title ChiLin: a comprehensive ChIP-seq and DNase-seq quality control and analysis pipeline
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5048594/
https://www.ncbi.nlm.nih.gov/pubmed/27716038
http://dx.doi.org/10.1186/s12859-016-1274-4