HiC-bench: comprehensive and reproducible Hi-C data analysis designed for parameter exploration and benchmarking

BACKGROUND: Chromatin conformation capture techniques have evolved rapidly over the last few years and have provided new insights into genome organization at an unprecedented resolution. Analysis of Hi-C data is complex and computationally intensive involving multiple tasks and requiring robust qual...

Bibliographic Details
Main authors: Lazaris, Charalampos; Kelly, Stephen; Ntziachristos, Panagiotis; Aifantis, Iannis; Tsirigos, Aristotelis
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5217551/
https://www.ncbi.nlm.nih.gov/pubmed/28056762
http://dx.doi.org/10.1186/s12864-016-3387-6
author Lazaris, Charalampos
Kelly, Stephen
Ntziachristos, Panagiotis
Aifantis, Iannis
Tsirigos, Aristotelis
collection PubMed
description BACKGROUND: Chromatin conformation capture techniques have evolved rapidly over the last few years and have provided new insights into genome organization at an unprecedented resolution. Analysis of Hi-C data is complex and computationally intensive, involving multiple tasks and requiring robust quality assessment. This has led to the development of several tools and methods for processing Hi-C data. However, most of the existing tools do not cover all aspects of the analysis and offer only a few quality assessment options. Additionally, the availability of a multitude of tools makes scientists wonder how these tools and their associated parameters can be used optimally, and how potential discrepancies can be interpreted and resolved. Most importantly, investigators need to be assured that slight changes in parameters and/or methods do not affect the conclusions of their studies. RESULTS: To address these issues (compare, explore and reproduce), we introduce HiC-bench, a configurable computational platform for comprehensive and reproducible analysis of Hi-C sequencing data. HiC-bench performs all common Hi-C analysis tasks, such as alignment, filtering, contact matrix generation and normalization, identification of topological domains, and scoring and annotation of specific interactions, using both published tools and our own. We have also embedded various tasks that perform quality assessment and visualization. HiC-bench is implemented as a data flow platform with an emphasis on analysis reproducibility. Additionally, the user can readily perform parameter exploration and comparison of different tools in a combinatorial manner that takes into account all desired parameter settings in each pipeline task. This unique feature facilitates the design and execution of complex benchmark studies that may involve combinations of multiple tool/parameter choices in each step of the analysis.
To demonstrate the usefulness of our platform, we performed a comprehensive benchmark of existing and new TAD callers, exploring different matrix correction methods, parameter settings and sequencing depths. Users can extend our pipeline by adding more tools as they become available. CONCLUSIONS: HiC-bench is an easy-to-use and extensible platform for comprehensive analysis of Hi-C datasets. We expect that it will facilitate current analyses and help scientists formulate and test new hypotheses in the field of three-dimensional genome organization. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3387-6) contains supplementary material, which is available to authorized users.
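The combinatorial parameter exploration described in the abstract can be illustrated with a minimal Python sketch. This is not HiC-bench code: the step names, tool names, and values below are hypothetical placeholders, and the sketch only shows the general idea of enumerating every combination of per-step choices so that each run records the full set of settings that produced it.

```python
from itertools import product

# Hypothetical pipeline steps and candidate settings; none of these names
# or values are taken from HiC-bench itself.
PIPELINE_OPTIONS = {
    "matrix_correction": ["method_a", "method_b"],
    "resolution_kb": [40, 100],
    "tad_caller": ["caller_x", "caller_y", "caller_z"],
}


def enumerate_runs(options):
    """Return one dict per combination of settings, in a stable order."""
    steps = list(options)
    value_lists = [options[step] for step in steps]
    return [dict(zip(steps, combo)) for combo in product(*value_lists)]


def run_label(run):
    """Build a directory-style label that records every setting of a run."""
    return "/".join(f"{step}={value}" for step, value in run.items())


runs = enumerate_runs(PIPELINE_OPTIONS)
labels = [run_label(run) for run in runs]

for label in labels:
    print(label)
```

Labeling each run with all of its settings is one simple way to keep benchmark outputs traceable; the number of runs grows as the product of the option counts per step, so the option lists are worth keeping short.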
format Online
Article
Text
id pubmed-5217551
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5217551 2017-01-09 HiC-bench: comprehensive and reproducible Hi-C data analysis designed for parameter exploration and benchmarking. Lazaris, Charalampos; Kelly, Stephen; Ntziachristos, Panagiotis; Aifantis, Iannis; Tsirigos, Aristotelis. BMC Genomics. Software. BioMed Central 2017-01-05 /pmc/articles/PMC5217551/ /pubmed/28056762 http://dx.doi.org/10.1186/s12864-016-3387-6 Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title HiC-bench: comprehensive and reproducible Hi-C data analysis designed for parameter exploration and benchmarking
topic Software