Hi-Corrector: a fast, scalable and memory-efficient package for normalizing large-scale Hi-C data

Summary: Genome-wide proximity ligation assays, e.g. Hi-C and its variant TCC, have recently become important tools to study spatial genome organization. Removing biases from chromatin contact matrices generated by such techniques is a critical preprocessing step of subsequent analyses. The continui...

Bibliographic Details
Main authors:	Li, Wenyuan; Gong, Ke; Li, Qingjiao; Alber, Frank; Zhou, Xianghong Jasmine
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2015
Subjects:	Applications Notes
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4380031/ https://www.ncbi.nlm.nih.gov/pubmed/25391400 http://dx.doi.org/10.1093/bioinformatics/btu747

author	Li, Wenyuan Gong, Ke Li, Qingjiao Alber, Frank Zhou, Xianghong Jasmine
collection	PubMed
description	Summary: Genome-wide proximity ligation assays, e.g. Hi-C and its variant TCC, have recently become important tools for studying spatial genome organization. Removing biases from the chromatin contact matrices generated by such techniques is a critical preprocessing step for subsequent analyses. The continuing decline of sequencing costs has led to ever-improving resolution of Hi-C data, resulting in very large matrices of chromatin contacts. Such large matrices, however, pose a great challenge to the memory usage and speed of their normalization. Therefore, there is an urgent need for fast and memory-efficient methods for normalizing Hi-C data. We developed Hi-Corrector, an easy-to-use, open source implementation of the Hi-C data normalization algorithm. Its salient features are (i) scalability: the software is capable of normalizing Hi-C data of any size in reasonable time; (ii) memory efficiency: the sequential version can run on any single computer with very limited memory, no matter how little; (iii) speed: the parallel version can run very fast on multiple computing nodes with limited local memory. Availability and implementation: The sequential version is implemented in ANSI C and can be easily compiled on any system; the parallel version is implemented in ANSI C with the MPI library (a standardized and portable parallel environment designed for solving large-scale scientific problems). The package is freely available at http://zhoulab.usc.edu/Hi-Corrector/. Contact: alber@usc.edu or xjzhou@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-4380031
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-4380031 2015-04-15 Hi-Corrector: a fast, scalable and memory-efficient package for normalizing large-scale Hi-C data Li, Wenyuan Gong, Ke Li, Qingjiao Alber, Frank Zhou, Xianghong Jasmine Bioinformatics Applications Notes Oxford University Press 2015-03-15 2014-11-12 /pmc/articles/PMC4380031/ /pubmed/25391400 http://dx.doi.org/10.1093/bioinformatics/btu747 Text en © The Author 2014. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Hi-Corrector: a fast, scalable and memory-efficient package for normalizing large-scale Hi-C data
topic	Applications Notes
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4380031/ https://www.ncbi.nlm.nih.gov/pubmed/25391400 http://dx.doi.org/10.1093/bioinformatics/btu747

Similar Items