HiCRep.py: fast comparison of Hi-C contact matrices in Python

Bibliographic Details
Main Authors: Lin, Dejun; Sanders, Justin; Noble, William Stafford
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479650/
https://www.ncbi.nlm.nih.gov/pubmed/33576390
http://dx.doi.org/10.1093/bioinformatics/btab097
Description
Summary:
MOTIVATION: Hi-C is the most widely used assay for investigating genome-wide 3D organization of chromatin. When working with Hi-C data, it is often useful to calculate the similarity between contact matrices in order to assess experimental reproducibility or to quantify relationships among Hi-C data from related samples. The HiCRep algorithm has been widely adopted for this task, but the existing R implementation suffers from run time limitations on high-resolution Hi-C data or on large single-cell Hi-C datasets.
RESULTS: We introduce a Python implementation of HiCRep and demonstrate that it is much faster and consumes much less memory than the existing R implementation. Furthermore, we give examples of HiCRep’s ability to accurately distinguish replicates from non-replicates and to reveal cell type structure among collections of Hi-C data.
AVAILABILITY AND IMPLEMENTATION: HiCRep.py and its documentation are available with a GPL license at https://github.com/Noble-Lab/hicrep. The software may be installed automatically using the pip package installer.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.