Generative modeling of multi-mapping reads with mHi-C advances analysis of Hi-C studies
Current Hi-C analysis approaches are unable to account for reads that align to multiple locations, and hence underestimate biological signal from repetitive regions of genomes. We developed and validated mHi-C, a multi-read mapping strategy to probabilistically allocate Hi-C multi-reads. mHi-C exhib...
Main authors: | Zheng, Ye; Ay, Ferhat; Keles, Sunduz |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | eLife Sciences Publications, Ltd, 2019 |
Subjects: | Computational and Systems Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6450682/ https://www.ncbi.nlm.nih.gov/pubmed/30702424 http://dx.doi.org/10.7554/eLife.38070 |
_version_ | 1783409062771163136 |
---|---|
author | Zheng, Ye Ay, Ferhat Keles, Sunduz |
author_facet | Zheng, Ye Ay, Ferhat Keles, Sunduz |
author_sort | Zheng, Ye |
collection | PubMed |
description | Current Hi-C analysis approaches are unable to account for reads that align to multiple locations, and hence underestimate biological signal from repetitive regions of genomes. We developed and validated mHi-C, a multi-read mapping strategy to probabilistically allocate Hi-C multi-reads. mHi-C exhibited superior performance over utilizing only uni-reads and heuristic approaches aimed at rescuing multi-reads on benchmarks. Specifically, mHi-C increased the sequencing depth by an average of 20% resulting in higher reproducibility of contact matrices and detected interactions across biological replicates. The impact of the multi-reads on the detection of significant interactions is influenced marginally by the relative contribution of multi-reads to the sequencing depth compared to uni-reads, cis-to-trans ratio of contacts, and the broad data quality as reflected by the proportion of mappable reads of datasets. Computational experiments highlighted that in Hi-C studies with short read lengths, mHi-C rescued multi-reads can emulate the effect of longer reads. mHi-C also revealed biologically supported bona fide promoter-enhancer interactions and topologically associating domains involving repetitive genomic regions, thereby unlocking a previously masked portion of the genome for conformation capture studies. |
format | Online Article Text |
id | pubmed-6450682 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | eLife Sciences Publications, Ltd |
record_format | MEDLINE/PubMed |
spelling | pubmed-6450682 2019-04-08 Generative modeling of multi-mapping reads with mHi-C advances analysis of Hi-C studies Zheng, Ye Ay, Ferhat Keles, Sunduz eLife Computational and Systems Biology eLife Sciences Publications, Ltd 2019-01-31 /pmc/articles/PMC6450682/ /pubmed/30702424 http://dx.doi.org/10.7554/eLife.38070 Text en © 2019, Zheng et al. This article is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use and redistribution provided that the original author and source are credited. |
title | Generative modeling of multi-mapping reads with mHi-C advances analysis of Hi-C studies |
topic | Computational and Systems Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6450682/ https://www.ncbi.nlm.nih.gov/pubmed/30702424 http://dx.doi.org/10.7554/eLife.38070 |