scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking
Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing cost impedes the generation of biological Hi-C data with high sequencing depths and multiple replicates for downstream analysis. Here, we developed a single-c...
Main authors: | Fan, Shichen; Dang, Dachang; Ye, Yusen; Zhang, Shao-Wu; Gao, Lin; Zhang, Shihua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10308180/ https://www.ncbi.nlm.nih.gov/pubmed/36708167 http://dx.doi.org/10.1093/jmcb/mjad003 |
_version_ | 1785066191890415616 |
---|---|
author | Fan, Shichen Dang, Dachang Ye, Yusen Zhang, Shao-Wu Gao, Lin Zhang, Shihua |
collection | PubMed |
description | Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing cost impedes the generation of biological Hi-C data with high sequencing depths and multiple replicates for downstream analysis. Here, we developed a single-cell Hi-C simulator (scHi-CSim) that generates high-fidelity data for benchmarking. scHi-CSim merges neighboring cells to overcome the sparseness of data, samples interactions in distance-stratified chromosomes to maintain the heterogeneity of single cells, and estimates the empirical distribution of restriction fragments to generate simulated data. We demonstrated that scHi-CSim can generate high-fidelity data by comparing the performance of single-cell clustering and detection of chromosomal high-order structures with raw data. Furthermore, scHi-CSim is flexible to change sequencing depth and the number of simulated replicates. We showed that increasing sequencing depth could improve the accuracy of detecting topologically associating domains. We also used scHi-CSim to generate a series of simulated datasets with different sequencing depths to benchmark scHi-C clustering methods. |
format | Online Article Text |
id | pubmed-10308180 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10308180 2023-06-30 scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking Fan, Shichen Dang, Dachang Ye, Yusen Zhang, Shao-Wu Gao, Lin Zhang, Shihua J Mol Cell Biol Article Oxford University Press 2023-01-27 /pmc/articles/PMC10308180/ /pubmed/36708167 http://dx.doi.org/10.1093/jmcb/mjad003 Text en © The Author(s) (2023). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, CEMCS, CAS. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking |
topic | Article |
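
The description mentions sampling interactions in distance-stratified chromosomes and changing the sequencing depth. The sketch below illustrates that general idea only; it is not the authors' implementation. The function name `stratified_resample`, the use of bin-index pairs in place of restriction fragments, and sampling with replacement are all assumptions made for the illustration.

```python
import random
from collections import defaultdict


def stratified_resample(contacts, target_depth, seed=0):
    """Resample (bin_i, bin_j) contacts to roughly `target_depth` contacts.

    Hypothetical helper: contacts are grouped by genomic distance |i - j|,
    and each distance stratum keeps its original share of the total, so the
    distance-decay profile of the input is approximately preserved.
    """
    if not contacts:
        return []

    rng = random.Random(seed)

    # Group contacts into strata keyed by bin distance.
    strata = defaultdict(list)
    for i, j in contacts:
        strata[abs(i - j)].append((i, j))

    total = len(contacts)
    sampled = []
    for distance in sorted(strata):
        pool = strata[distance]
        # Quota proportional to the stratum's share of the input.
        # Rounding means the final size can differ slightly from target_depth.
        quota = round(target_depth * len(pool) / total)
        # Sampling with replacement allows depth to go up as well as down.
        sampled.extend(rng.choices(pool, k=quota))
    return sampled


if __name__ == "__main__":
    toy = [(0, 1)] * 6 + [(0, 3)] * 4
    deeper = stratified_resample(toy, target_depth=20, seed=1)
    print(len(deeper))
```

Raising or lowering `target_depth` mimics a change in sequencing depth, and varying `seed` gives independent simulated replicates from the same input.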