scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking

Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing cost impedes the generation of biological Hi-C data with high sequencing depths and multiple replicates for downstream analysis. Here, we developed a single-c...

Bibliographic Details
Main authors: Fan, Shichen; Dang, Dachang; Ye, Yusen; Zhang, Shao-Wu; Gao, Lin; Zhang, Shihua
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10308180/
https://www.ncbi.nlm.nih.gov/pubmed/36708167
http://dx.doi.org/10.1093/jmcb/mjad003
author Fan, Shichen
Dang, Dachang
Ye, Yusen
Zhang, Shao-Wu
Gao, Lin
Zhang, Shihua
collection PubMed
description Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing cost impedes the generation of biological Hi-C data with high sequencing depths and multiple replicates for downstream analysis. Here, we developed a single-cell Hi-C simulator (scHi-CSim) that generates high-fidelity data for benchmarking. scHi-CSim merges neighboring cells to overcome the sparseness of data, samples interactions in distance-stratified chromosomes to maintain the heterogeneity of single cells, and estimates the empirical distribution of restriction fragments to generate simulated data. We demonstrated that scHi-CSim can generate high-fidelity data by comparing the performance of single-cell clustering and detection of chromosomal high-order structures with raw data. Furthermore, scHi-CSim is flexible to change sequencing depth and the number of simulated replicates. We showed that increasing sequencing depth could improve the accuracy of detecting topologically associating domains. We also used scHi-CSim to generate a series of simulated datasets with different sequencing depths to benchmark scHi-C clustering methods.
format Online
Article
Text
id pubmed-10308180
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10308180 2023-06-30
scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking
Fan, Shichen; Dang, Dachang; Ye, Yusen; Zhang, Shao-Wu; Gao, Lin; Zhang, Shihua
J Mol Cell Biol
Article
Oxford University Press 2023-01-27
/pmc/articles/PMC10308180/ /pubmed/36708167 http://dx.doi.org/10.1093/jmcb/mjad003
Text en
© The Author(s) (2023). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, CEMCS, CAS.
https://creativecommons.org/licenses/by/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10308180/
https://www.ncbi.nlm.nih.gov/pubmed/36708167
http://dx.doi.org/10.1093/jmcb/mjad003
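
The description in this record mentions sampling interactions within distance-stratified chromosomes. As a rough illustration of that general idea only, and not the authors' implementation, the sketch below resamples a pooled contact list so that each genomic-distance stratum keeps its share of the pool. The function name `stratified_sample`, the representation of contacts as pairs of bin indices, and the proportional multinomial allocation are all assumptions made for this example.

```python
import numpy as np

def stratified_sample(contacts, n_target, rng):
    """Resample n_target contacts, stratified by genomic distance |i - j|.

    Hypothetical helper for illustration; contacts is an (n, 2) array of
    bin-index pairs pooled from one chromosome.
    """
    contacts = np.asarray(contacts)
    dist = np.abs(contacts[:, 0] - contacts[:, 1])
    # One stratum per observed distance, with its contact count in the pool.
    strata, counts = np.unique(dist, return_counts=True)
    # Assumption: allocate the target depth to strata in proportion to the pool.
    alloc = rng.multinomial(n_target, counts / counts.sum())
    sampled = []
    for d, k in zip(strata, alloc):
        idx = np.flatnonzero(dist == d)
        # Sampling with replacement lets the target depth exceed the pool size.
        pick = rng.choice(idx, size=k, replace=True)
        sampled.append(contacts[pick])
    return np.concatenate(sampled)

rng = np.random.default_rng(0)
pool = np.array([[0, 1], [0, 1], [2, 5], [3, 4], [1, 9], [4, 4]])
sim = stratified_sample(pool, 50, rng)
print(sim.shape)
```

Changing `n_target` mimics a change in sequencing depth, and calling the function repeatedly with different seeds mimics simulated replicates, under the assumptions stated above.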