SCSIM: Jointly simulating correlated single-cell and bulk next-generation DNA sequencing data

Bibliographic Details
Main authors: Giguere, Collin; Dubey, Harsh Vardhan; Sarsani, Vishal Kumar; Saddiki, Hachem; He, Shai; Flaherty, Patrick
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7249349/
https://www.ncbi.nlm.nih.gov/pubmed/32456609
http://dx.doi.org/10.1186/s12859-020-03550-1
Description
Summary: BACKGROUND: Recently, it has become possible to collect next-generation DNA sequencing data sets that are composed of multiple samples from multiple biological units where each of these samples may be from a single cell or bulk tissue. However, there does not yet exist a tool for simulating DNA sequencing data from such a nested sampling arrangement with single-cell and bulk samples so that developers of analysis methods can assess accuracy and precision. RESULTS: We have developed a tool that simulates DNA sequencing data from hierarchically grouped (correlated) samples where each sample is designated bulk or single-cell. Our tool uses a simple configuration file to define the experimental arrangement and can be integrated into software pipelines for testing of variant callers or other genomic tools. CONCLUSIONS: The DNA sequencing data generated by our simulator is representative of real data and integrates seamlessly with standard downstream analysis tools.