Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies
BACKGROUND: Chromosome conformation capture (3C) and Hi-C DNA sequencing methods have rapidly advanced our understanding of the spatial organization of genomes and metagenomes. Many variants of these protocols have been developed, each with their own strengths. Currently there is no systematic means...
Main authors: | DeMaere, Matthew Z; Darling, Aaron E |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2017 |
Subjects: | Technical Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5827349/ https://www.ncbi.nlm.nih.gov/pubmed/29149264 http://dx.doi.org/10.1093/gigascience/gix103 |
_version_ | 1783302462291050496 |
---|---|
author | DeMaere, Matthew Z Darling, Aaron E |
author_facet | DeMaere, Matthew Z Darling, Aaron E |
author_sort | DeMaere, Matthew Z |
collection | PubMed |
description | BACKGROUND: Chromosome conformation capture (3C) and Hi-C DNA sequencing methods have rapidly advanced our understanding of the spatial organization of genomes and metagenomes. Many variants of these protocols have been developed, each with their own strengths. Currently there is no systematic means for simulating sequence data from this family of sequencing protocols, potentially hindering the advancement of algorithms to exploit this new datatype. FINDINGS: We describe a computational simulator that, given simple parameters and reference genome sequences, will simulate Hi-C sequencing on those sequences. The simulator models the basic spatial structure in genomes that is commonly observed in Hi-C and 3C datasets, including the distance-decay relationship in proximity ligation, differences in the frequency of interaction within and across chromosomes, and the structure imposed by cells. A means to model the 3D structure of randomly generated topologically associating domains is provided. The simulator considers several sources of error common to 3C and Hi-C library preparation and sequencing methods, including spurious proximity ligation events and sequencing error. CONCLUSIONS: We have introduced the first comprehensive simulator for 3C and Hi-C sequencing protocols. We expect the simulator to have use in testing of Hi-C data analysis algorithms, as well as more general value for experimental design, where questions such as the required depth of sequencing, enzyme choice, and other decisions can be made in advance in order to ensure adequate statistical power with respect to experimental hypothesis testing. |
format | Online Article Text |
id | pubmed-5827349 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5827349 2018-03-05 Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies DeMaere, Matthew Z Darling, Aaron E Gigascience Technical Note BACKGROUND: Chromosome conformation capture (3C) and Hi-C DNA sequencing methods have rapidly advanced our understanding of the spatial organization of genomes and metagenomes. Many variants of these protocols have been developed, each with their own strengths. Currently there is no systematic means for simulating sequence data from this family of sequencing protocols, potentially hindering the advancement of algorithms to exploit this new datatype. FINDINGS: We describe a computational simulator that, given simple parameters and reference genome sequences, will simulate Hi-C sequencing on those sequences. The simulator models the basic spatial structure in genomes that is commonly observed in Hi-C and 3C datasets, including the distance-decay relationship in proximity ligation, differences in the frequency of interaction within and across chromosomes, and the structure imposed by cells. A means to model the 3D structure of randomly generated topologically associating domains is provided. The simulator considers several sources of error common to 3C and Hi-C library preparation and sequencing methods, including spurious proximity ligation events and sequencing error. CONCLUSIONS: We have introduced the first comprehensive simulator for 3C and Hi-C sequencing protocols. We expect the simulator to have use in testing of Hi-C data analysis algorithms, as well as more general value for experimental design, where questions such as the required depth of sequencing, enzyme choice, and other decisions can be made in advance in order to ensure adequate statistical power with respect to experimental hypothesis testing. Oxford University Press 2017-11-15 /pmc/articles/PMC5827349/ /pubmed/29149264 http://dx.doi.org/10.1093/gigascience/gix103 Text en © The Authors 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note DeMaere, Matthew Z Darling, Aaron E Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies |
title | Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies |
title_full | Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies |
title_fullStr | Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies |
title_full_unstemmed | Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies |
title_short | Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies |
title_sort | sim3c: simulation of hi-c and meta3c proximity ligation sequencing technologies |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5827349/ https://www.ncbi.nlm.nih.gov/pubmed/29149264 http://dx.doi.org/10.1093/gigascience/gix103 |
work_keys_str_mv | AT demaerematthewz sim3csimulationofhicandmeta3cproximityligationsequencingtechnologies AT darlingaarone sim3csimulationofhicandmeta3cproximityligationsequencingtechnologies |