SpectralTAD: an R package for defining a hierarchy of topologically associated domains using spectral clustering
BACKGROUND: The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops. I...
Main authors: | Cresswell, Kellen G., Stansfield, John C., Dozmorov, Mikhail G. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7372752/ https://www.ncbi.nlm.nih.gov/pubmed/32689928 http://dx.doi.org/10.1186/s12859-020-03652-w |
_version_ | 1783561373689577472 |
---|---|
author | Cresswell, Kellen G. Stansfield, John C. Dozmorov, Mikhail G. |
author_facet | Cresswell, Kellen G. Stansfield, John C. Dozmorov, Mikhail G. |
author_sort | Cresswell, Kellen G. |
collection | PubMed |
description | BACKGROUND: The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops. Identifying such hierarchical structures is a critical step in understanding genome regulation. Existing tools for TAD calling are frequently sensitive to biases in Hi-C data, depend on tunable parameters, and are computationally inefficient. METHODS: To address these challenges, we developed a novel sliding window-based spectral clustering framework that uses gaps between consecutive eigenvectors for TAD boundary identification. RESULTS: Our method, implemented in an R package, SpectralTAD, detects hierarchical, biologically relevant TADs, has automatic parameter selection, and is robust to sequencing depth, resolution, and sparsity of Hi-C data. SpectralTAD outperforms four state-of-the-art TAD callers in simulated and experimental settings. We demonstrate that TAD boundaries shared among multiple levels of the TAD hierarchy were more enriched in classical boundary marks and more conserved across cell lines and tissues. In contrast, boundaries of TADs that cannot be split into sub-TADs showed less enrichment and conservation, suggesting their more dynamic role in genome regulation. CONCLUSION: SpectralTAD is available on Bioconductor, http://bioconductor.org/packages/SpectralTAD/. |
format | Online Article Text |
id | pubmed-7372752 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7372752 2020-07-21 SpectralTAD: an R package for defining a hierarchy of topologically associated domains using spectral clustering Cresswell, Kellen G. Stansfield, John C. Dozmorov, Mikhail G. BMC Bioinformatics Software BACKGROUND: The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops. Identifying such hierarchical structures is a critical step in understanding genome regulation. Existing tools for TAD calling are frequently sensitive to biases in Hi-C data, depend on tunable parameters, and are computationally inefficient. METHODS: To address these challenges, we developed a novel sliding window-based spectral clustering framework that uses gaps between consecutive eigenvectors for TAD boundary identification. RESULTS: Our method, implemented in an R package, SpectralTAD, detects hierarchical, biologically relevant TADs, has automatic parameter selection, and is robust to sequencing depth, resolution, and sparsity of Hi-C data. SpectralTAD outperforms four state-of-the-art TAD callers in simulated and experimental settings. We demonstrate that TAD boundaries shared among multiple levels of the TAD hierarchy were more enriched in classical boundary marks and more conserved across cell lines and tissues. In contrast, boundaries of TADs that cannot be split into sub-TADs showed less enrichment and conservation, suggesting their more dynamic role in genome regulation. CONCLUSION: SpectralTAD is available on Bioconductor, http://bioconductor.org/packages/SpectralTAD/. 
BioMed Central 2020-07-20 /pmc/articles/PMC7372752/ /pubmed/32689928 http://dx.doi.org/10.1186/s12859-020-03652-w Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Software Cresswell, Kellen G. Stansfield, John C. Dozmorov, Mikhail G. SpectralTAD: an R package for defining a hierarchy of topologically associated domains using spectral clustering |
title | SpectralTAD: an R package for defining a hierarchy of topologically associated domains using spectral clustering |
title_sort | spectraltad: an r package for defining a hierarchy of topologically associated domains using spectral clustering |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7372752/ https://www.ncbi.nlm.nih.gov/pubmed/32689928 http://dx.doi.org/10.1186/s12859-020-03652-w |
work_keys_str_mv | AT cresswellkelleng spectraltadanrpackagefordefiningahierarchyoftopologicallyassociateddomainsusingspectralclustering AT stansfieldjohnc spectraltadanrpackagefordefiningahierarchyoftopologicallyassociateddomainsusingspectralclustering AT dozmorovmikhailg spectraltadanrpackagefordefiningahierarchyoftopologicallyassociateddomainsusingspectralclustering |
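
The description field mentions a spectral clustering framework that places TAD boundaries at gaps between consecutive eigenvectors. As a rough illustration of that general idea only, the Python sketch below embeds each Hi-C bin using the leading eigenvectors of a normalized graph Laplacian and cuts where neighboring bins are farthest apart. It is not the SpectralTAD implementation (which is an R package and, per the description, uses a sliding window and automatic parameter selection); the function names `eigenvector_gaps` and `call_boundaries`, the fixed `n_domains` argument, and the toy matrix are all assumptions made for this example.

```python
import numpy as np


def eigenvector_gaps(contacts, n_vectors=2):
    """Distance between consecutive bins in a spectral embedding.

    Illustrative helper (hypothetical name). Assumes `contacts` is a
    symmetric, non-negative matrix in which every bin has at least
    one contact, so no degree is zero.
    """
    contacts = np.asarray(contacts, dtype=float)
    degree = contacts.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    laplacian = np.eye(len(contacts)) - inv_sqrt[:, None] * contacts * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the leading vectors
    _, vectors = np.linalg.eigh(laplacian)
    embedding = vectors[:, :n_vectors]
    # Scale each bin's coordinates to unit length
    embedding = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)
    # Gap i is the distance between bin i and bin i + 1
    return np.linalg.norm(np.diff(embedding, axis=0), axis=1)


def call_boundaries(contacts, n_domains):
    """Return the first bin of each new domain, using the largest gaps.

    The number of domains is supplied by hand here; choosing it
    automatically is outside the scope of this sketch.
    """
    if n_domains < 2:
        return np.array([], dtype=int)
    gaps = eigenvector_gaps(contacts, n_vectors=n_domains)
    largest = np.argsort(gaps)[-(n_domains - 1):]
    return np.sort(largest) + 1


# Toy contact matrix: two dense 5-bin blocks with weak contacts between them
toy = np.ones((10, 10))
toy[:5, :5] = 10
toy[5:, 5:] = 10
print(call_boundaries(toy, n_domains=2))
```

On this toy matrix the single cut is expected to fall where the second block begins. Real Hi-C data is sparse and noisy, so zero-degree bins and the choice of the number of domains would need handling that this sketch leaves out.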