preciseTAD: a transfer learning framework for 3D domain boundary prediction at base-pair resolution

Bibliographic Details
Main Authors: Stilianoudakis, Spiro C; Marshall, Maggie A; Dozmorov, Mikhail G
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8756196/
https://www.ncbi.nlm.nih.gov/pubmed/34741515
http://dx.doi.org/10.1093/bioinformatics/btab743
Description
Summary:
MOTIVATION: Chromosome conformation capture technologies (Hi-C) revealed extensive DNA folding into discrete 3D domains, such as Topologically Associating Domains and chromatin loops. The correct binding of CTCF and cohesin at domain boundaries is integral in maintaining the proper structure and function of these 3D domains. 3D domains have been mapped at the resolutions of 1 kilobase and above. However, it has not been possible to define their boundaries at the resolution of boundary-forming proteins.
RESULTS: To predict domain boundaries at base-pair resolution, we developed preciseTAD, an optimized transfer learning framework trained on high-resolution genome annotation data. In contrast to current TAD/loop callers, preciseTAD-predicted boundaries are strongly supported by experimental evidence. Importantly, this approach can accurately delineate boundaries in cells without Hi-C data. preciseTAD provides a powerful framework to improve our understanding of how genomic regulators are shaping the 3D structure of the genome at base-pair resolution.
AVAILABILITY AND IMPLEMENTATION: preciseTAD is an R/Bioconductor package available at https://bioconductor.org/packages/preciseTAD/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.