TADKB: Family classification and a knowledge base of topologically associating domains

BACKGROUND: Topologically associating domains (TADs) are considered the structural and functional units of the genome. However, there is a lack of an integrated resource for TADs in the literature where researchers can obtain family classifications and detailed information about TADs. RESULTS: We bu...

Bibliographic Details
Main authors: Liu, Tong; Porter, Jacob; Zhao, Chenguang; Zhu, Hao; Wang, Nan; Sun, Zheng; Mo, Yin-Yuan; Wang, Zheng
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Database Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6419456/
https://www.ncbi.nlm.nih.gov/pubmed/30871473
http://dx.doi.org/10.1186/s12864-019-5551-2
author Liu, Tong
Porter, Jacob
Zhao, Chenguang
Zhu, Hao
Wang, Nan
Sun, Zheng
Mo, Yin-Yuan
Wang, Zheng
collection PubMed
description BACKGROUND: Topologically associating domains (TADs) are considered the structural and functional units of the genome. However, the literature lacks an integrated resource from which researchers can obtain family classifications and detailed information about TADs. RESULTS: We built an online knowledge base, TADKB, that integrates knowledge about TADs in eleven human and mouse cell types. For each TAD, TADKB provides the predicted three-dimensional (3D) structures of chromosomes and TADs, and detailed annotations of the protein-coding genes and long non-coding RNAs (lncRNAs) located in each TAD. Besides the 3D chromosomal structures inferred from population Hi-C, the single-cell haplotype-resolved chromosomal 3D structures of 17 GM12878 cells are also integrated into TADKB. A user can submit a gene or lncRNA ID or sequence as a query to search for the TAD(s) that contain(s) it. We also classified TADs into families. To do so, we used the TM-scores between reconstructed 3D structures of TADs as structural similarities and the Pearson’s correlation coefficients between the fold enrichment of chromatin states as functional similarities. All of the TADs in one cell type were clustered by structural similarity and, separately, by functional similarity, using the spectral clustering algorithm with various predefined numbers of clusters. We compared the overlapping TADs from structural and functional clusters and found that most of the TADs in the functional clusters with depleted chromatin states fall into one or two structural clusters. This novel finding indicates a connection between the 3D structures of TADs and their DNA functions in terms of chromatin states. CONCLUSION: TADKB is available at http://dna.cs.miami.edu/TADKB/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-019-5551-2) contains supplementary material, which is available to authorized users.
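The abstract describes two steps that a small example may make more concrete: scoring functional similarity between TADs with Pearson's correlation over chromatin-state fold enrichment, and comparing how the TADs of each functional cluster are distributed across structural clusters. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, TAD identifiers, enrichment values, and cluster labels are all hypothetical, and neither the spectral clustering step nor the TM-score computation is reproduced here.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def similarity_matrix(profiles):
    """Pairwise Pearson similarities between per-TAD enrichment profiles."""
    ids = sorted(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in ids} for a in ids}


def cluster_overlap(functional_labels, structural_labels):
    """For each functional cluster, count its TADs per structural cluster."""
    table = {}
    for tad, f_label in functional_labels.items():
        s_label = structural_labels[tad]
        table.setdefault(f_label, Counter())[s_label] += 1
    return table


# Hypothetical toy data: fold enrichment of three chromatin states in four TADs.
profiles = {
    "tad1": [2.0, 1.0, 0.5],
    "tad2": [4.0, 2.0, 1.0],
    "tad3": [0.5, 1.0, 2.0],
    "tad4": [1.0, 2.0, 4.0],
}
sim = similarity_matrix(profiles)

# Hypothetical cluster labels, as if produced by two separate clustering runs.
functional = {"tad1": "F0", "tad2": "F0", "tad3": "F1", "tad4": "F1"}
structural = {"tad1": "S0", "tad2": "S0", "tad3": "S0", "tad4": "S1"}
overlap = cluster_overlap(functional, structural)
```

In this toy setup, a functional cluster whose TADs all land in a single structural cluster (here the hypothetical `F0`) is the kind of pattern the abstract reports for functional clusters with depleted chromatin states. A similarity matrix like `sim` could then be handed to a spectral clustering routine that accepts precomputed affinities, after shifting or rescaling it so that no entry is negative.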
format Online
Article
Text
id pubmed-6419456
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6419456 2019-03-27 TADKB: Family classification and a knowledge base of topologically associating domains Liu, Tong; Porter, Jacob; Zhao, Chenguang; Zhu, Hao; Wang, Nan; Sun, Zheng; Mo, Yin-Yuan; Wang, Zheng BMC Genomics Database Article BioMed Central 2019-03-14 /pmc/articles/PMC6419456/ /pubmed/30871473 http://dx.doi.org/10.1186/s12864-019-5551-2 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title TADKB: Family classification and a knowledge base of topologically associating domains
topic Database Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6419456/
https://www.ncbi.nlm.nih.gov/pubmed/30871473
http://dx.doi.org/10.1186/s12864-019-5551-2