Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs

Comprehensive motif discovery under experimental conditions is critical for the global understanding of gene regulation. To generate a nearly complete list of human DNA motifs under given conditions, we employed a novel approach to de novo discover significant co-occurring DNA motifs in 349 human DN...

Bibliographic Details
Main authors: Zheng, Yiyu, Li, Xiaoman, Hu, Haiyan
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4288161/
https://www.ncbi.nlm.nih.gov/pubmed/25505144
http://dx.doi.org/10.1093/nar/gku1261
_version_ 1782351919789899776
author Zheng, Yiyu
Li, Xiaoman
Hu, Haiyan
author_facet Zheng, Yiyu
Li, Xiaoman
Hu, Haiyan
author_sort Zheng, Yiyu
collection PubMed
description Comprehensive motif discovery under experimental conditions is critical for the global understanding of gene regulation. To generate a nearly complete list of human DNA motifs under given conditions, we employed a novel approach to de novo discover significant co-occurring DNA motifs in 349 human DNase I hypersensitive site datasets. We predicted 845 to 1325 motifs in each dataset, for a total of 2684 non-redundant motifs. These 2684 motifs contained 54.02 to 75.95% of the known motifs in seven large collections including TRANSFAC. In each dataset, we also discovered 43 663 to 2 013 288 motif modules, groups of motifs with their binding sites co-occurring in a significant number of short DNA regions. Compared with known interacting transcription factors in eight resources, the predicted motif modules on average included 84.23% of known interacting motifs. We further showed new features of the predicted motifs, such as motifs enriched in proximal regions rarely overlapped with motifs enriched in distal regions, motifs enriched in 5′ distal regions were often enriched in 3′ distal regions, etc. Finally, we observed that the 2684 predicted motifs classified the cell or tissue types of the datasets with an accuracy of 81.29%. The resources generated in this study are available at http://server.cs.ucf.edu/predrem/.
format Online
Article
Text
id pubmed-4288161
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4288161 2015-02-19 Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs Zheng, Yiyu Li, Xiaoman Hu, Haiyan Nucleic Acids Res Computational Biology Comprehensive motif discovery under experimental conditions is critical for the global understanding of gene regulation. To generate a nearly complete list of human DNA motifs under given conditions, we employed a novel approach to de novo discover significant co-occurring DNA motifs in 349 human DNase I hypersensitive site datasets. We predicted 845 to 1325 motifs in each dataset, for a total of 2684 non-redundant motifs. These 2684 motifs contained 54.02 to 75.95% of the known motifs in seven large collections including TRANSFAC. In each dataset, we also discovered 43 663 to 2 013 288 motif modules, groups of motifs with their binding sites co-occurring in a significant number of short DNA regions. Compared with known interacting transcription factors in eight resources, the predicted motif modules on average included 84.23% of known interacting motifs. We further showed new features of the predicted motifs, such as motifs enriched in proximal regions rarely overlapped with motifs enriched in distal regions, motifs enriched in 5′ distal regions were often enriched in 3′ distal regions, etc. Finally, we observed that the 2684 predicted motifs classified the cell or tissue types of the datasets with an accuracy of 81.29%. The resources generated in this study are available at http://server.cs.ucf.edu/predrem/. Oxford University Press 2015-01-09 2014-12-10 /pmc/articles/PMC4288161/ /pubmed/25505144 http://dx.doi.org/10.1093/nar/gku1261 Text en © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Computational Biology
Zheng, Yiyu
Li, Xiaoman
Hu, Haiyan
Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs
title Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs
topic Computational Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4288161/
https://www.ncbi.nlm.nih.gov/pubmed/25505144
http://dx.doi.org/10.1093/nar/gku1261