A general approach for discriminative de novo motif discovery from high-throughput data
De novo motif discovery has been an important challenge of bioinformatics for the past two decades. Since the emergence of high-throughput techniques like ChIP-seq, ChIP-exo and protein-binding microarrays (PBMs), the focus of de novo motif discovery has shifted to runtime and accuracy on large data...
Main authors: | Grau, Jan; Posch, Stefan; Grosse, Ivo; Keilwagen, Jens |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2013 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3834837/ https://www.ncbi.nlm.nih.gov/pubmed/24057214 http://dx.doi.org/10.1093/nar/gkt831 |
_version_ | 1782292053642706944 |
---|---|
author | Grau, Jan Posch, Stefan Grosse, Ivo Keilwagen, Jens |
author_facet | Grau, Jan Posch, Stefan Grosse, Ivo Keilwagen, Jens |
author_sort | Grau, Jan |
collection | PubMed |
description | De novo motif discovery has been an important challenge of bioinformatics for the past two decades. Since the emergence of high-throughput techniques like ChIP-seq, ChIP-exo and protein-binding microarrays (PBMs), the focus of de novo motif discovery has shifted to runtime and accuracy on large data sets. For this purpose, specialized algorithms have been designed for discovering motifs in ChIP-seq or PBM data. However, none of the existing approaches work perfectly for all three high-throughput techniques. In this article, we propose Dimont, a general approach for fast and accurate de novo motif discovery from high-throughput data. We demonstrate that Dimont yields a higher number of correct motifs from ChIP-seq data than any of the specialized approaches and achieves a higher accuracy for predicting PBM intensities from probe sequence than any of the approaches specifically designed for that purpose. Dimont also reports the expected motifs for several ChIP-exo data sets. Investigating differences between in vitro and in vivo binding, we find that for most transcription factors, the motifs discovered by Dimont are in good accordance between techniques, but we also find notable exceptions. We also observe that modeling intra-motif dependencies may increase accuracy, which indicates that more complex motif models are a worthwhile field of research. |
format | Online Article Text |
id | pubmed-3834837 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3834837 2013-11-21 A general approach for discriminative de novo motif discovery from high-throughput data Grau, Jan Posch, Stefan Grosse, Ivo Keilwagen, Jens Nucleic Acids Res Methods Online De novo motif discovery has been an important challenge of bioinformatics for the past two decades. Since the emergence of high-throughput techniques like ChIP-seq, ChIP-exo and protein-binding microarrays (PBMs), the focus of de novo motif discovery has shifted to runtime and accuracy on large data sets. For this purpose, specialized algorithms have been designed for discovering motifs in ChIP-seq or PBM data. However, none of the existing approaches work perfectly for all three high-throughput techniques. In this article, we propose Dimont, a general approach for fast and accurate de novo motif discovery from high-throughput data. We demonstrate that Dimont yields a higher number of correct motifs from ChIP-seq data than any of the specialized approaches and achieves a higher accuracy for predicting PBM intensities from probe sequence than any of the approaches specifically designed for that purpose. Dimont also reports the expected motifs for several ChIP-exo data sets. Investigating differences between in vitro and in vivo binding, we find that for most transcription factors, the motifs discovered by Dimont are in good accordance between techniques, but we also find notable exceptions. We also observe that modeling intra-motif dependencies may increase accuracy, which indicates that more complex motif models are a worthwhile field of research. Oxford University Press 2013-11 2013-09-19 /pmc/articles/PMC3834837/ /pubmed/24057214 http://dx.doi.org/10.1093/nar/gkt831 Text en © The Author(s) 2013. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Grau, Jan Posch, Stefan Grosse, Ivo Keilwagen, Jens A general approach for discriminative de novo motif discovery from high-throughput data |
title | A general approach for discriminative de novo motif discovery from high-throughput data |
title_full | A general approach for discriminative de novo motif discovery from high-throughput data |
title_fullStr | A general approach for discriminative de novo motif discovery from high-throughput data |
title_full_unstemmed | A general approach for discriminative de novo motif discovery from high-throughput data |
title_short | A general approach for discriminative de novo motif discovery from high-throughput data |
title_sort | general approach for discriminative de novo motif discovery from high-throughput data |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3834837/ https://www.ncbi.nlm.nih.gov/pubmed/24057214 http://dx.doi.org/10.1093/nar/gkt831 |