Assessment of composite motif discovery methods

BACKGROUND: Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery – discovering binding motifs for i...

Bibliographic Details
Main Authors: Klepper, Kjetil; Sandve, Geir K; Abul, Osman; Johansen, Jostein; Drablos, Finn
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2311304/
https://www.ncbi.nlm.nih.gov/pubmed/18302777
http://dx.doi.org/10.1186/1471-2105-9-123
author Klepper, Kjetil
Sandve, Geir K
Abul, Osman
Johansen, Jostein
Drablos, Finn
collection PubMed
description BACKGROUND: Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery – discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so-called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery. RESULTS: We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked to predict both the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise. CONCLUSION: Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations.
The variation in performance on individual datasets also shows that the new benchmark datasets represent a suitable variety of challenges to most methods for module discovery.
format Text
id pubmed-2311304
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-23113042008-04-16 Assessment of composite motif discovery methods Klepper, Kjetil Sandve, Geir K Abul, Osman Johansen, Jostein Drablos, Finn BMC Bioinformatics Research Article BACKGROUND: Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery – discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so-called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery. RESULTS: We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked to predict both the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise.
CONCLUSION: Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations. The variation in performance on individual datasets also shows that the new benchmark datasets represent a suitable variety of challenges to most methods for module discovery. BioMed Central 2008-02-26 /pmc/articles/PMC2311304/ /pubmed/18302777 http://dx.doi.org/10.1186/1471-2105-9-123 Text en Copyright © 2008 Klepper et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Assessment of composite motif discovery methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2311304/
https://www.ncbi.nlm.nih.gov/pubmed/18302777
http://dx.doi.org/10.1186/1471-2105-9-123