
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved i...


Bibliographic Details
Main Authors: De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4653392/
https://www.ncbi.nlm.nih.gov/pubmed/26254488
http://dx.doi.org/10.1093/bioinformatics/btv466
_version_ 1782401900953468928
author De Witte, Dieter
Van de Velde, Jan
Decap, Dries
Van Bel, Michiel
Audenaert, Pieter
Demeester, Piet
Dhoedt, Bart
Vandepoele, Klaas
Fostier, Jan
author_facet De Witte, Dieter
Van de Velde, Jan
Decap, Dries
Van Bel, Michiel
Audenaert, Pieter
Demeester, Piet
Dhoedt, Bart
Vandepoele, Klaas
Fostier, Jan
author_sort De Witte, Dieter
collection PubMed
description Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4653392
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-46533922015-11-20 BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements De Witte, Dieter Van de Velde, Jan Decap, Dries Van Bel, Michiel Audenaert, Pieter Demeester, Piet Dhoedt, Bart Vandepoele, Klaas Fostier, Jan Bioinformatics Original Papers Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. Availability and implementation: BLSSpeller was written in Java. 
Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2015-12-01 2015-08-08 /pmc/articles/PMC4653392/ /pubmed/26254488 http://dx.doi.org/10.1093/bioinformatics/btv466 Text en © The Author 2015. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Original Papers
De Witte, Dieter
Van de Velde, Jan
Decap, Dries
Van Bel, Michiel
Audenaert, Pieter
Demeester, Piet
Dhoedt, Bart
Vandepoele, Klaas
Fostier, Jan
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
title BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
title_full BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
title_fullStr BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
title_full_unstemmed BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
title_short BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
title_sort blsspeller: exhaustive comparative discovery of conserved cis-regulatory elements
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4653392/
https://www.ncbi.nlm.nih.gov/pubmed/26254488
http://dx.doi.org/10.1093/bioinformatics/btv466
work_keys_str_mv AT dewittedieter blsspellerexhaustivecomparativediscoveryofconservedcisregulatoryelements
AT vandeveldejan blsspellerexhaustivecomparativediscoveryofconservedcisregulatoryelements
AT decapdries blsspellerexhaustivecomparativediscoveryofconservedcisregulatoryelements
AT vanbelmichiel blsspellerexhaustivecomparativediscoveryofconservedcisregulatoryelements
AT audenaertpieter blsspellerexhaustivecomparativediscoveryofconservedcisregulatoryelements
AT demeesterpiet blsspellerexhaustivecomparativediscoveryofconservedcisregulatoryelements
AT dhoedtbart blsspellerexhaustivecomparativediscoveryofconservedcisregulatoryelements
AT vandepoeleklaas blsspellerexhaustivecomparativediscoveryofconservedcisregulatoryelements
AT fostierjan blsspellerexhaustivecomparativediscoveryofconservedcisregulatoryelements