Algorithms for locating extremely conserved elements in multiple sequence alignments

BACKGROUND: In 2004, Bejerano et al. announced the startling discovery of hundreds of "ultraconserved elements", long genomic sequences perfectly conserved across human, mouse, and rat. Their announcement stimulated a flurry of subsequent research. RESULTS: We generalize the notion of ultr...

Bibliographic Details
Main authors: Tseng, Huei-Hun E; Tompa, Martin
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808710/
https://www.ncbi.nlm.nih.gov/pubmed/20021665
http://dx.doi.org/10.1186/1471-2105-10-432
_version_ 1782176526409662464
author Tseng, Huei-Hun E
Tompa, Martin
author_facet Tseng, Huei-Hun E
Tompa, Martin
author_sort Tseng, Huei-Hun E
collection PubMed
description BACKGROUND: In 2004, Bejerano et al. announced the startling discovery of hundreds of "ultraconserved elements", long genomic sequences perfectly conserved across human, mouse, and rat. Their announcement stimulated a flurry of subsequent research. RESULTS: We generalize the notion of ultraconserved element in a natural way from extraordinary human-rodent conservation to extraordinary conservation over an arbitrary set of species. We call these "Extremely Conserved Elements". There is a linear time algorithm to find all such Extremely Conserved Elements in any multiple sequence alignment, provided that the conservation is required to be across all the aligned species. For the general case of conservation across an arbitrary subset of the aligned species, we show that the question of whether there exists an Extremely Conserved Element is NP-complete. We illustrate the linear time algorithm by cataloguing all 177 Extremely Conserved Elements in the currently available 44-vertebrate whole-genome alignment, and point out some of the characteristics of these elements. CONCLUSIONS: The NP-completeness in the case of conservation across an arbitrary subset of the aligned species implies that it is unlikely an efficient algorithm exists for this general case. Despite this fact, for the interesting case of conservation across all or most of the aligned species, our algorithm is efficient enough to be practical. The 177 Extremely Conserved Elements that we catalog demonstrate many of the characteristics of the original ultraconserved elements of Bejerano et al.
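The description above states that a linear time algorithm exists when conservation is required across all aligned species. The sketch below illustrates that idea only in its simplest form and is not the authors' published method. It assumes that "conserved" means every row carries the same non-gap character in a column, and that an element is a maximal run of such columns of at least `min_len` columns. The function name `conserved_runs`, the `min_len` parameter, and the gap characters are hypothetical choices for illustration, since this record does not give the paper's exact definition.

```python
from typing import List, Tuple

# Assumption: these characters mark alignment gaps.
GAP_CHARS = {"-", "."}


def conserved_runs(alignment: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column intervals of maximal runs of
    columns that are identical and gap-free in every row, keeping only
    runs of at least min_len columns.

    A single left-to-right pass visits each cell once, so the running
    time is proportional to rows * columns, i.e. linear in the size of
    the alignment.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows of the alignment must have the same length")

    runs: List[Tuple[int, int]] = []
    run_start = None  # first column of the run currently being extended
    for col in range(width):
        first = alignment[0][col].upper()
        conserved = first not in GAP_CHARS and all(
            row[col].upper() == first for row in alignment
        )
        if conserved:
            if run_start is None:
                run_start = col
        else:
            # The run (if any) ends just before this column.
            if run_start is not None and col - run_start >= min_len:
                runs.append((run_start, col))
            run_start = None
    # Close a run that extends to the last column.
    if run_start is not None and width - run_start >= min_len:
        runs.append((run_start, width))
    return runs


if __name__ == "__main__":
    toy = [
        "ACGTACGTAC",
        "ACGTTCGTAC",
        "ACG-ACGTAC",
    ]
    print(conserved_runs(toy, 4))
```

On the toy alignment, columns 0 to 2 and 5 to 9 are identical in all three rows, so a threshold of 4 keeps only the second run. Conservation across an arbitrary subset of the species, which the description reports to be NP-complete, is deliberately outside the scope of this sketch.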
format Text
id pubmed-2808710
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2808710 2010-01-21 Algorithms for locating extremely conserved elements in multiple sequence alignments Tseng, Huei-Hun E Tompa, Martin BMC Bioinformatics Research Article BioMed Central 2009-12-18 /pmc/articles/PMC2808710/ /pubmed/20021665 http://dx.doi.org/10.1186/1471-2105-10-432 Text en Copyright ©2009 Tseng and Tompa; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Tseng, Huei-Hun E
Tompa, Martin
Algorithms for locating extremely conserved elements in multiple sequence alignments
title Algorithms for locating extremely conserved elements in multiple sequence alignments
title_full Algorithms for locating extremely conserved elements in multiple sequence alignments
title_fullStr Algorithms for locating extremely conserved elements in multiple sequence alignments
title_full_unstemmed Algorithms for locating extremely conserved elements in multiple sequence alignments
title_short Algorithms for locating extremely conserved elements in multiple sequence alignments
title_sort algorithms for locating extremely conserved elements in multiple sequence alignments
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808710/
https://www.ncbi.nlm.nih.gov/pubmed/20021665
http://dx.doi.org/10.1186/1471-2105-10-432
work_keys_str_mv AT tsenghueihune algorithmsforlocatingextremelyconservedelementsinmultiplesequencealignments
AT tompamartin algorithmsforlocatingextremelyconservedelementsinmultiplesequencealignments