Mapping sequences by parts
BACKGROUND: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partit...
Main authors: | Didier, Gilles; Guziolowski, Carito |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2148040/ https://www.ncbi.nlm.nih.gov/pubmed/17880695 http://dx.doi.org/10.1186/1748-7188-2-11 |
_version_ | 1782144494641086464 |
---|---|
author | Didier, Gilles Guziolowski, Carito |
author_facet | Didier, Gilles Guziolowski, Carito |
author_sort | Didier, Gilles |
collection | PubMed |
description | BACKGROUND: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. RESULTS: We introduce an algorithm computing an optimal N-map with time complexity O (|s| × |t| × N) using O (|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. PRACTICAL APPLICATION: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events. |
format | Text |
id | pubmed-2148040 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-21480402007-12-20 Mapping sequences by parts Didier, Gilles Guziolowski, Carito Algorithms Mol Biol Research BACKGROUND: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. RESULTS: We introduce an algorithm computing an optimal N-map with time complexity O (|s| × |t| × N) using O (|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. PRACTICAL APPLICATION: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events. BioMed Central 2007-09-19 /pmc/articles/PMC2148040/ /pubmed/17880695 http://dx.doi.org/10.1186/1748-7188-2-11 Text en Copyright © 2007 Didier and Guziolowski; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Didier, Gilles Guziolowski, Carito Mapping sequences by parts |
title | Mapping sequences by parts |
title_full | Mapping sequences by parts |
title_fullStr | Mapping sequences by parts |
title_full_unstemmed | Mapping sequences by parts |
title_short | Mapping sequences by parts |
title_sort | mapping sequences by parts |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2148040/ https://www.ncbi.nlm.nih.gov/pubmed/17880695 http://dx.doi.org/10.1186/1748-7188-2-11 |
work_keys_str_mv | AT didiergilles mappingsequencesbyparts AT guziolowskicarito mappingsequencesbyparts |
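The description field above defines the optimal N-map as the best way of partitioning s into N parts and placing each part over t so that the sum of gapless alignment scores is maximal. The following is a minimal sketch of one dynamic program that fits this definition. It is not taken from the article and is not claimed to be the authors' algorithm. It rests on several assumptions: parts are contiguous, non-empty and placed entirely inside t, complementary-reversed placements are ignored, the scoring is +1 for a match and -1 for a mismatch, and only the score (not the map itself) is returned. The function name `optimal_n_map_score` and its parameters are hypothetical.

```python
def optimal_n_map_score(s, t, n_parts, match=1, mismatch=-1):
    """Best total gapless score for s cut into exactly n_parts parts over t.

    Illustrative sketch only: forward strand, parts placed fully inside t,
    parts may overlap on t (which allows repeated elements).
    """
    neg_inf = float("-inf")
    len_s, len_t = len(s), len(t)

    # best[k][i]: best score for the prefix s[:i] cut into exactly k parts.
    best = [[neg_inf] * (len_s + 1) for _ in range(n_parts + 1)]
    best[0][0] = 0  # empty prefix, zero parts

    for k in range(1, n_parts + 1):
        # prev_row[j]: best score with k parts where s[i-2] sits on t[j-1].
        prev_row = [neg_inf] * (len_t + 1)
        for i in range(1, len_s + 1):
            row = [neg_inf] * (len_t + 1)
            for j in range(1, len_t + 1):
                pair = match if s[i - 1] == t[j - 1] else mismatch
                # Either extend the current part along the same diagonal,
                # or open a new part at s[i-1] placed on t[j-1].
                start = max(prev_row[j - 1], best[k - 1][i - 1])
                if start != neg_inf:
                    row[j] = start + pair
            best[k][i] = max(row)
            prev_row = row

    return best[n_parts][len_s]


# Usage: t holds two blocks that appear in swapped order in s.
s, t = "CCCCAAAA", "AAAACCCC"
print(optimal_n_map_score(s, t, 1))  # one part cannot undo the swap
print(optimal_n_map_score(s, t, 2))  # two parts map each block separately
```

The three nested loops visit every combination of part count, position in s and position in t once, which matches the O(|s| × |t| × N) time bound quoted in the description. This sketch keeps only one row per step because it returns the score alone; recovering the actual placement of each part would require storing the full table.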