
Mapping sequences by parts

BACKGROUND: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partit...


Bibliographic Details
Main authors: Didier, Gilles; Guziolowski, Carito
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2148040/
https://www.ncbi.nlm.nih.gov/pubmed/17880695
http://dx.doi.org/10.1186/1748-7188-2-11
_version_ 1782144494641086464
author Didier, Gilles
Guziolowski, Carito
author_facet Didier, Gilles
Guziolowski, Carito
author_sort Didier, Gilles
collection PubMed
description BACKGROUND: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. RESULTS: We introduce an algorithm computing an optimal N-map with time complexity O (|s| × |t| × N) using O (|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. PRACTICAL APPLICATION: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events.
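The definition in the abstract above (split s into N parts, place each part on t without gaps, optionally reverse-complemented, and maximize the summed scores) can be illustrated with a small dynamic program. The following is only a sketch under stated assumptions and is not taken from the article: the function name `optimal_nmap_score`, the +1/-1 match/mismatch scoring, the DNA alphabet, and the requirements that parts are non-empty, contiguous, and lie entirely inside t are all choices made here for illustration. It returns the score only and keeps two rolling tables, so it reproduces the O(|s| × |t| × N) running time but not the traceback that would yield the map itself.

```python
NEG = float("-inf")
COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}  # assumed DNA alphabet


def optimal_nmap_score(s, t, n_parts, match=1, mismatch=-1):
    """Best total gapless score when s is cut into exactly n_parts
    contiguous, non-empty parts, each placed anywhere on t, either
    directly or reverse-complemented (illustrative formulation)."""
    m, n = len(s), len(t)
    if n == 0 or not 1 <= n_parts <= m:
        raise ValueError("need a non-empty t and 1 <= n_parts <= len(s)")

    def score(a, b):
        return match if a == b else mismatch

    # best[k][i]: best score for s[:i] cut into exactly k parts.
    best = [[NEG] * (m + 1) for _ in range(n_parts + 1)]
    best[0][0] = 0

    # fwd[k][j] / rev[k][j]: best score for the current prefix of s with
    # k parts, where the last character of the prefix is aligned with
    # t[j-1] in direct / reverse-complement orientation.
    # Columns 0 and n+1 are sentinels that stay at -inf.
    fwd = [[NEG] * (n + 2) for _ in range(n_parts + 1)]
    rev = [[NEG] * (n + 2) for _ in range(n_parts + 1)]

    for i in range(1, m + 1):
        new_fwd = [[NEG] * (n + 2) for _ in range(n_parts + 1)]
        new_rev = [[NEG] * (n + 2) for _ in range(n_parts + 1)]
        c = s[i - 1]
        c_comp = COMP.get(c, "N")  # unknown letters never match
        for k in range(1, n_parts + 1):
            start = best[k - 1][i - 1]  # open a new part at s[i-1]
            for j in range(1, n + 1):
                # Direct orientation: the previous character of the same
                # part sits one position to the left on t.
                new_fwd[k][j] = max(fwd[k][j - 1], start) + score(c, t[j - 1])
                # Reverse complement: moving right along s moves left
                # along t, so the previous character sits at j + 1.
                new_rev[k][j] = max(rev[k][j + 1], start) + score(c_comp, t[j - 1])
            best[k][i] = max(max(new_fwd[k]), max(new_rev[k]))
        fwd, rev = new_fwd, new_rev

    return best[n_parts][m]


if __name__ == "__main__":
    s, t = "GGGAAA", "AAACCCGGG"
    for parts in (1, 2):
        print(parts, optimal_nmap_score(s, t, parts))
```

On this toy pair, a single part scores 0 at best (the reverse complement TTTCCC placed over AAACCC), whereas two parts score 6, because GGG and AAA each find an exact match at opposite ends of t. That gap is the kind of shuffling event the method is meant to expose; choosing N by statistical significance, as the abstract describes, is not covered by this sketch.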
format Text
id pubmed-2148040
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-21480402007-12-20 Mapping sequences by parts Didier, Gilles Guziolowski, Carito Algorithms Mol Biol Research BACKGROUND: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. RESULTS: We introduce an algorithm computing an optimal N-map with time complexity O (|s| × |t| × N) using O (|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. PRACTICAL APPLICATION: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events. BioMed Central 2007-09-19 /pmc/articles/PMC2148040/ /pubmed/17880695 http://dx.doi.org/10.1186/1748-7188-2-11 Text en Copyright © 2007 Didier and Guziolowski; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Didier, Gilles
Guziolowski, Carito
Mapping sequences by parts
title Mapping sequences by parts
title_full Mapping sequences by parts
title_fullStr Mapping sequences by parts
title_full_unstemmed Mapping sequences by parts
title_short Mapping sequences by parts
title_sort mapping sequences by parts
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2148040/
https://www.ncbi.nlm.nih.gov/pubmed/17880695
http://dx.doi.org/10.1186/1748-7188-2-11
work_keys_str_mv AT didiergilles mappingsequencesbyparts
AT guziolowskicarito mappingsequencesbyparts