Cargando…

Heuristics for the inversion median problem

BACKGROUND: The study of genome rearrangements has become a mainstay of phylogenetics and comparative genomics. Fundamental in such a study is the median problem: given three genomes find a fourth that minimizes the sum of the evolutionary distances between itself and the given three. Many exact alg...

Descripción completa

Detalles Bibliográficos
Autores principales: Rajan, Vaibhav, Xu, Andrew Wei, Lin, Yu, Swenson, Krister M, Moret, Bernard ME
Formato: Texto
Lenguaje:English
Publicado: BioMed Central 2010
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009502/
https://www.ncbi.nlm.nih.gov/pubmed/20122203
http://dx.doi.org/10.1186/1471-2105-11-S1-S30
_version_ 1782194693333843968
author Rajan, Vaibhav
Xu, Andrew Wei
Lin, Yu
Swenson, Krister M
Moret, Bernard ME
author_facet Rajan, Vaibhav
Xu, Andrew Wei
Lin, Yu
Swenson, Krister M
Moret, Bernard ME
author_sort Rajan, Vaibhav
collection PubMed
description BACKGROUND: The study of genome rearrangements has become a mainstay of phylogenetics and comparative genomics. Fundamental in such a study is the median problem: given three genomes find a fourth that minimizes the sum of the evolutionary distances between itself and the given three. Many exact algorithms and heuristics have been developed for the inversion median problem, of which the best known is MGR. RESULTS: We present a unifying framework for median heuristics, which enables us to clarify existing strategies and to place them in a partial ordering. Analysis of this framework leads to a new insight: the best strategies continue to refer to the input data rather than reducing the problem to smaller instances. Using this insight, we develop a new heuristic for inversion medians that uses input data to the end of its computation and leverages our previous work with DCJ medians. Finally, we present the results of extensive experimentation showing that our new heuristic outperforms all others in accuracy and, especially, in running time: the heuristic typically returns solutions within 1% of optimal and runs in seconds to minutes even on genomes with 25'000 genes--in contrast, MGR can take days on instances of 200 genes and cannot be used beyond 1'000 genes. CONCLUSION: Finding good rearrangement medians, in particular inversion medians, had long been regarded as the computational bottleneck in whole-genome studies. Our new heuristic for inversion medians, ASM, which dominates all others in our framework, puts that issue to rest by providing near-optimal solutions within seconds to minutes on even the largest genomes.
format Text
id pubmed-3009502
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30095022010-12-23 Heuristics for the inversion median problem Rajan, Vaibhav Xu, Andrew Wei Lin, Yu Swenson, Krister M Moret, Bernard ME BMC Bioinformatics Research BACKGROUND: The study of genome rearrangements has become a mainstay of phylogenetics and comparative genomics. Fundamental in such a study is the median problem: given three genomes find a fourth that minimizes the sum of the evolutionary distances between itself and the given three. Many exact algorithms and heuristics have been developed for the inversion median problem, of which the best known is MGR. RESULTS: We present a unifying framework for median heuristics, which enables us to clarify existing strategies and to place them in a partial ordering. Analysis of this framework leads to a new insight: the best strategies continue to refer to the input data rather than reducing the problem to smaller instances. Using this insight, we develop a new heuristic for inversion medians that uses input data to the end of its computation and leverages our previous work with DCJ medians. Finally, we present the results of extensive experimentation showing that our new heuristic outperforms all others in accuracy and, especially, in running time: the heuristic typically returns solutions within 1% of optimal and runs in seconds to minutes even on genomes with 25'000 genes--in contrast, MGR can take days on instances of 200 genes and cannot be used beyond 1'000 genes. CONCLUSION: Finding good rearrangement medians, in particular inversion medians, had long been regarded as the computational bottleneck in whole-genome studies. Our new heuristic for inversion medians, ASM, which dominates all others in our framework, puts that issue to rest by providing near-optimal solutions within seconds to minutes on even the largest genomes. BioMed Central 2010-01-18 /pmc/articles/PMC3009502/ /pubmed/20122203 http://dx.doi.org/10.1186/1471-2105-11-S1-S30 Text en Copyright ©2010 Rajan et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Rajan, Vaibhav
Xu, Andrew Wei
Lin, Yu
Swenson, Krister M
Moret, Bernard ME
Heuristics for the inversion median problem
title Heuristics for the inversion median problem
title_full Heuristics for the inversion median problem
title_fullStr Heuristics for the inversion median problem
title_full_unstemmed Heuristics for the inversion median problem
title_short Heuristics for the inversion median problem
title_sort heuristics for the inversion median problem
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009502/
https://www.ncbi.nlm.nih.gov/pubmed/20122203
http://dx.doi.org/10.1186/1471-2105-11-S1-S30
work_keys_str_mv AT rajanvaibhav heuristicsfortheinversionmedianproblem
AT xuandrewwei heuristicsfortheinversionmedianproblem
AT linyu heuristicsfortheinversionmedianproblem
AT swensonkristerm heuristicsfortheinversionmedianproblem
AT moretbernardme heuristicsfortheinversionmedianproblem