Cargando…
On the rank-distance median of 3 permutations
BACKGROUND: Recently, Pereira Zanetti, Biller and Meidanis have proposed a new definition of a rearrangement distance between genomes. In this formulation, each genome is represented as a matrix, and the distance d is the rank distance between these matrices. Although defined in terms of matrices, t...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2018
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998913/ https://www.ncbi.nlm.nih.gov/pubmed/29745865 http://dx.doi.org/10.1186/s12859-018-2131-4 |
Sumario: | BACKGROUND: Recently, Pereira Zanetti, Biller and Meidanis have proposed a new definition of a rearrangement distance between genomes. In this formulation, each genome is represented as a matrix, and the distance d is the rank distance between these matrices. Although defined in terms of matrices, the rank distance is equal to the minimum total weight of a series of weighted operations that leads from one genome to the other, including inversions, translocations, transpositions, and others. The computational complexity of the median-of-three problem according to this distance is currently unknown. The genome matrices are a special kind of permutation matrices, which we study in this paper. In their paper, the authors provide an [Formula: see text] algorithm for determining three candidate medians, prove the tight approximation ratio [Formula: see text] , and provide a sufficient condition for their candidates to be true medians. They also conduct some experiments that suggest that their method is accurate on simulated and real data. RESULTS: Three invariants characterizing the problem of finding the median of 3 matrices. A sufficient condition for uniqueness of medians that can be checked in O(n). A faster, [Formula: see text] algorithm for determining the median under this condition. A new heuristic algorithm for this problem based on compressed sensing. A [Formula: see text] algorithm that exactly solves the problem when the inputs are orthogonal matrices, a class that includes both permutations and genomes as special cases. CONCLUSIONS: Our work provides the first proof that, with respect to the rank distance, the problem of finding the median of 3 genomes, as well as the median of 3 permutations, is exactly solvable in polynomial time, a result which should be contrasted with its NP-hardness for the DCJ (double cut-and-join) distance and most other families of genome rearrangement operations. This result, backed by our experimental tests, indicates that the rank distance is a viable alternative to the DCJ distance widely used in genome comparisons. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2131-4) contains supplementary material, which is available to authorized users. |
---|