Cargando…

On the rank-distance median of 3 permutations

BACKGROUND: Recently, Pereira Zanetti, Biller and Meidanis have proposed a new definition of a rearrangement distance between genomes. In this formulation, each genome is represented as a matrix, and the distance d is the rank distance between these matrices. Although defined in terms of matrices, t...

Descripción completa

Detalles Bibliográficos
Autores principales: Chindelevitch, Leonid, Pereira Zanetti, João Paulo, Meidanis, João
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2018
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998913/
https://www.ncbi.nlm.nih.gov/pubmed/29745865
http://dx.doi.org/10.1186/s12859-018-2131-4
_version_ 1783331329586233344
author Chindelevitch, Leonid
Pereira Zanetti, João Paulo
Meidanis, João
author_facet Chindelevitch, Leonid
Pereira Zanetti, João Paulo
Meidanis, João
author_sort Chindelevitch, Leonid
collection PubMed
description BACKGROUND: Recently, Pereira Zanetti, Biller and Meidanis have proposed a new definition of a rearrangement distance between genomes. In this formulation, each genome is represented as a matrix, and the distance d is the rank distance between these matrices. Although defined in terms of matrices, the rank distance is equal to the minimum total weight of a series of weighted operations that leads from one genome to the other, including inversions, translocations, transpositions, and others. The computational complexity of the median-of-three problem according to this distance is currently unknown. The genome matrices are a special kind of permutation matrices, which we study in this paper. In their paper, the authors provide an [Formula: see text] algorithm for determining three candidate medians, prove the tight approximation ratio [Formula: see text] , and provide a sufficient condition for their candidates to be true medians. They also conduct some experiments that suggest that their method is accurate on simulated and real data. RESULTS: Three invariants characterizing the problem of finding the median of 3 matrices. A sufficient condition for uniqueness of medians that can be checked in O(n). A faster, [Formula: see text] algorithm for determining the median under this condition. A new heuristic algorithm for this problem based on compressed sensing. A [Formula: see text] algorithm that exactly solves the problem when the inputs are orthogonal matrices, a class that includes both permutations and genomes as special cases. CONCLUSIONS: Our work provides the first proof that, with respect to the rank distance, the problem of finding the median of 3 genomes, as well as the median of 3 permutations, is exactly solvable in polynomial time, a result which should be contrasted with its NP-hardness for the DCJ (double cut-and-join) distance and most other families of genome rearrangement operations. This result, backed by our experimental tests, indicates that the rank distance is a viable alternative to the DCJ distance widely used in genome comparisons. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2131-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5998913
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-59989132018-06-25 On the rank-distance median of 3 permutations Chindelevitch, Leonid Pereira Zanetti, João Paulo Meidanis, João BMC Bioinformatics Research BACKGROUND: Recently, Pereira Zanetti, Biller and Meidanis have proposed a new definition of a rearrangement distance between genomes. In this formulation, each genome is represented as a matrix, and the distance d is the rank distance between these matrices. Although defined in terms of matrices, the rank distance is equal to the minimum total weight of a series of weighted operations that leads from one genome to the other, including inversions, translocations, transpositions, and others. The computational complexity of the median-of-three problem according to this distance is currently unknown. The genome matrices are a special kind of permutation matrices, which we study in this paper. In their paper, the authors provide an [Formula: see text] algorithm for determining three candidate medians, prove the tight approximation ratio [Formula: see text] , and provide a sufficient condition for their candidates to be true medians. They also conduct some experiments that suggest that their method is accurate on simulated and real data. RESULTS: Three invariants characterizing the problem of finding the median of 3 matrices. A sufficient condition for uniqueness of medians that can be checked in O(n). A faster, [Formula: see text] algorithm for determining the median under this condition. A new heuristic algorithm for this problem based on compressed sensing. A [Formula: see text] algorithm that exactly solves the problem when the inputs are orthogonal matrices, a class that includes both permutations and genomes as special cases. CONCLUSIONS: Our work provides the first proof that, with respect to the rank distance, the problem of finding the median of 3 genomes, as well as the median of 3 permutations, is exactly solvable in polynomial time, a result which should be contrasted with its NP-hardness for the DCJ (double cut-and-join) distance and most other families of genome rearrangement operations. This result, backed by our experimental tests, indicates that the rank distance is a viable alternative to the DCJ distance widely used in genome comparisons. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2131-4) contains supplementary material, which is available to authorized users. BioMed Central 2018-05-08 /pmc/articles/PMC5998913/ /pubmed/29745865 http://dx.doi.org/10.1186/s12859-018-2131-4 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Chindelevitch, Leonid
Pereira Zanetti, João Paulo
Meidanis, João
On the rank-distance median of 3 permutations
title On the rank-distance median of 3 permutations
title_full On the rank-distance median of 3 permutations
title_fullStr On the rank-distance median of 3 permutations
title_full_unstemmed On the rank-distance median of 3 permutations
title_short On the rank-distance median of 3 permutations
title_sort on the rank-distance median of 3 permutations
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998913/
https://www.ncbi.nlm.nih.gov/pubmed/29745865
http://dx.doi.org/10.1186/s12859-018-2131-4
work_keys_str_mv AT chindelevitchleonid ontherankdistancemedianof3permutations
AT pereirazanettijoaopaulo ontherankdistancemedianof3permutations
AT meidanisjoao ontherankdistancemedianof3permutations