Cargando…
Approximating the DCJ distance of balanced genomes in linear time
BACKGROUND: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that h...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2017
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345319/ https://www.ncbi.nlm.nih.gov/pubmed/28293275 http://dx.doi.org/10.1186/s13015-017-0095-y |
_version_ | 1782513694574379008 |
---|---|
author | Rubert, Diego P. Feijão, Pedro Braga, Marília Dias Vieira Stoye, Jens Martinez, Fábio Henrique Viduani |
author_facet | Rubert, Diego P. Feijão, Pedro Braga, Marília Dias Vieira Stoye, Jens Martinez, Fábio Henrique Viduani |
author_sort | Rubert, Diego P. |
collection | PubMed |
description | BACKGROUND: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that have exactly the same number of occurrences of each gene in each genome, we are interested in the problem of computing the rearrangement distance between them, i.e., finding the minimum number of DCJ operations that transform one genome into the other. This problem is known to be NP-hard. RESULTS: We propose a linear time approximation algorithm with approximation factor O(k) for the DCJ distance problem, where k is the maximum number of occurrences of any gene in the input genomes. Our algorithm works for linear and circular unichromosomal balanced genomes and uses as an intermediate step an O(k)-approximation for the minimum common string partition problem, which is closely related to the DCJ distance problem. CONCLUSIONS: Experiments on simulated data sets show that our approximation algorithm is very competitive both in efficiency and in quality of the solutions. |
format | Online Article Text |
id | pubmed-5345319 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-53453192017-03-14 Approximating the DCJ distance of balanced genomes in linear time Rubert, Diego P. Feijão, Pedro Braga, Marília Dias Vieira Stoye, Jens Martinez, Fábio Henrique Viduani Algorithms Mol Biol Research BACKGROUND: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that have exactly the same number of occurrences of each gene in each genome, we are interested in the problem of computing the rearrangement distance between them, i.e., finding the minimum number of DCJ operations that transform one genome into the other. This problem is known to be NP-hard. RESULTS: We propose a linear time approximation algorithm with approximation factor O(k) for the DCJ distance problem, where k is the maximum number of occurrences of any gene in the input genomes. Our algorithm works for linear and circular unichromosomal balanced genomes and uses as an intermediate step an O(k)-approximation for the minimum common string partition problem, which is closely related to the DCJ distance problem. CONCLUSIONS: Experiments on simulated data sets show that our approximation algorithm is very competitive both in efficiency and in quality of the solutions. BioMed Central 2017-03-09 /pmc/articles/PMC5345319/ /pubmed/28293275 http://dx.doi.org/10.1186/s13015-017-0095-y Text en © The Author(s) 2017 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Rubert, Diego P. Feijão, Pedro Braga, Marília Dias Vieira Stoye, Jens Martinez, Fábio Henrique Viduani Approximating the DCJ distance of balanced genomes in linear time |
title | Approximating the DCJ distance of balanced genomes in linear time |
title_full | Approximating the DCJ distance of balanced genomes in linear time |
title_fullStr | Approximating the DCJ distance of balanced genomes in linear time |
title_full_unstemmed | Approximating the DCJ distance of balanced genomes in linear time |
title_short | Approximating the DCJ distance of balanced genomes in linear time |
title_sort | approximating the dcj distance of balanced genomes in linear time |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345319/ https://www.ncbi.nlm.nih.gov/pubmed/28293275 http://dx.doi.org/10.1186/s13015-017-0095-y |
work_keys_str_mv | AT rubertdiegop approximatingthedcjdistanceofbalancedgenomesinlineartime AT feijaopedro approximatingthedcjdistanceofbalancedgenomesinlineartime AT bragamariliadiasvieira approximatingthedcjdistanceofbalancedgenomesinlineartime AT stoyejens approximatingthedcjdistanceofbalancedgenomesinlineartime AT martinezfabiohenriqueviduani approximatingthedcjdistanceofbalancedgenomesinlineartime |