Cargando…

Approximating the DCJ distance of balanced genomes in linear time

BACKGROUND: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that h...

Descripción completa

Detalles Bibliográficos
Autores principales: Rubert, Diego P., Feijão, Pedro, Braga, Marília Dias Vieira, Stoye, Jens, Martinez, Fábio Henrique Viduani
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2017
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345319/
https://www.ncbi.nlm.nih.gov/pubmed/28293275
http://dx.doi.org/10.1186/s13015-017-0095-y
_version_ 1782513694574379008
author Rubert, Diego P.
Feijão, Pedro
Braga, Marília Dias Vieira
Stoye, Jens
Martinez, Fábio Henrique Viduani
author_facet Rubert, Diego P.
Feijão, Pedro
Braga, Marília Dias Vieira
Stoye, Jens
Martinez, Fábio Henrique Viduani
author_sort Rubert, Diego P.
collection PubMed
description BACKGROUND: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that have exactly the same number of occurrences of each gene in each genome, we are interested in the problem of computing the rearrangement distance between them, i.e., finding the minimum number of DCJ operations that transform one genome into the other. This problem is known to be NP-hard. RESULTS: We propose a linear time approximation algorithm with approximation factor O(k) for the DCJ distance problem, where k is the maximum number of occurrences of any gene in the input genomes. Our algorithm works for linear and circular unichromosomal balanced genomes and uses as an intermediate step an O(k)-approximation for the minimum common string partition problem, which is closely related to the DCJ distance problem. CONCLUSIONS: Experiments on simulated data sets show that our approximation algorithm is very competitive both in efficiency and in quality of the solutions.
format Online
Article
Text
id pubmed-5345319
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-53453192017-03-14 Approximating the DCJ distance of balanced genomes in linear time Rubert, Diego P. Feijão, Pedro Braga, Marília Dias Vieira Stoye, Jens Martinez, Fábio Henrique Viduani Algorithms Mol Biol Research BACKGROUND: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that have exactly the same number of occurrences of each gene in each genome, we are interested in the problem of computing the rearrangement distance between them, i.e., finding the minimum number of DCJ operations that transform one genome into the other. This problem is known to be NP-hard. RESULTS: We propose a linear time approximation algorithm with approximation factor O(k) for the DCJ distance problem, where k is the maximum number of occurrences of any gene in the input genomes. Our algorithm works for linear and circular unichromosomal balanced genomes and uses as an intermediate step an O(k)-approximation for the minimum common string partition problem, which is closely related to the DCJ distance problem. CONCLUSIONS: Experiments on simulated data sets show that our approximation algorithm is very competitive both in efficiency and in quality of the solutions. BioMed Central 2017-03-09 /pmc/articles/PMC5345319/ /pubmed/28293275 http://dx.doi.org/10.1186/s13015-017-0095-y Text en © The Author(s) 2017 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Rubert, Diego P.
Feijão, Pedro
Braga, Marília Dias Vieira
Stoye, Jens
Martinez, Fábio Henrique Viduani
Approximating the DCJ distance of balanced genomes in linear time
title Approximating the DCJ distance of balanced genomes in linear time
title_full Approximating the DCJ distance of balanced genomes in linear time
title_fullStr Approximating the DCJ distance of balanced genomes in linear time
title_full_unstemmed Approximating the DCJ distance of balanced genomes in linear time
title_short Approximating the DCJ distance of balanced genomes in linear time
title_sort approximating the dcj distance of balanced genomes in linear time
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345319/
https://www.ncbi.nlm.nih.gov/pubmed/28293275
http://dx.doi.org/10.1186/s13015-017-0095-y
work_keys_str_mv AT rubertdiegop approximatingthedcjdistanceofbalancedgenomesinlineartime
AT feijaopedro approximatingthedcjdistanceofbalancedgenomesinlineartime
AT bragamariliadiasvieira approximatingthedcjdistanceofbalancedgenomesinlineartime
AT stoyejens approximatingthedcjdistanceofbalancedgenomesinlineartime
AT martinezfabiohenriqueviduani approximatingthedcjdistanceofbalancedgenomesinlineartime