Cargando…

The distance and median problems in the single-cut-or-join model with single-gene duplications

BACKGROUND. In the field of genome rearrangement algorithms, models accounting for gene duplication lead often to hard problems. For example, while computing the pairwise distance is tractable in most duplication-free models, the problem is NP-complete for most extensions of these models accounting...

Descripción completa

Detalles Bibliográficos
Autores principales: Mane, Aniket C., Lafond, Manuel, Feijao, Pedro C., Chauve, Cedric
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7197181/
https://www.ncbi.nlm.nih.gov/pubmed/32391071
http://dx.doi.org/10.1186/s13015-020-00169-y
_version_ 1783528831406047232
author Mane, Aniket C.
Lafond, Manuel
Feijao, Pedro C.
Chauve, Cedric
author_facet Mane, Aniket C.
Lafond, Manuel
Feijao, Pedro C.
Chauve, Cedric
author_sort Mane, Aniket C.
collection PubMed
description BACKGROUND. In the field of genome rearrangement algorithms, models accounting for gene duplication lead often to hard problems. For example, while computing the pairwise distance is tractable in most duplication-free models, the problem is NP-complete for most extensions of these models accounting for duplicated genes. Moreover, problems involving more than two genomes, such as the genome median and the Small Parsimony problem, are intractable for most duplication-free models, with some exceptions, for example the Single-Cut-or-Join (SCJ) model. RESULTS. We introduce a variant of the SCJ distance that accounts for duplicated genes, in the context of directed evolution from an ancestral genome to a descendant genome where orthology relations between ancestral genes and their descendant are known. Our model includes two duplication mechanisms: single-gene tandem duplication and the creation of single-gene circular chromosomes. We prove that in this model, computing the directed distance and a parsimonious evolutionary scenario in terms of SCJ and single-gene duplication events can be done in linear time. We also show that the directed median problem is tractable for this distance, while the rooted median problem, where we assume that one of the given genomes is ancestral to the median, is NP-complete. We also describe an Integer Linear Program for solving this problem. We evaluate the directed distance and rooted median algorithms on simulated data. CONCLUSION. Our results provide a simple genome rearrangement model, extending the SCJ model to account for single-gene duplications, for which we prove a mix of tractability and hardness results. For the NP-complete rooted median problem, we design a simple Integer Linear Program. Our publicly available implementation of these algorithms for the directed distance and median problems allow to solve efficiently these problems on large instances.
format Online
Article
Text
id pubmed-7197181
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-71971812020-05-08 The distance and median problems in the single-cut-or-join model with single-gene duplications Mane, Aniket C. Lafond, Manuel Feijao, Pedro C. Chauve, Cedric Algorithms Mol Biol Research BACKGROUND. In the field of genome rearrangement algorithms, models accounting for gene duplication lead often to hard problems. For example, while computing the pairwise distance is tractable in most duplication-free models, the problem is NP-complete for most extensions of these models accounting for duplicated genes. Moreover, problems involving more than two genomes, such as the genome median and the Small Parsimony problem, are intractable for most duplication-free models, with some exceptions, for example the Single-Cut-or-Join (SCJ) model. RESULTS. We introduce a variant of the SCJ distance that accounts for duplicated genes, in the context of directed evolution from an ancestral genome to a descendant genome where orthology relations between ancestral genes and their descendant are known. Our model includes two duplication mechanisms: single-gene tandem duplication and the creation of single-gene circular chromosomes. We prove that in this model, computing the directed distance and a parsimonious evolutionary scenario in terms of SCJ and single-gene duplication events can be done in linear time. We also show that the directed median problem is tractable for this distance, while the rooted median problem, where we assume that one of the given genomes is ancestral to the median, is NP-complete. We also describe an Integer Linear Program for solving this problem. We evaluate the directed distance and rooted median algorithms on simulated data. CONCLUSION. Our results provide a simple genome rearrangement model, extending the SCJ model to account for single-gene duplications, for which we prove a mix of tractability and hardness results. For the NP-complete rooted median problem, we design a simple Integer Linear Program. Our publicly available implementation of these algorithms for the directed distance and median problems allow to solve efficiently these problems on large instances. BioMed Central 2020-05-04 /pmc/articles/PMC7197181/ /pubmed/32391071 http://dx.doi.org/10.1186/s13015-020-00169-y Text en © The Author(s) 2020 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Mane, Aniket C.
Lafond, Manuel
Feijao, Pedro C.
Chauve, Cedric
The distance and median problems in the single-cut-or-join model with single-gene duplications
title The distance and median problems in the single-cut-or-join model with single-gene duplications
title_full The distance and median problems in the single-cut-or-join model with single-gene duplications
title_fullStr The distance and median problems in the single-cut-or-join model with single-gene duplications
title_full_unstemmed The distance and median problems in the single-cut-or-join model with single-gene duplications
title_short The distance and median problems in the single-cut-or-join model with single-gene duplications
title_sort distance and median problems in the single-cut-or-join model with single-gene duplications
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7197181/
https://www.ncbi.nlm.nih.gov/pubmed/32391071
http://dx.doi.org/10.1186/s13015-020-00169-y
work_keys_str_mv AT maneaniketc thedistanceandmedianproblemsinthesinglecutorjoinmodelwithsinglegeneduplications
AT lafondmanuel thedistanceandmedianproblemsinthesinglecutorjoinmodelwithsinglegeneduplications
AT feijaopedroc thedistanceandmedianproblemsinthesinglecutorjoinmodelwithsinglegeneduplications
AT chauvecedric thedistanceandmedianproblemsinthesinglecutorjoinmodelwithsinglegeneduplications
AT maneaniketc distanceandmedianproblemsinthesinglecutorjoinmodelwithsinglegeneduplications
AT lafondmanuel distanceandmedianproblemsinthesinglecutorjoinmodelwithsinglegeneduplications
AT feijaopedroc distanceandmedianproblemsinthesinglecutorjoinmodelwithsinglegeneduplications
AT chauvecedric distanceandmedianproblemsinthesinglecutorjoinmodelwithsinglegeneduplications