Cargando…
Comparing genomes with rearrangements and segmental duplications
Motivation: Large-scale evolutionary events such as genomic rearrange.ments and segmental duplications form an important part of the evolution of genomes and are widely studied from both biological and computational perspectives. A basic computational problem is to infer these events in the evolutio...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Oxford University Press
2015
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765876/ https://www.ncbi.nlm.nih.gov/pubmed/26072500 http://dx.doi.org/10.1093/bioinformatics/btv229 |
_version_ | 1782417588087685120 |
---|---|
author | Shao, Mingfu Moret, Bernard M.E. |
author_facet | Shao, Mingfu Moret, Bernard M.E. |
author_sort | Shao, Mingfu |
collection | PubMed |
description | Motivation: Large-scale evolutionary events such as genomic rearrange.ments and segmental duplications form an important part of the evolution of genomes and are widely studied from both biological and computational perspectives. A basic computational problem is to infer these events in the evolutionary history for given modern genomes, a task for which many algorithms have been proposed under various constraints. Algorithms that can handle both rearrangements and content-modifying events such as duplications and losses remain few and limited in their applicability. Results: We study the comparison of two genomes under a model including general rearrangements (through double-cut-and-join) and segmental duplications. We formulate the comparison as an optimization problem and describe an exact algorithm to solve it by using an integer linear program. We also devise a sufficient condition and an efficient algorithm to identify optimal substructures, which can simplify the problem while preserving optimality. Using the optimal substructures with the integer linear program (ILP) formulation yields a practical and exact algorithm to solve the problem. We then apply our algorithm to assign in-paralogs and orthologs (a necessary step in handling duplications) and compare its performance with that of the state-of-the-art method MSOAR, using both simulations and real data. On simulated datasets, our method outperforms MSOAR by a significant margin, and on five well-annotated species, MSOAR achieves high accuracy, yet our method performs slightly better on each of the 10 pairwise comparisons. Availability and implementation: http://lcbb.epfl.ch/softwares/coser. Contact: mingfu.shao@epfl.ch or bernard.moret@epfl.ch |
format | Online Article Text |
id | pubmed-4765876 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-47658762016-03-04 Comparing genomes with rearrangements and segmental duplications Shao, Mingfu Moret, Bernard M.E. Bioinformatics Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland Motivation: Large-scale evolutionary events such as genomic rearrange.ments and segmental duplications form an important part of the evolution of genomes and are widely studied from both biological and computational perspectives. A basic computational problem is to infer these events in the evolutionary history for given modern genomes, a task for which many algorithms have been proposed under various constraints. Algorithms that can handle both rearrangements and content-modifying events such as duplications and losses remain few and limited in their applicability. Results: We study the comparison of two genomes under a model including general rearrangements (through double-cut-and-join) and segmental duplications. We formulate the comparison as an optimization problem and describe an exact algorithm to solve it by using an integer linear program. We also devise a sufficient condition and an efficient algorithm to identify optimal substructures, which can simplify the problem while preserving optimality. Using the optimal substructures with the integer linear program (ILP) formulation yields a practical and exact algorithm to solve the problem. We then apply our algorithm to assign in-paralogs and orthologs (a necessary step in handling duplications) and compare its performance with that of the state-of-the-art method MSOAR, using both simulations and real data. On simulated datasets, our method outperforms MSOAR by a significant margin, and on five well-annotated species, MSOAR achieves high accuracy, yet our method performs slightly better on each of the 10 pairwise comparisons. Availability and implementation: http://lcbb.epfl.ch/softwares/coser. Contact: mingfu.shao@epfl.ch or bernard.moret@epfl.ch Oxford University Press 2015-06-15 2015-06-10 /pmc/articles/PMC4765876/ /pubmed/26072500 http://dx.doi.org/10.1093/bioinformatics/btv229 Text en © The Author 2015. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland Shao, Mingfu Moret, Bernard M.E. Comparing genomes with rearrangements and segmental duplications |
title | Comparing genomes with rearrangements and segmental duplications |
title_full | Comparing genomes with rearrangements and segmental duplications |
title_fullStr | Comparing genomes with rearrangements and segmental duplications |
title_full_unstemmed | Comparing genomes with rearrangements and segmental duplications |
title_short | Comparing genomes with rearrangements and segmental duplications |
title_sort | comparing genomes with rearrangements and segmental duplications |
topic | Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765876/ https://www.ncbi.nlm.nih.gov/pubmed/26072500 http://dx.doi.org/10.1093/bioinformatics/btv229 |
work_keys_str_mv | AT shaomingfu comparinggenomeswithrearrangementsandsegmentalduplications AT moretbernardme comparinggenomeswithrearrangementsandsegmentalduplications |