
A flexible ancestral genome reconstruction method based on gapped adjacencies

BACKGROUND: The "small phylogeny" problem consists in inferring ancestral genomes associated with each internal node of a phylogenetic tree of a set of extant species. Existing methods can be grouped into two main categories: the distance-based methods aiming at minimizing a total branch l...


Bibliographic Details
Main authors: Gagnon, Yves, Blanchette, Mathieu, El-Mabrouk, Nadia
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3526437/
https://www.ncbi.nlm.nih.gov/pubmed/23281872
http://dx.doi.org/10.1186/1471-2105-13-S19-S4
_version_ 1782253560198594560
author Gagnon, Yves
Blanchette, Mathieu
El-Mabrouk, Nadia
author_facet Gagnon, Yves
Blanchette, Mathieu
El-Mabrouk, Nadia
author_sort Gagnon, Yves
collection PubMed
description BACKGROUND: The "small phylogeny" problem consists in inferring ancestral genomes associated with each internal node of a phylogenetic tree of a set of extant species. Existing methods can be grouped into two main categories: the distance-based methods aiming at minimizing a total branch length, and the synteny-based (or mapping) methods that first predict a collection of relations between ancestral markers in terms of "synteny", and then assemble this collection into a set of Contiguous Ancestral Regions (CARs). The predicted CARs are likely to be more reliable as they are more directly deduced from observed conservations in extant species. However, the challenge is to end up with a completely assembled genome. RESULTS: We develop a new synteny-based method that is flexible enough to handle a model of evolution involving whole genome duplication events, in addition to rearrangements, gene insertions, and losses. Ancestral relationships between markers are defined in terms of Gapped Adjacencies, i.e. pairs of markers separated by up to a given number of markers. It improves on a previous method restricted to direct adjacencies, which revealed a high accuracy for adjacency prediction, but with the drawback of being overly conservative, i.e. of generating a large number of CARs. Applying our algorithm to various simulated data sets reveals good performance as we usually end up with a completely assembled genome, while keeping a low error rate. AVAILABILITY: All source code is available at http://www.iro.umontreal.ca/~mabrouk.
format Online
Article
Text
id pubmed-3526437
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3526437 2013-01-10 A flexible ancestral genome reconstruction method based on gapped adjacencies Gagnon, Yves Blanchette, Mathieu El-Mabrouk, Nadia BMC Bioinformatics Proceedings BACKGROUND: The "small phylogeny" problem consists in inferring ancestral genomes associated with each internal node of a phylogenetic tree of a set of extant species. Existing methods can be grouped into two main categories: the distance-based methods aiming at minimizing a total branch length, and the synteny-based (or mapping) methods that first predict a collection of relations between ancestral markers in terms of "synteny", and then assemble this collection into a set of Contiguous Ancestral Regions (CARs). The predicted CARs are likely to be more reliable as they are more directly deduced from observed conservations in extant species. However, the challenge is to end up with a completely assembled genome. RESULTS: We develop a new synteny-based method that is flexible enough to handle a model of evolution involving whole genome duplication events, in addition to rearrangements, gene insertions, and losses. Ancestral relationships between markers are defined in terms of Gapped Adjacencies, i.e. pairs of markers separated by up to a given number of markers. It improves on a previous method restricted to direct adjacencies, which revealed a high accuracy for adjacency prediction, but with the drawback of being overly conservative, i.e. of generating a large number of CARs. Applying our algorithm to various simulated data sets reveals good performance as we usually end up with a completely assembled genome, while keeping a low error rate. AVAILABILITY: All source code is available at http://www.iro.umontreal.ca/~mabrouk. BioMed Central 2012-12-19 /pmc/articles/PMC3526437/ /pubmed/23281872 http://dx.doi.org/10.1186/1471-2105-13-S19-S4 Text en Copyright ©2012 Gagnon et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Gagnon, Yves
Blanchette, Mathieu
El-Mabrouk, Nadia
A flexible ancestral genome reconstruction method based on gapped adjacencies
title A flexible ancestral genome reconstruction method based on gapped adjacencies
title_full A flexible ancestral genome reconstruction method based on gapped adjacencies
title_fullStr A flexible ancestral genome reconstruction method based on gapped adjacencies
title_full_unstemmed A flexible ancestral genome reconstruction method based on gapped adjacencies
title_short A flexible ancestral genome reconstruction method based on gapped adjacencies
title_sort flexible ancestral genome reconstruction method based on gapped adjacencies
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3526437/
https://www.ncbi.nlm.nih.gov/pubmed/23281872
http://dx.doi.org/10.1186/1471-2105-13-S19-S4