
PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees

BACKGROUND: Extant genomes share regions where genes have the same order and orientation, which are thought to arise from the conservation of an ancestral order of genes during evolution. Such regions of so-called conserved synteny, or synteny blocks, must be precisely identified and quantified, as...


Bibliographic Details
Main authors: Lucas, Joseph MEX, Muffato, Matthieu, Crollius, Hugues Roest
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4155083/
https://www.ncbi.nlm.nih.gov/pubmed/25103980
http://dx.doi.org/10.1186/1471-2105-15-268
_version_ 1782333518421950464
author Lucas, Joseph MEX
Muffato, Matthieu
Crollius, Hugues Roest
collection PubMed
description BACKGROUND: Extant genomes share regions where genes have the same order and orientation, which are thought to arise from the conservation of an ancestral order of genes during evolution. Such regions of so-called conserved synteny, or synteny blocks, must be precisely identified and quantified, as a prerequisite to better understand the evolutionary history of genomes. RESULTS: Here we describe PhylDiag, a software that identifies statistically significant synteny blocks in pairwise comparisons of eukaryote genomes. Compared to previous methods, PhylDiag uses gene trees to define gene homologies, thus allowing gene deletions to be considered as events that may break the synteny. PhylDiag also accounts for gene orientations, blocks of tandem duplicates and lineage specific de novo gene births. Starting from two genomes and the corresponding gene trees, PhylDiag returns synteny blocks with gaps less than or equal to the maximum gap parameter gap(max). This parameter is theoretically estimated, and together with a utility to graphically display results, contributes to making PhylDiag a user friendly method. In addition, putative synteny blocks are subject to a statistical validation to verify that they are unlikely to be due to a random combination of genes. CONCLUSIONS: We benchmark several known metrics to measure 2D-distances in a matrix of homologies and we compare PhylDiag to i-ADHoRe 3.0 on real and simulated data. We show that PhylDiag correctly identifies small synteny blocks even with insertions, deletions, incorrect annotations or micro-inversions. Finally, PhylDiag allowed us to identify the most relevant distance metric for 2D-distance calculation between homologies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2105-15-268) contains supplementary material, which is available to authorized users.
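The abstract above mentions two ideas that a small example may make more concrete: measuring 2D distances between homologies in a homology matrix, and grouping homologies into blocks whose gaps do not exceed a maximum gap. The following is a minimal sketch under stated assumptions, not PhylDiag's implementation. The function names, the greedy chaining strategy, and the choice to compare the raw distance directly against `gap_max` are all illustrative assumptions; the three metrics shown (Chebyshev, Manhattan, Euclidean) are standard distances and are not necessarily the ones benchmarked in the article.

```python
import math
from typing import Callable, List, Tuple

# A homology is a cell (i, j) of the homology matrix: gene i of genome 1
# is homologous to gene j of genome 2. (Illustrative representation.)
Point = Tuple[int, int]


def chebyshev(a: Point, b: Point) -> float:
    """Largest coordinate difference between two cells."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def manhattan(a: Point, b: Point) -> float:
    """Sum of the coordinate differences between two cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def euclidean(a: Point, b: Point) -> float:
    """Straight-line distance between two cells."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def chain_homologies(
    points: List[Point],
    gap_max: float,
    distance: Callable[[Point, Point], float] = chebyshev,
) -> List[List[Point]]:
    """Greedily group homologies into candidate blocks.

    Assumption of this sketch: a point joins the first block whose last
    point lies within `gap_max` under `distance`; otherwise it starts a
    new block. Gene orientation, tandem duplicates and statistical
    validation are deliberately left out.
    """
    blocks: List[List[Point]] = []
    for p in sorted(points):
        for block in blocks:
            if distance(block[-1], p) <= gap_max:
                block.append(p)
                break
        else:
            blocks.append([p])
    return blocks


if __name__ == "__main__":
    homologies = [(0, 0), (1, 1), (3, 3), (10, 2), (11, 3)]
    print(chain_homologies(homologies, gap_max=2))
    # [[(0, 0), (1, 1), (3, 3)], [(10, 2), (11, 3)]]
    print(chain_homologies(homologies, gap_max=2, distance=manhattan))
    # [[(0, 0), (1, 1)], [(3, 3)], [(10, 2), (11, 3)]]
```

With the same threshold, the choice of metric changes which homologies end up in one block: the step from (1, 1) to (3, 3) has Chebyshev distance 2 but Manhattan distance 4, so only the former keeps the three points together.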
format Online
Article
Text
id pubmed-4155083
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4155083 2014-09-06 PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees Lucas, Joseph MEX Muffato, Matthieu Crollius, Hugues Roest BMC Bioinformatics Research Article BioMed Central 2014-08-08 /pmc/articles/PMC4155083/ /pubmed/25103980 http://dx.doi.org/10.1186/1471-2105-15-268 Text en © Lucas et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4155083/
https://www.ncbi.nlm.nih.gov/pubmed/25103980
http://dx.doi.org/10.1186/1471-2105-15-268