PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees
BACKGROUND: Extant genomes share regions where genes have the same order and orientation, which are thought to arise from the conservation of an ancestral order of genes during evolution. Such regions of so-called conserved synteny, or synteny blocks, must be precisely identified and quantified, as...
Main Authors: | Lucas, Joseph MEX; Muffato, Matthieu; Crollius, Hugues Roest |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4155083/ https://www.ncbi.nlm.nih.gov/pubmed/25103980 http://dx.doi.org/10.1186/1471-2105-15-268 |
_version_ | 1782333518421950464 |
---|---|
author | Lucas, Joseph MEX; Muffato, Matthieu; Crollius, Hugues Roest |
author_facet | Lucas, Joseph MEX; Muffato, Matthieu; Crollius, Hugues Roest |
author_sort | Lucas, Joseph MEX |
collection | PubMed |
description | BACKGROUND: Extant genomes share regions where genes have the same order and orientation, which are thought to arise from the conservation of an ancestral order of genes during evolution. Such regions of so-called conserved synteny, or synteny blocks, must be precisely identified and quantified, as a prerequisite to better understanding the evolutionary history of genomes. RESULTS: Here we describe PhylDiag, a software tool that identifies statistically significant synteny blocks in pairwise comparisons of eukaryote genomes. Compared to previous methods, PhylDiag uses gene trees to define gene homologies, thus allowing gene deletions to be considered as events that may break the synteny. PhylDiag also accounts for gene orientations, blocks of tandem duplicates and lineage-specific de novo gene births. Starting from two genomes and the corresponding gene trees, PhylDiag returns synteny blocks with gaps less than or equal to the maximum gap parameter gap(max). This parameter is theoretically estimated and, together with a utility for displaying results graphically, helps make PhylDiag a user-friendly method. In addition, putative synteny blocks are subject to statistical validation to verify that they are unlikely to be due to a random combination of genes. CONCLUSIONS: We benchmark several known metrics to measure 2D distances in a matrix of homologies and we compare PhylDiag to i-ADHoRe 3.0 on real and simulated data. We show that PhylDiag correctly identifies small synteny blocks even with insertions, deletions, incorrect annotations or micro-inversions. Finally, PhylDiag allowed us to identify the most relevant distance metric for 2D-distance calculation between homologies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2105-15-268) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4155083 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-41550832014-09-06 PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees Lucas, Joseph MEX Muffato, Matthieu Crollius, Hugues Roest BMC Bioinformatics Research Article BACKGROUND: Extant genomes share regions where genes have the same order and orientation, which are thought to arise from the conservation of an ancestral order of genes during evolution. Such regions of so-called conserved synteny, or synteny blocks, must be precisely identified and quantified, as a prerequisite to better understanding the evolutionary history of genomes. RESULTS: Here we describe PhylDiag, a software tool that identifies statistically significant synteny blocks in pairwise comparisons of eukaryote genomes. Compared to previous methods, PhylDiag uses gene trees to define gene homologies, thus allowing gene deletions to be considered as events that may break the synteny. PhylDiag also accounts for gene orientations, blocks of tandem duplicates and lineage-specific de novo gene births. Starting from two genomes and the corresponding gene trees, PhylDiag returns synteny blocks with gaps less than or equal to the maximum gap parameter gap(max). This parameter is theoretically estimated and, together with a utility for displaying results graphically, helps make PhylDiag a user-friendly method. In addition, putative synteny blocks are subject to statistical validation to verify that they are unlikely to be due to a random combination of genes. CONCLUSIONS: We benchmark several known metrics to measure 2D distances in a matrix of homologies and we compare PhylDiag to i-ADHoRe 3.0 on real and simulated data. We show that PhylDiag correctly identifies small synteny blocks even with insertions, deletions, incorrect annotations or micro-inversions. Finally, PhylDiag allowed us to identify the most relevant distance metric for 2D-distance calculation between homologies. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2105-15-268) contains supplementary material, which is available to authorized users. BioMed Central 2014-08-08 /pmc/articles/PMC4155083/ /pubmed/25103980 http://dx.doi.org/10.1186/1471-2105-15-268 Text en © Lucas et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Lucas, Joseph MEX Muffato, Matthieu Crollius, Hugues Roest PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees |
title | PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees |
title_sort | phyldiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4155083/ https://www.ncbi.nlm.nih.gov/pubmed/25103980 http://dx.doi.org/10.1186/1471-2105-15-268 |
work_keys_str_mv | AT lucasjosephmex phyldiagidentifyingcomplexsyntenyblocksthatincludetandemduplicationsusingphylogeneticgenetrees AT muffatomatthieu phyldiagidentifyingcomplexsyntenyblocksthatincludetandemduplicationsusingphylogeneticgenetrees AT crolliushuguesroest phyldiagidentifyingcomplexsyntenyblocksthatincludetandemduplicationsusingphylogeneticgenetrees |
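
The description field above mentions two ideas that are easier to picture with a small example: synteny blocks whose gaps stay within a maximum gap parameter, and several 2D distance metrics measured between homologies in a matrix. The following is a minimal sketch of those ideas only, not PhylDiag's implementation. The function names, the greedy chaining strategy, and the convention that a gap equals the distance minus one are all assumptions made for illustration; gene orientation, tandem duplicates and statistical validation are ignored.

```python
from typing import Callable, List, Tuple

# A homology is a point in the homology matrix:
# (gene index in genome 1, gene index in genome 2).
Point = Tuple[int, int]


def chebyshev(a: Point, b: Point) -> float:
    """Chebyshev distance: the larger of the two coordinate offsets."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def manhattan(a: Point, b: Point) -> float:
    """Manhattan distance: the sum of the two coordinate offsets."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance: the straight-line distance in the matrix."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def chain_homologies(
    points: List[Point],
    gap_max: float,
    distance: Callable[[Point, Point], float] = chebyshev,
) -> List[List[Point]]:
    """Greedily group homologies into candidate blocks (hypothetical helper).

    Points are visited in sorted order. A point joins the current block when
    its gap to the block's last point is at most gap_max; otherwise it starts
    a new block. ASSUMPTION: gap = distance - 1, so that two homologies that
    are diagonal neighbours have a Chebyshev gap of 0.
    """
    blocks: List[List[Point]] = []
    for p in sorted(points):
        if blocks and distance(blocks[-1][-1], p) - 1 <= gap_max:
            blocks[-1].append(p)
        else:
            blocks.append([p])
    return blocks


if __name__ == "__main__":
    homologies = [(0, 0), (1, 1), (2, 3), (10, 10), (11, 11)]
    for metric in (chebyshev, manhattan, euclidean):
        print(metric.__name__, chain_homologies(homologies, gap_max=1, distance=metric))
```

The choice of metric changes which homologies end up in the same candidate block for a given threshold, which is why comparing metrics, as the description says the authors did, matters in practice.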