PhyloPat: phylogenetic pattern analysis of eukaryotic genes

BACKGROUND: Phylogenetic patterns show the presence or absence of certain genes or proteins in a set of species. They can also be used to determine sets of genes or proteins that occur only in certain evolutionary branches. Phylogenetic pattern analysis has routinely been applied to protein databas...

Bibliographic Details
Main Authors: Hulsen, Tim, de Vlieg, Jacob, Groenen, Peter MA
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1570148/
https://www.ncbi.nlm.nih.gov/pubmed/16948844
http://dx.doi.org/10.1186/1471-2105-7-398
author Hulsen, Tim
de Vlieg, Jacob
Groenen, Peter MA
collection PubMed
description BACKGROUND: Phylogenetic patterns show the presence or absence of certain genes or proteins in a set of species. They can also be used to determine sets of genes or proteins that occur only in certain evolutionary branches. Phylogenetic pattern analysis has routinely been applied to protein databases such as COG and OrthoMCL, but not to gene databases. Here we present a tool named PhyloPat which allows the complete Ensembl gene database to be queried using phylogenetic patterns. DESCRIPTION: PhyloPat is an easy-to-use webserver, which can be used to query the orthologies of all complete genomes within the EnsMart database using phylogenetic patterns. This enables the determination of sets of genes that occur only in certain evolutionary branches or even single species. In total, we found 446,825 genes and 3,164,088 orthologous relationships within the EnsMart v40 database. We used a single linkage clustering algorithm to create 147,922 phylogenetic lineages, using all of the orthologies provided by Ensembl. PhyloPat provides the possibility of querying with either binary phylogenetic patterns (created by checkboxes) or regular expressions. Specific branches of a phylogenetic tree of the 21 included species can be selected to create a branch-specific phylogenetic pattern. Users can also input a list of Ensembl or EMBL IDs to check which phylogenetic lineage any gene belongs to. The output can be saved in HTML, Excel or plain text format for further analysis. A link to the FatiGO web interface has been incorporated in the HTML output, creating easy access to functional information. Finally, lists of omnipresent, polypresent and oligopresent genes have been included. CONCLUSION: PhyloPat is the first tool to combine complete genome information with phylogenetic pattern querying. Since we used the orthologies generated by the accurate pipeline of Ensembl, the obtained phylogenetic lineages are reliable. The completeness and reliability of these phylogenetic lineages will further increase with the addition of newly found orthologous relationships within each new Ensembl release.
format Text
id pubmed-1570148
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1570148 2006-09-19 PhyloPat: phylogenetic pattern analysis of eukaryotic genes Hulsen, Tim de Vlieg, Jacob Groenen, Peter MA BMC Bioinformatics Database BioMed Central 2006-09-01 /pmc/articles/PMC1570148/ /pubmed/16948844 http://dx.doi.org/10.1186/1471-2105-7-398 Text en Copyright © 2006 Hulsen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title PhyloPat: phylogenetic pattern analysis of eukaryotic genes
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1570148/
https://www.ncbi.nlm.nih.gov/pubmed/16948844
http://dx.doi.org/10.1186/1471-2105-7-398
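
The abstract in this record describes two steps that a short example may make more concrete: grouping genes into phylogenetic lineages by single linkage clustering over orthology pairs, and querying those lineages with binary patterns or regular expressions. The following is a minimal sketch under stated assumptions, not the authors' implementation. The gene IDs, the three-species list, the "1"/"0" pattern encoding, and the function names `single_linkage`, `pattern` and `query` are all hypothetical, and Python regular-expression syntax is assumed for the queries.

```python
import re
from collections import defaultdict

# Hypothetical species order. PhyloPat covers 21 species; three are used here.
SPECIES = ["human", "mouse", "zebrafish"]


def single_linkage(genes, orthologies):
    """Group genes into lineages: two genes share a lineage whenever
    a chain of orthology pairs connects them (union-find)."""
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in orthologies:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for g in genes:
        clusters[find(g)].add(g)
    return list(clusters.values())


def pattern(lineage, species_of):
    """Binary presence/absence string for one lineage, one character per species."""
    present = {species_of[g] for g in lineage}
    return "".join("1" if s in present else "0" for s in SPECIES)


def query(lineages, species_of, regex):
    """Return the lineages whose whole pattern matches the regular expression."""
    rx = re.compile(regex)
    return [sorted(l) for l in lineages if rx.fullmatch(pattern(l, species_of))]


# Toy data (hypothetical gene IDs, not real Ensembl identifiers).
species_of = {
    "h1": "human", "m1": "mouse", "z1": "zebrafish",
    "h2": "human", "m2": "mouse", "z2": "zebrafish",
}
orthologies = [("h1", "m1"), ("m1", "z1"), ("h2", "m2")]

lineages = single_linkage(list(species_of), orthologies)
mammal_only = query(lineages, species_of, "110")      # present in human and mouse only
human_and_mouse = query(lineages, species_of, "11.")  # zebrafish presence left open
print(mammal_only)
print(human_and_mouse)
```

In this sketch a fixed binary pattern such as `110` is simply a regular expression with no wildcards, so both query styles mentioned in the abstract go through the same matching function. Single linkage is deliberately permissive: `h1` and `z1` end up in the same lineage through `m1` even though no direct orthology pair links them.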