
OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups

The OrthoMCL database houses ortholog group predictions for 55 species, including 16 bacterial and 4 archaeal genomes representing phylogenetically diverse lineages, and most currently available complete eukaryotic genomes: 24 unikonts (12 animals, 9 fungi, microsporidium, Dictyostelium, Entamoeb...


Bibliographic Details
Main Authors: Chen, Feng; Mackey, Aaron J.; Stoeckert, Christian J.; Roos, David S.
Format: Text
Language: English
Published: Oxford University Press 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347485/
https://www.ncbi.nlm.nih.gov/pubmed/16381887
http://dx.doi.org/10.1093/nar/gkj123
_version_ 1782126630270926848
author Chen, Feng
Mackey, Aaron J.
Stoeckert, Christian J.
Roos, David S.
author_facet Chen, Feng
Mackey, Aaron J.
Stoeckert, Christian J.
Roos, David S.
author_sort Chen, Feng
collection PubMed
description The OrthoMCL database houses ortholog group predictions for 55 species, including 16 bacterial and 4 archaeal genomes representing phylogenetically diverse lineages, and most currently available complete eukaryotic genomes: 24 unikonts (12 animals, 9 fungi, microsporidium, Dictyostelium, Entamoeba), 4 plants/algae and 7 apicomplexan parasites. OrthoMCL software was used to cluster proteins based on sequence similarity, using an all-against-all BLAST search of each species' proteome, followed by normalization of inter-species differences, and Markov clustering. A total of 511 797 proteins (81.6% of the total dataset) were clustered into 70 388 ortholog groups. The ortholog database may be queried based on protein or group accession numbers, keyword descriptions or BLAST similarity. Ortholog groups exhibiting specific phyletic patterns may also be identified, using either a graphical interface or a text-based Phyletic Pattern Expression grammar. Information for ortholog groups includes the phyletic profile, the list of member proteins and a multiple sequence alignment, a statistical summary and graphical view of similarities, and a graphical representation of domain architecture. OrthoMCL software, the entire FASTA dataset employed and clustering results are available for download. OrthoMCL-DB provides a centralized warehouse for orthology prediction among multiple species, and will be updated and expanded as additional genome sequence data become available.
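The Markov clustering step named in the description can be illustrated with a small sketch. This is not the OrthoMCL implementation: it is a generic, pure-Python version of the expansion/inflation iteration, and the function names, the inflation value of 2.0, the self-loop weight and the toy similarity matrix are all assumptions made for illustration. The matrix entries stand in for normalized BLAST similarity scores between five hypothetical proteins.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation: raise entries to a power, then renormalize columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def markov_cluster(weights, inflation=2.0, max_iter=50, tol=1e-9):
    """Cluster nodes of a weighted similarity graph (illustrative only)."""
    n = len(weights)
    # Add self-loops (assumed weight 1.0) so every node keeps some flow.
    m = [[weights[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Rows that still carry flow act as attractors; the columns they
    # reach form one cluster. Identical clusters are merged via the set.
    found = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            found.add(members)
    return sorted(sorted(c) for c in found)


# Made-up similarity scores: proteins 0-2 resemble each other, as do 3-4.
weights = [
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
clusters = markov_cluster(weights)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

In this toy input the two groups share no edges, so the result is easy to verify by hand; on real similarity graphs the inflation parameter controls how finely weakly connected groups are split.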
format Text
id pubmed-1347485
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-13474852006-01-25 OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups Chen, Feng Mackey, Aaron J. Stoeckert, Christian J. Roos, David S. Nucleic Acids Res Article The OrthoMCL database houses ortholog group predictions for 55 species, including 16 bacterial and 4 archaeal genomes representing phylogenetically diverse lineages, and most currently available complete eukaryotic genomes: 24 unikonts (12 animals, 9 fungi, microsporidium, Dictyostelium, Entamoeba), 4 plants/algae and 7 apicomplexan parasites. OrthoMCL software was used to cluster proteins based on sequence similarity, using an all-against-all BLAST search of each species' proteome, followed by normalization of inter-species differences, and Markov clustering. A total of 511 797 proteins (81.6% of the total dataset) were clustered into 70 388 ortholog groups. The ortholog database may be queried based on protein or group accession numbers, keyword descriptions or BLAST similarity. Ortholog groups exhibiting specific phyletic patterns may also be identified, using either a graphical interface or a text-based Phyletic Pattern Expression grammar. Information for ortholog groups includes the phyletic profile, the list of member proteins and a multiple sequence alignment, a statistical summary and graphical view of similarities, and a graphical representation of domain architecture. OrthoMCL software, the entire FASTA dataset employed and clustering results are available for download. OrthoMCL-DB provides a centralized warehouse for orthology prediction among multiple species, and will be updated and expanded as additional genome sequence data become available. Oxford University Press 2006-01-01 2005-12-28 /pmc/articles/PMC1347485/ /pubmed/16381887 http://dx.doi.org/10.1093/nar/gkj123 Text en © The Author 2006. Published by Oxford University Press. All rights reserved
spellingShingle Article
Chen, Feng
Mackey, Aaron J.
Stoeckert, Christian J.
Roos, David S.
OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups
title OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups
title_full OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups
title_fullStr OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups
title_full_unstemmed OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups
title_short OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups
title_sort orthomcl-db: querying a comprehensive multi-species collection of ortholog groups
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347485/
https://www.ncbi.nlm.nih.gov/pubmed/16381887
http://dx.doi.org/10.1093/nar/gkj123