Proteinortho: Detection of (Co-)orthologs in large-scale analysis

BACKGROUND: Orthology analysis is an important part of data analysis in many areas of bioinformatics such as comparative genomics and molecular phylogenetics. The ever-increasing flood of sequence data, and hence the rapidly increasing number of genomes that can be compared simultaneously, calls for...

Bibliographic Details
Main Authors: Lechner, Marcus, Findeiß, Sven, Steiner, Lydia, Marz, Manja, Stadler, Peter F, Prohaska, Sonja J
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3114741/
https://www.ncbi.nlm.nih.gov/pubmed/21526987
http://dx.doi.org/10.1186/1471-2105-12-124
_version_ 1782206102857842688
author Lechner, Marcus
Findeiß, Sven
Steiner, Lydia
Marz, Manja
Stadler, Peter F
Prohaska, Sonja J
author_sort Lechner, Marcus
collection PubMed
description BACKGROUND: Orthology analysis is an important part of data analysis in many areas of bioinformatics such as comparative genomics and molecular phylogenetics. The ever-increasing flood of sequence data, and hence the rapidly increasing number of genomes that can be compared simultaneously, calls for efficient software tools as brute-force approaches with quadratic memory requirements become infeasible in practice. The rapid pace at which new data become available, furthermore, makes it desirable to compute genome-wide orthology relations for a given dataset rather than relying on relations listed in databases. RESULTS: The program Proteinortho described here is a stand-alone tool that is geared towards large datasets and makes use of distributed computing techniques when run on multi-core hardware. It implements an extended version of the reciprocal best alignment heuristic. We apply Proteinortho to compute orthologous proteins in the complete set of all 717 eubacterial genomes available at NCBI at the beginning of 2009. We identified thirty proteins present in 99% of all bacterial proteomes. CONCLUSIONS: Proteinortho significantly reduces the required amount of memory for orthology analysis compared to existing tools, allowing such computations to be performed on off-the-shelf hardware.
format Online
Article
Text
id pubmed-3114741
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3114741 2011-06-15 Proteinortho: Detection of (Co-)orthologs in large-scale analysis Lechner, Marcus Findeiß, Sven Steiner, Lydia Marz, Manja Stadler, Peter F Prohaska, Sonja J BMC Bioinformatics Methodology Article BACKGROUND: Orthology analysis is an important part of data analysis in many areas of bioinformatics such as comparative genomics and molecular phylogenetics. The ever-increasing flood of sequence data, and hence the rapidly increasing number of genomes that can be compared simultaneously, calls for efficient software tools as brute-force approaches with quadratic memory requirements become infeasible in practice. The rapid pace at which new data become available, furthermore, makes it desirable to compute genome-wide orthology relations for a given dataset rather than relying on relations listed in databases. RESULTS: The program Proteinortho described here is a stand-alone tool that is geared towards large datasets and makes use of distributed computing techniques when run on multi-core hardware. It implements an extended version of the reciprocal best alignment heuristic. We apply Proteinortho to compute orthologous proteins in the complete set of all 717 eubacterial genomes available at NCBI at the beginning of 2009. We identified thirty proteins present in 99% of all bacterial proteomes. CONCLUSIONS: Proteinortho significantly reduces the required amount of memory for orthology analysis compared to existing tools, allowing such computations to be performed on off-the-shelf hardware. BioMed Central 2011-04-28 /pmc/articles/PMC3114741/ /pubmed/21526987 http://dx.doi.org/10.1186/1471-2105-12-124 Text en Copyright ©2011 Lechner et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Proteinortho: Detection of (Co-)orthologs in large-scale analysis
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3114741/
https://www.ncbi.nlm.nih.gov/pubmed/21526987
http://dx.doi.org/10.1186/1471-2105-12-124