Orthology Detection Combining Clustering and Synteny for Very Large Datasets

The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too hig...

Bibliographic Details
Main authors: Lechner, Marcus, Hernandez-Rosales, Maribel, Doerr, Daniel, Wieseke, Nicolas, Thévenin, Annelyse, Stoye, Jens, Hartmann, Roland K., Prohaska, Sonja J., Stadler, Peter F.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4138177/
https://www.ncbi.nlm.nih.gov/pubmed/25137074
http://dx.doi.org/10.1371/journal.pone.0105015
author Lechner, Marcus
Hernandez-Rosales, Maribel
Doerr, Daniel
Wieseke, Nicolas
Thévenin, Annelyse
Stoye, Jens
Hartmann, Roland K.
Prohaska, Sonja J.
Stadler, Peter F.
collection PubMed
description The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets.
format Online
Article
Text
id pubmed-4138177
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4138177 2014-08-20 PLoS One Research Article Public Library of Science 2014-08-19 /pmc/articles/PMC4138177/ /pubmed/25137074 http://dx.doi.org/10.1371/journal.pone.0105015 Text en © 2014 Lechner et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Orthology Detection Combining Clustering and Synteny for Very Large Datasets
topic Research Article