Orthology Detection Combining Clustering and Synteny for Very Large Datasets
The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too hig...
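Main Authors: | Lechner, Marcus; Hernandez-Rosales, Maribel; Doerr, Daniel; Wieseke, Nicolas; Thévenin, Annelyse; Stoye, Jens; Hartmann, Roland K.; Prohaska, Sonja J.; Stadler, Peter F. |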
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4138177/ https://www.ncbi.nlm.nih.gov/pubmed/25137074 http://dx.doi.org/10.1371/journal.pone.0105015 |
_version_ | 1782331205759270912 |
---|---|
author | Lechner, Marcus Hernandez-Rosales, Maribel Doerr, Daniel Wieseke, Nicolas Thévenin, Annelyse Stoye, Jens Hartmann, Roland K. Prohaska, Sonja J. Stadler, Peter F. |
author_facet | Lechner, Marcus Hernandez-Rosales, Maribel Doerr, Daniel Wieseke, Nicolas Thévenin, Annelyse Stoye, Jens Hartmann, Roland K. Prohaska, Sonja J. Stadler, Peter F. |
author_sort | Lechner, Marcus |
collection | PubMed |
description | The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets. |
format | Online Article Text |
id | pubmed-4138177 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-41381772014-08-20 Orthology Detection Combining Clustering and Synteny for Very Large Datasets Lechner, Marcus Hernandez-Rosales, Maribel Doerr, Daniel Wieseke, Nicolas Thévenin, Annelyse Stoye, Jens Hartmann, Roland K. Prohaska, Sonja J. Stadler, Peter F. PLoS One Research Article The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets. Public Library of Science 2014-08-19 /pmc/articles/PMC4138177/ /pubmed/25137074 http://dx.doi.org/10.1371/journal.pone.0105015 Text en © 2014 Lechner et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Lechner, Marcus Hernandez-Rosales, Maribel Doerr, Daniel Wieseke, Nicolas Thévenin, Annelyse Stoye, Jens Hartmann, Roland K. Prohaska, Sonja J. Stadler, Peter F. Orthology Detection Combining Clustering and Synteny for Very Large Datasets |
title | Orthology Detection Combining Clustering and Synteny for Very Large Datasets |
title_full | Orthology Detection Combining Clustering and Synteny for Very Large Datasets |
title_fullStr | Orthology Detection Combining Clustering and Synteny for Very Large Datasets |
title_full_unstemmed | Orthology Detection Combining Clustering and Synteny for Very Large Datasets |
title_short | Orthology Detection Combining Clustering and Synteny for Very Large Datasets |
title_sort | orthology detection combining clustering and synteny for very large datasets |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4138177/ https://www.ncbi.nlm.nih.gov/pubmed/25137074 http://dx.doi.org/10.1371/journal.pone.0105015 |
work_keys_str_mv | AT lechnermarcus orthologydetectioncombiningclusteringandsyntenyforverylargedatasets AT hernandezrosalesmaribel orthologydetectioncombiningclusteringandsyntenyforverylargedatasets AT doerrdaniel orthologydetectioncombiningclusteringandsyntenyforverylargedatasets AT wiesekenicolas orthologydetectioncombiningclusteringandsyntenyforverylargedatasets AT theveninannelyse orthologydetectioncombiningclusteringandsyntenyforverylargedatasets AT stoyejens orthologydetectioncombiningclusteringandsyntenyforverylargedatasets AT hartmannrolandk orthologydetectioncombiningclusteringandsyntenyforverylargedatasets AT prohaskasonjaj orthologydetectioncombiningclusteringandsyntenyforverylargedatasets AT stadlerpeterf orthologydetectioncombiningclusteringandsyntenyforverylargedatasets |