
DeepNOG: fast and accurate protein orthologous group assignment

MOTIVATION: Protein orthologous group databases are powerful tools for evolutionary analysis, functional annotation or metabolic pathway modeling across lineages. Sequences are typically assigned to orthologous groups with alignment-based methods, such as profile hidden Markov models, which have bec...


Bibliographic Details
Main authors: Feldbauer, Roman; Gosch, Lukas; Lüftinger, Lukas; Hyden, Patrick; Flexer, Arthur; Rattei, Thomas
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8016488/
https://www.ncbi.nlm.nih.gov/pubmed/33367584
http://dx.doi.org/10.1093/bioinformatics/btaa1051
author Feldbauer, Roman
Gosch, Lukas
Lüftinger, Lukas
Hyden, Patrick
Flexer, Arthur
Rattei, Thomas
collection PubMed
description MOTIVATION: Protein orthologous group databases are powerful tools for evolutionary analysis, functional annotation or metabolic pathway modeling across lineages. Sequences are typically assigned to orthologous groups with alignment-based methods, such as profile hidden Markov models, which have become a computational bottleneck. RESULTS: We present DeepNOG, an extremely fast and accurate, alignment-free orthology assignment method based on deep convolutional networks. We compare DeepNOG against state-of-the-art alignment-based (HMMER, DIAMOND) and alignment-free methods (DeepFam) on two orthology databases (COG, eggNOG 5). DeepNOG can be scaled to large orthology databases like eggNOG, for which it outperforms DeepFam in terms of precision and recall by large margins. While alignment-based methods still provide the most accurate assignments among the investigated methods, computing time of DeepNOG is an order of magnitude lower on CPUs. Optional GPU usage further increases throughput massively. A command-line tool enables rapid adoption by users. AVAILABILITY AND IMPLEMENTATION: Source code and packages are freely available at https://github.com/univieCUBE/deepnog. Install the platform-independent Python program with $ pip install deepnog. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8016488
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8016488 2021-04-07 DeepNOG: fast and accurate protein orthologous group assignment Feldbauer, Roman Gosch, Lukas Lüftinger, Lukas Hyden, Patrick Flexer, Arthur Rattei, Thomas Bioinformatics Original Papers MOTIVATION: Protein orthologous group databases are powerful tools for evolutionary analysis, functional annotation or metabolic pathway modeling across lineages. Sequences are typically assigned to orthologous groups with alignment-based methods, such as profile hidden Markov models, which have become a computational bottleneck. RESULTS: We present DeepNOG, an extremely fast and accurate, alignment-free orthology assignment method based on deep convolutional networks. We compare DeepNOG against state-of-the-art alignment-based (HMMER, DIAMOND) and alignment-free methods (DeepFam) on two orthology databases (COG, eggNOG 5). DeepNOG can be scaled to large orthology databases like eggNOG, for which it outperforms DeepFam in terms of precision and recall by large margins. While alignment-based methods still provide the most accurate assignments among the investigated methods, computing time of DeepNOG is an order of magnitude lower on CPUs. Optional GPU usage further increases throughput massively. A command-line tool enables rapid adoption by users. AVAILABILITY AND IMPLEMENTATION: Source code and packages are freely available at https://github.com/univieCUBE/deepnog. Install the platform-independent Python program with $ pip install deepnog. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2020-12-26 /pmc/articles/PMC8016488/ /pubmed/33367584 http://dx.doi.org/10.1093/bioinformatics/btaa1051 Text en © The Author(s) 2020. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title DeepNOG: fast and accurate protein orthologous group assignment
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8016488/
https://www.ncbi.nlm.nih.gov/pubmed/33367584
http://dx.doi.org/10.1093/bioinformatics/btaa1051