ProteInfer, deep neural networks for protein functional inference

Predicting the function of a protein from its amino acid sequence is a long-standing challenge in bioinformatics. Traditional approaches use sequence alignment to compare a query sequence either to thousands of models of protein families or to large databases of individual protein sequences. Here we...

Bibliographic Details
Main authors: Sanderson, Theo, Bileschi, Maxwell L, Belanger, David, Colwell, Lucy J
Format: Online Article Text
Language: English
Published: eLife Sciences Publications, Ltd 2023
Subjects: Computational and Systems Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10063232/
https://www.ncbi.nlm.nih.gov/pubmed/36847334
http://dx.doi.org/10.7554/eLife.80942
author Sanderson, Theo
Bileschi, Maxwell L
Belanger, David
Colwell, Lucy J
collection PubMed
description Predicting the function of a protein from its amino acid sequence is a long-standing challenge in bioinformatics. Traditional approaches use sequence alignment to compare a query sequence either to thousands of models of protein families or to large databases of individual protein sequences. Here we introduce ProteInfer, which instead employs deep convolutional neural networks to directly predict a variety of protein functions – Enzyme Commission (EC) numbers and Gene Ontology (GO) terms – directly from an unaligned amino acid sequence. This approach provides precise predictions which complement alignment-based methods, and the computational efficiency of a single neural network permits novel and lightweight software interfaces, which we demonstrate with an in-browser graphical interface for protein function prediction in which all computation is performed on the user’s personal computer with no data uploaded to remote servers. Moreover, these models place full-length amino acid sequences into a generalised functional space, facilitating downstream analysis and interpretation. To read the interactive version of this paper, please visit https://google-research.github.io/proteinfer/.
format Online
Article
Text
id pubmed-10063232
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher eLife Sciences Publications, Ltd
record_format MEDLINE/PubMed
spelling pubmed-10063232 2023-03-31 ProteInfer, deep neural networks for protein functional inference Sanderson, Theo Bileschi, Maxwell L Belanger, David Colwell, Lucy J eLife Computational and Systems Biology eLife Sciences Publications, Ltd 2023-02-27 /pmc/articles/PMC10063232/ /pubmed/36847334 http://dx.doi.org/10.7554/eLife.80942 Text en © 2023, Sanderson, Bileschi et al https://creativecommons.org/licenses/by/4.0/ This article is distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use and redistribution provided that the original author and source are credited.
title ProteInfer, deep neural networks for protein functional inference
topic Computational and Systems Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10063232/
https://www.ncbi.nlm.nih.gov/pubmed/36847334
http://dx.doi.org/10.7554/eLife.80942