DeepGOPlus: improved protein function prediction from sequence

Bibliographic Details
Main authors: Kulmanov, Maxat; Hoehndorf, Robert
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9883727/
https://www.ncbi.nlm.nih.gov/pubmed/31350877
http://dx.doi.org/10.1093/bioinformatics/btz595
author Kulmanov, Maxat
Hoehndorf, Robert
collection PubMed
description MOTIVATION: Protein function prediction is one of the major tasks of bioinformatics and can help with a wide range of biological problems, such as understanding disease mechanisms or finding drug targets. Many methods are available for predicting protein functions from sequence-based features, protein–protein interaction networks, protein structure or literature. However, other than sequence, most of these features are difficult to obtain or not available for many proteins, which limits the scope of such methods. Furthermore, the performance of sequence-based function prediction methods is often lower than that of methods that incorporate multiple features, and predicting protein functions may require a lot of time. RESULTS: We developed a novel method for predicting protein functions from sequence alone, which combines a deep convolutional neural network (CNN) model with sequence similarity-based predictions. Our CNN model scans the sequence for motifs that are predictive of protein functions and combines this with the functions of similar proteins (if available). We evaluate the performance of DeepGOPlus using the CAFA3 evaluation measures and achieve an F_max of 0.390, 0.557 and 0.614 for the BPO, MFO and CCO evaluations, respectively. These results would have made DeepGOPlus one of the three best predictors in CCO and the second best performing method in the BPO and MFO evaluations. We also compare DeepGOPlus with state-of-the-art methods such as DeepText2GO and GOLabeler on another dataset. DeepGOPlus can annotate around 40 protein sequences per second on common hardware, thereby making fast and accurate function predictions available for a wide range of proteins. AVAILABILITY AND IMPLEMENTATION: http://deepgoplus.bio2vec.net/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
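The F_max figures quoted in the description are protein-centric scores: at each score threshold, precision is averaged over the proteins that receive at least one prediction, recall is averaged over all benchmark proteins, and F_max is the best harmonic mean across thresholds. The following is a minimal sketch of that measure, not the authors' evaluation code. The function name `f_max`, the dictionary layout, and the 0.01 threshold step are assumptions made for illustration, and the propagation of annotations to ancestor terms in the ontology is omitted.

```python
from typing import Dict, Set


def f_max(preds: Dict[str, Dict[str, float]],
          truth: Dict[str, Set[str]],
          steps: int = 100) -> float:
    """Protein-centric F_max (illustrative sketch).

    preds maps protein id -> {term id: score in [0, 1]}.
    truth maps protein id -> non-empty set of true term ids.
    """
    best = 0.0
    for i in range(1, steps + 1):
        t = i / steps
        precisions = []   # one entry per protein with >= 1 prediction at t
        recall_sum = 0.0  # summed over all benchmark proteins
        for protein, true_terms in truth.items():
            scores = preds.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            tp = len(predicted & true_terms)
            if predicted:
                precisions.append(tp / len(predicted))
            recall_sum += tp / len(true_terms)
        if not precisions:
            continue  # no protein has a prediction at this threshold
        pr = sum(precisions) / len(precisions)
        rc = recall_sum / len(truth)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best
```

With hypothetical inputs such as `preds = {"P1": {"GO:a": 0.9, "GO:b": 0.4}, "P2": {"GO:c": 0.6}}` and `truth = {"P1": {"GO:a"}, "P2": {"GO:c", "GO:d"}}`, the best threshold keeps `GO:a` and `GO:c` and drops `GO:b`, giving precision 1.0 and recall 0.75.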
format Online
Article
Text
id pubmed-9883727
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9883727 2023-02-01 DeepGOPlus: improved protein function prediction from sequence Kulmanov, Maxat Hoehndorf, Robert Bioinformatics Original Papers
Oxford University Press 2019-07-27 /pmc/articles/PMC9883727/ /pubmed/31350877 http://dx.doi.org/10.1093/bioinformatics/btz595 Text en © The Author(s) 2019. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title DeepGOPlus: improved protein function prediction from sequence
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9883727/
https://www.ncbi.nlm.nih.gov/pubmed/31350877
http://dx.doi.org/10.1093/bioinformatics/btz595