DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks

Automated protein function prediction is critical for the annotation of uncharacterized protein sequences, where accurate prediction methods are still required. Recently, deep learning based methods have outperformed conventional algorithms in computer vision and natural language processing due to t...

Bibliographic Details
Main Authors: Sureyya Rifaioglu, Ahmet; Doğan, Tunca; Jesus Martin, Maria; Cetin-Atalay, Rengul; Atalay, Volkan
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6517386/
https://www.ncbi.nlm.nih.gov/pubmed/31089211
http://dx.doi.org/10.1038/s41598-019-43708-3
_version_ 1783418265363546112
author Sureyya Rifaioglu, Ahmet
Doğan, Tunca
Jesus Martin, Maria
Cetin-Atalay, Rengul
Atalay, Volkan
author_facet Sureyya Rifaioglu, Ahmet
Doğan, Tunca
Jesus Martin, Maria
Cetin-Atalay, Rengul
Atalay, Volkan
author_sort Sureyya Rifaioglu, Ahmet
collection PubMed
description Automated protein function prediction is critical for the annotation of uncharacterized protein sequences, where accurate prediction methods are still required. Recently, deep learning based methods have outperformed conventional algorithms in computer vision and natural language processing due to the prevention of overfitting and efficient training. Here, we propose DEEPred, a hierarchical stack of multi-task feed-forward deep neural networks, as a solution to Gene Ontology (GO) based protein function prediction. DEEPred was optimized through rigorous hyper-parameter tests, and benchmarked using three types of protein descriptors, training datasets with varying sizes and GO terms from different levels. Furthermore, in order to explore how training with larger but potentially noisy data would change the performance, electronically made GO annotations were also included in the training process. The overall predictive performance of DEEPred was assessed using CAFA2 and CAFA3 challenge datasets, in comparison with the state-of-the-art protein function prediction methods. Finally, we evaluated selected novel annotations produced by DEEPred with a literature-based case study considering the ‘biofilm formation process’ in Pseudomonas aeruginosa. This study reports that deep learning algorithms have significant potential in protein function prediction; particularly when the source data is large. The neural network architecture of DEEPred can also be applied to the prediction of the other types of ontological associations. The source code and all datasets used in this study are available at: https://github.com/cansyl/DEEPred.
format Online
Article
Text
id pubmed-6517386
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-65173862019-05-24 DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks Sureyya Rifaioglu, Ahmet Doğan, Tunca Jesus Martin, Maria Cetin-Atalay, Rengul Atalay, Volkan Sci Rep Article Automated protein function prediction is critical for the annotation of uncharacterized protein sequences, where accurate prediction methods are still required. Recently, deep learning based methods have outperformed conventional algorithms in computer vision and natural language processing due to the prevention of overfitting and efficient training. Here, we propose DEEPred, a hierarchical stack of multi-task feed-forward deep neural networks, as a solution to Gene Ontology (GO) based protein function prediction. DEEPred was optimized through rigorous hyper-parameter tests, and benchmarked using three types of protein descriptors, training datasets with varying sizes and GO terms from different levels. Furthermore, in order to explore how training with larger but potentially noisy data would change the performance, electronically made GO annotations were also included in the training process. The overall predictive performance of DEEPred was assessed using CAFA2 and CAFA3 challenge datasets, in comparison with the state-of-the-art protein function prediction methods. Finally, we evaluated selected novel annotations produced by DEEPred with a literature-based case study considering the ‘biofilm formation process’ in Pseudomonas aeruginosa. This study reports that deep learning algorithms have significant potential in protein function prediction; particularly when the source data is large. The neural network architecture of DEEPred can also be applied to the prediction of the other types of ontological associations. The source code and all datasets used in this study are available at: https://github.com/cansyl/DEEPred.
Nature Publishing Group UK 2019-05-14 /pmc/articles/PMC6517386/ /pubmed/31089211 http://dx.doi.org/10.1038/s41598-019-43708-3 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Sureyya Rifaioglu, Ahmet
Doğan, Tunca
Jesus Martin, Maria
Cetin-Atalay, Rengul
Atalay, Volkan
DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks
title DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks
title_full DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks
title_fullStr DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks
title_full_unstemmed DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks
title_short DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks
title_sort deepred: automated protein function prediction with multi-task feed-forward deep neural networks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6517386/
https://www.ncbi.nlm.nih.gov/pubmed/31089211
http://dx.doi.org/10.1038/s41598-019-43708-3
work_keys_str_mv AT sureyyarifaiogluahmet deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks
AT dogantunca deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks
AT jesusmartinmaria deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks
AT cetinatalayrengul deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks
AT atalayvolkan deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks