DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks
Automated protein function prediction is critical for the annotation of uncharacterized protein sequences, where accurate prediction methods are still required. Recently, deep learning based methods have outperformed conventional algorithms in computer vision and natural language processing due to t...
Main authors: | Sureyya Rifaioglu, Ahmet; Doğan, Tunca; Jesus Martin, Maria; Cetin-Atalay, Rengul; Atalay, Volkan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6517386/ https://www.ncbi.nlm.nih.gov/pubmed/31089211 http://dx.doi.org/10.1038/s41598-019-43708-3 |
_version_ | 1783418265363546112 |
---|---|
author | Sureyya Rifaioglu, Ahmet; Doğan, Tunca; Jesus Martin, Maria; Cetin-Atalay, Rengul; Atalay, Volkan |
author_facet | Sureyya Rifaioglu, Ahmet; Doğan, Tunca; Jesus Martin, Maria; Cetin-Atalay, Rengul; Atalay, Volkan |
author_sort | Sureyya Rifaioglu, Ahmet |
collection | PubMed |
description | Automated protein function prediction is critical for the annotation of uncharacterized protein sequences, where accurate prediction methods are still required. Recently, deep learning based methods have outperformed conventional algorithms in computer vision and natural language processing due to the prevention of overfitting and efficient training. Here, we propose DEEPred, a hierarchical stack of multi-task feed-forward deep neural networks, as a solution to Gene Ontology (GO) based protein function prediction. DEEPred was optimized through rigorous hyper-parameter tests, and benchmarked using three types of protein descriptors, training datasets with varying sizes and GO terms from different levels. Furthermore, in order to explore how training with larger but potentially noisy data would change the performance, electronically made GO annotations were also included in the training process. The overall predictive performance of DEEPred was assessed using CAFA2 and CAFA3 challenge datasets, in comparison with the state-of-the-art protein function prediction methods. Finally, we evaluated selected novel annotations produced by DEEPred with a literature-based case study considering the ‘biofilm formation process’ in Pseudomonas aeruginosa. This study reports that deep learning algorithms have significant potential in protein function prediction; particularly when the source data is large. The neural network architecture of DEEPred can also be applied to the prediction of the other types of ontological associations. The source code and all datasets used in this study are available at: https://github.com/cansyl/DEEPred. |
format | Online Article Text |
id | pubmed-6517386 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-6517386 2019-05-24 DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks Sureyya Rifaioglu, Ahmet; Doğan, Tunca; Jesus Martin, Maria; Cetin-Atalay, Rengul; Atalay, Volkan Sci Rep Article Automated protein function prediction is critical for the annotation of uncharacterized protein sequences, where accurate prediction methods are still required. Recently, deep learning based methods have outperformed conventional algorithms in computer vision and natural language processing due to the prevention of overfitting and efficient training. Here, we propose DEEPred, a hierarchical stack of multi-task feed-forward deep neural networks, as a solution to Gene Ontology (GO) based protein function prediction. DEEPred was optimized through rigorous hyper-parameter tests, and benchmarked using three types of protein descriptors, training datasets with varying sizes and GO terms from different levels. Furthermore, in order to explore how training with larger but potentially noisy data would change the performance, electronically made GO annotations were also included in the training process. The overall predictive performance of DEEPred was assessed using CAFA2 and CAFA3 challenge datasets, in comparison with the state-of-the-art protein function prediction methods. Finally, we evaluated selected novel annotations produced by DEEPred with a literature-based case study considering the ‘biofilm formation process’ in Pseudomonas aeruginosa. This study reports that deep learning algorithms have significant potential in protein function prediction; particularly when the source data is large. The neural network architecture of DEEPred can also be applied to the prediction of the other types of ontological associations. The source code and all datasets used in this study are available at: https://github.com/cansyl/DEEPred.
Nature Publishing Group UK 2019-05-14 /pmc/articles/PMC6517386/ /pubmed/31089211 http://dx.doi.org/10.1038/s41598-019-43708-3 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Sureyya Rifaioglu, Ahmet; Doğan, Tunca; Jesus Martin, Maria; Cetin-Atalay, Rengul; Atalay, Volkan DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks |
title | DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks |
title_full | DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks |
title_fullStr | DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks |
title_full_unstemmed | DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks |
title_short | DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks |
title_sort | deepred: automated protein function prediction with multi-task feed-forward deep neural networks |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6517386/ https://www.ncbi.nlm.nih.gov/pubmed/31089211 http://dx.doi.org/10.1038/s41598-019-43708-3 |
work_keys_str_mv | AT sureyyarifaiogluahmet deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks AT dogantunca deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks AT jesusmartinmaria deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks AT cetinatalayrengul deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks AT atalayvolkan deepredautomatedproteinfunctionpredictionwithmultitaskfeedforwarddeepneuralnetworks |