Hierarchical deep learning for predicting GO annotations by integrating protein knowledge

Bibliographic Details
Main authors: Merino, Gabriela A; Saidi, Rabie; Milone, Diego H; Stegmayer, Georgina; Martin, Maria J
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9524999/
https://www.ncbi.nlm.nih.gov/pubmed/35929781
http://dx.doi.org/10.1093/bioinformatics/btac536
collection PubMed
description MOTIVATION: Experimental testing and manual curation are the most precise ways of assigning Gene Ontology (GO) terms describing protein functions. However, they are expensive, time-consuming and cannot cope with the exponential growth of data generated by high-throughput sequencing methods. Hence, researchers need reliable computational systems to help fill the gap with automatic function prediction. The results of the last Critical Assessment of Function Annotation challenge revealed that GO term prediction remains a very challenging task. Recent developments in deep learning are pushing the frontiers of protein research and yielding new knowledge by integrating data from multiple sources. However, the deep models developed so far for function prediction focus mainly on sequence data and have not yet achieved breakthrough performance. RESULTS: We propose DeeProtGO, a novel deep-learning model for predicting GO annotations by integrating protein knowledge. DeeProtGO was trained to solve 18 different prediction problems, defined by the three GO sub-ontologies, the type of proteins, and the taxonomic kingdom. Our experiments showed higher prediction quality when more protein knowledge was integrated. We also benchmarked DeeProtGO against state-of-the-art methods on public datasets, and showed that it can improve the prediction of GO annotations. AVAILABILITY AND IMPLEMENTATION: DeeProtGO and a use case are available at https://github.com/gamerino/DeeProtGO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9524999
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9524999 2022-10-03 Bioinformatics Original Papers Oxford University Press 2022-08-05 /pmc/articles/PMC9524999/ /pubmed/35929781 http://dx.doi.org/10.1093/bioinformatics/btac536 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Original Papers