Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations

Domains are functional and structural units of proteins that govern various biological functions performed by the proteins. Therefore, the characterization of domains in a protein can serve as a proper functional representation of proteins. Here, we employ a self-supervised protocol to derive functi...

Bibliographic Details
Main Authors: Ibtehaz, Nabil, Kagaya, Yuki, Kihara, Daisuke
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10473699/
https://www.ncbi.nlm.nih.gov/pubmed/37662252
http://dx.doi.org/10.1101/2023.08.23.554486
author Ibtehaz, Nabil
Kagaya, Yuki
Kihara, Daisuke
collection PubMed
description Domains are functional and structural units of proteins that govern various biological functions performed by the proteins. Therefore, the characterization of domains in a protein can serve as a proper functional representation of proteins. Here, we employ a self-supervised protocol to derive functionally consistent representations for domains by learning domain-Gene Ontology (GO) co-occurrences and associations. The domain embeddings we constructed turned out to be effective in performing actual function prediction tasks. Extensive evaluations showed that protein representations using the domain embeddings are superior to those of large-scale protein language models in GO prediction tasks. Moreover, the new function prediction method built on the domain embeddings, named Domain-PFP, significantly outperformed the state-of-the-art function predictors. Additionally, Domain-PFP demonstrated competitive performance in the CAFA3 evaluation, achieving overall the best performance among the top teams that participated in the assessment.
format Online
Article
Text
id pubmed-10473699
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10473699 2023-09-02 Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations. Ibtehaz, Nabil; Kagaya, Yuki; Kihara, Daisuke. bioRxiv, Article. Cold Spring Harbor Laboratory 2023-08-24 /pmc/articles/PMC10473699/ /pubmed/37662252 http://dx.doi.org/10.1101/2023.08.23.554486 Text en. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations
topic Article