Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations
Domains are functional and structural units of proteins that govern various biological functions performed by the proteins. Therefore, the characterization of domains in a protein can serve as a proper functional representation of proteins. Here, we employ a self-supervised protocol to derive functi...
Main authors: | Ibtehaz, Nabil; Kagaya, Yuki; Kihara, Daisuke |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10473699/ https://www.ncbi.nlm.nih.gov/pubmed/37662252 http://dx.doi.org/10.1101/2023.08.23.554486 |
_version_ | 1785100322416361472 |
---|---|
author | Ibtehaz, Nabil; Kagaya, Yuki; Kihara, Daisuke |
author_facet | Ibtehaz, Nabil; Kagaya, Yuki; Kihara, Daisuke |
author_sort | Ibtehaz, Nabil |
collection | PubMed |
description | Domains are functional and structural units of proteins that govern various biological functions performed by the proteins. Therefore, the characterization of domains in a protein can serve as a proper functional representation of proteins. Here, we employ a self-supervised protocol to derive functionally consistent representations for domains by learning domain-Gene Ontology (GO) co-occurrences and associations. The domain embeddings we constructed turned out to be effective in performing actual function prediction tasks. Extensive evaluations showed that protein representations using the domain embeddings are superior to those of large-scale protein language models in GO prediction tasks. Moreover, the new function prediction method built on the domain embeddings, named Domain-PFP, significantly outperformed the state-of-the-art function predictors. Additionally, Domain-PFP demonstrated competitive performance in the CAFA3 evaluation, achieving overall the best performance among the top teams that participated in the assessment. |
format | Online Article Text |
id | pubmed-10473699 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10473699 2023-09-02 Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations Ibtehaz, Nabil; Kagaya, Yuki; Kihara, Daisuke bioRxiv Article Domains are functional and structural units of proteins that govern various biological functions performed by the proteins. Therefore, the characterization of domains in a protein can serve as a proper functional representation of proteins. Here, we employ a self-supervised protocol to derive functionally consistent representations for domains by learning domain-Gene Ontology (GO) co-occurrences and associations. The domain embeddings we constructed turned out to be effective in performing actual function prediction tasks. Extensive evaluations showed that protein representations using the domain embeddings are superior to those of large-scale protein language models in GO prediction tasks. Moreover, the new function prediction method built on the domain embeddings, named Domain-PFP, significantly outperformed the state-of-the-art function predictors. Additionally, Domain-PFP demonstrated competitive performance in the CAFA3 evaluation, achieving overall the best performance among the top teams that participated in the assessment. Cold Spring Harbor Laboratory 2023-08-24 /pmc/articles/PMC10473699/ /pubmed/37662252 http://dx.doi.org/10.1101/2023.08.23.554486 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
title | Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10473699/ https://www.ncbi.nlm.nih.gov/pubmed/37662252 http://dx.doi.org/10.1101/2023.08.23.554486 |