QAUST: Protein Function Prediction Using Structure Similarity, Protein Interaction, and Functional Motifs

The number of available protein sequences in public databases is increasing exponentially. However, a significant percentage of these sequences lack functional annotation, which is essential for the understanding of how biological systems operate. Here, we propose a novel method, Quantitative Annota...

Bibliographic Details
Main Authors: Smaili, Fatima Zohra; Tian, Shuye; Roy, Ambrish; Alazmi, Meshari; Arold, Stefan T.; Mukherjee, Srayanta; Hefty, P. Scott; Chen, Wei; Gao, Xin
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects: Method
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9403031/
https://www.ncbi.nlm.nih.gov/pubmed/33631427
http://dx.doi.org/10.1016/j.gpb.2021.02.001
author Smaili, Fatima Zohra
Tian, Shuye
Roy, Ambrish
Alazmi, Meshari
Arold, Stefan T.
Mukherjee, Srayanta
Hefty, P. Scott
Chen, Wei
Gao, Xin
collection PubMed
description The number of available protein sequences in public databases is increasing exponentially. However, a significant percentage of these sequences lack functional annotation, which is essential for the understanding of how biological systems operate. Here, we propose a novel method, Quantitative Annotation of Unknown STructure (QAUST), to infer protein functions, specifically Gene Ontology (GO) terms and Enzyme Commission (EC) numbers. QAUST uses three sources of information: structure information encoded by global and local structure similarity search, biological network information inferred by protein–protein interaction data, and sequence information extracted from functionally discriminative sequence motifs. These three pieces of information are combined by consensus averaging to make the final prediction. Our approach has been tested on 500 protein targets from the Critical Assessment of Functional Annotation (CAFA) benchmark set. The results show that our method provides accurate functional annotation and outperforms other prediction methods based on sequence similarity search or threading. We further demonstrate that a previously unknown function of human tripartite motif-containing 22 (TRIM22) protein predicted by QAUST can be experimentally validated.
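The description states that the three evidence sources are combined by consensus averaging. A minimal sketch of that idea follows, assuming each source yields a confidence score per GO term and that a term missing from a source counts as 0.0. The function name, the score values, and the missing-term rule are illustrative assumptions, not details taken from the QAUST paper.

```python
from collections import defaultdict


def consensus_average(source_scores):
    """Average per-term scores over all sources.

    source_scores: list of dicts mapping a GO term ID to a score in [0, 1].
    Assumption: a term absent from a source contributes 0.0 to the average.
    """
    if not source_scores:
        return {}
    totals = defaultdict(float)
    for scores in source_scores:
        for term, score in scores.items():
            totals[term] += score
    n_sources = len(source_scores)
    return {term: total / n_sources for term, total in totals.items()}


# Hypothetical scores from the three evidence sources named in the description.
structure = {"GO:0003824": 0.9, "GO:0005515": 0.4}
interaction = {"GO:0005515": 0.8}
motif = {"GO:0003824": 0.6, "GO:0005515": 0.6}

combined = consensus_average([structure, interaction, motif])

# Rank terms from highest to lowest consensus score.
ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
for term, score in ranked:
    print(f"{term}\t{score:.2f}")
```

Treating a missing term as 0.0 penalizes predictions supported by only one source; averaging only over the sources that report a term would be an equally plausible reading of "consensus averaging".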
format Online
Article
Text
id pubmed-9403031
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9403031 2022-08-26; Genomics Proteomics Bioinformatics; Method; Elsevier 2021-12 2021-02-23; /pmc/articles/PMC9403031/ /pubmed/33631427 http://dx.doi.org/10.1016/j.gpb.2021.02.001; Text en; © 2021 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title QAUST: Protein Function Prediction Using Structure Similarity, Protein Interaction, and Functional Motifs
topic Method