Fast protein structure comparison through effective representation learning with contrastive graph neural networks
Protein structure alignment algorithms are often time-consuming, resulting in challenges for large-scale protein structure similarity-based retrieval. There is an urgent need for more efficient structure comparison approaches as the number of protein structures increases rapidly. In this paper, we p...
Main Authors: | Xia, Chunqiu; Feng, Shi-Hao; Xia, Ying; Pan, Xiaoyong; Shen, Hong-Bin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8982879/ https://www.ncbi.nlm.nih.gov/pubmed/35324898 http://dx.doi.org/10.1371/journal.pcbi.1009986 |
_version_ | 1784681877945188352 |
---|---|
author | Xia, Chunqiu Feng, Shi-Hao Xia, Ying Pan, Xiaoyong Shen, Hong-Bin |
author_facet | Xia, Chunqiu Feng, Shi-Hao Xia, Ying Pan, Xiaoyong Shen, Hong-Bin |
author_sort | Xia, Chunqiu |
collection | PubMed |
description | Protein structure alignment algorithms are often time-consuming, resulting in challenges for large-scale protein structure similarity-based retrieval. There is an urgent need for more efficient structure comparison approaches as the number of protein structures increases rapidly. In this paper, we propose an effective graph-based protein structure representation learning method, GraSR, for fast and accurate structure comparison. In GraSR, a graph is constructed based on the intra-residue distance derived from the tertiary structure. Then, deep graph neural networks (GNNs) with a short-cut connection learn graph representations of the tertiary structures under a contrastive learning framework. To further improve GraSR, a novel dynamic training data partition strategy and length-scaling cosine distance are introduced. We objectively evaluate our method GraSR on SCOPe v2.07 and a newly released independent test set from the PDB database with a designed comprehensive performance metric. Compared with other state-of-the-art methods, GraSR achieves about 7%-10% improvement on two benchmark datasets. GraSR is also much faster than alignment-based methods. We dig into the model and observe that the superiority of GraSR is mainly brought about by the learned discriminative residue-level and global descriptors. The web-server and source code of GraSR are freely available at www.csbio.sjtu.edu.cn/bioinf/GraSR/ for academic use. |
format | Online Article Text |
id | pubmed-8982879 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8982879 2022-04-06 Fast protein structure comparison through effective representation learning with contrastive graph neural networks Xia, Chunqiu Feng, Shi-Hao Xia, Ying Pan, Xiaoyong Shen, Hong-Bin PLoS Comput Biol Research Article Protein structure alignment algorithms are often time-consuming, resulting in challenges for large-scale protein structure similarity-based retrieval. There is an urgent need for more efficient structure comparison approaches as the number of protein structures increases rapidly. In this paper, we propose an effective graph-based protein structure representation learning method, GraSR, for fast and accurate structure comparison. In GraSR, a graph is constructed based on the intra-residue distance derived from the tertiary structure. Then, deep graph neural networks (GNNs) with a short-cut connection learn graph representations of the tertiary structures under a contrastive learning framework. To further improve GraSR, a novel dynamic training data partition strategy and length-scaling cosine distance are introduced. We objectively evaluate our method GraSR on SCOPe v2.07 and a newly released independent test set from the PDB database with a designed comprehensive performance metric. Compared with other state-of-the-art methods, GraSR achieves about 7%-10% improvement on two benchmark datasets. GraSR is also much faster than alignment-based methods. We dig into the model and observe that the superiority of GraSR is mainly brought about by the learned discriminative residue-level and global descriptors. The web-server and source code of GraSR are freely available at www.csbio.sjtu.edu.cn/bioinf/GraSR/ for academic use.
Public Library of Science 2022-03-24 /pmc/articles/PMC8982879/ /pubmed/35324898 http://dx.doi.org/10.1371/journal.pcbi.1009986 Text en © 2022 Xia et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Xia, Chunqiu Feng, Shi-Hao Xia, Ying Pan, Xiaoyong Shen, Hong-Bin Fast protein structure comparison through effective representation learning with contrastive graph neural networks |
title | Fast protein structure comparison through effective representation learning with contrastive graph neural networks |
title_full | Fast protein structure comparison through effective representation learning with contrastive graph neural networks |
title_fullStr | Fast protein structure comparison through effective representation learning with contrastive graph neural networks |
title_full_unstemmed | Fast protein structure comparison through effective representation learning with contrastive graph neural networks |
title_short | Fast protein structure comparison through effective representation learning with contrastive graph neural networks |
title_sort | fast protein structure comparison through effective representation learning with contrastive graph neural networks |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8982879/ https://www.ncbi.nlm.nih.gov/pubmed/35324898 http://dx.doi.org/10.1371/journal.pcbi.1009986 |
work_keys_str_mv | AT xiachunqiu fastproteinstructurecomparisonthrougheffectiverepresentationlearningwithcontrastivegraphneuralnetworks AT fengshihao fastproteinstructurecomparisonthrougheffectiverepresentationlearningwithcontrastivegraphneuralnetworks AT xiaying fastproteinstructurecomparisonthrougheffectiverepresentationlearningwithcontrastivegraphneuralnetworks AT panxiaoyong fastproteinstructurecomparisonthrougheffectiverepresentationlearningwithcontrastivegraphneuralnetworks AT shenhongbin fastproteinstructurecomparisonthrougheffectiverepresentationlearningwithcontrastivegraphneuralnetworks |
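
The description field in this record outlines a two-stage idea: build a graph from residue distances in the tertiary structure, then compare structures through learned fixed-length descriptors instead of pairwise alignment. The sketch below illustrates only those two ends of the pipeline and is not the GraSR implementation. Everything in it is an assumption made for illustration: the function names, the use of one 3D coordinate per residue, the 10 Å cutoff, and the plain cosine distance (the record mentions a length-scaling variant whose formula is not given here, so it is not reproduced). The learned GNN encoder is omitted; descriptors are taken as given vectors.

```python
import math


def contact_graph(coords, cutoff=10.0):
    """Build an adjacency list linking residues whose coordinates lie
    within `cutoff` angstroms of each other.

    `coords` is a list of (x, y, z) tuples, one per residue.
    The cutoff value is an illustrative assumption, not taken from the record.
    """
    n = len(coords)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def cosine_distance(u, v):
    """Plain cosine distance, 1 - cos(u, v), between two non-zero vectors.

    This is the standard definition, not the length-scaling variant
    named in the record.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def rank_by_similarity(query, database):
    """Return database keys ordered from most to least similar to `query`.

    `database` maps a structure name to its (hypothetical) descriptor vector.
    """
    return sorted(database, key=lambda name: cosine_distance(query, database[name]))
```

Because each structure is reduced to one vector, a query against N stored structures costs N vector comparisons rather than N structural alignments, which is the source of the speed advantage the description claims for descriptor-based retrieval.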