Fast protein structure comparison through effective representation learning with contrastive graph neural networks

Bibliographic Details
Main Authors: Xia, Chunqiu; Feng, Shi-Hao; Xia, Ying; Pan, Xiaoyong; Shen, Hong-Bin
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8982879/
https://www.ncbi.nlm.nih.gov/pubmed/35324898
http://dx.doi.org/10.1371/journal.pcbi.1009986
author Xia, Chunqiu
Feng, Shi-Hao
Xia, Ying
Pan, Xiaoyong
Shen, Hong-Bin
collection PubMed
description Protein structure alignment algorithms are often time-consuming, which makes large-scale similarity-based retrieval of protein structures challenging. There is an urgent need for more efficient structure comparison approaches as the number of protein structures increases rapidly. In this paper, we propose an effective graph-based protein structure representation learning method, GraSR, for fast and accurate structure comparison. In GraSR, a graph is constructed based on the intra-residue distance derived from the tertiary structure. Then, deep graph neural networks (GNNs) with a short-cut connection learn graph representations of the tertiary structures under a contrastive learning framework. To further improve GraSR, a novel dynamic training data partition strategy and a length-scaling cosine distance are introduced. We objectively evaluate GraSR on SCOPe v2.07 and a newly released independent test set from the PDB database with a purpose-designed comprehensive performance metric. Compared with other state-of-the-art methods, GraSR achieves about 7%-10% improvement on the two benchmark datasets. GraSR is also much faster than alignment-based methods. We dig into the model and observe that the superiority of GraSR stems mainly from the learned discriminative residue-level and global descriptors. The web server and source code of GraSR are freely available at www.csbio.sjtu.edu.cn/bioinf/GraSR/ for academic use.
format Online
Article
Text
id pubmed-8982879
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8982879 2022-04-06 Fast protein structure comparison through effective representation learning with contrastive graph neural networks Xia, Chunqiu Feng, Shi-Hao Xia, Ying Pan, Xiaoyong Shen, Hong-Bin PLoS Comput Biol Research Article
Public Library of Science 2022-03-24 /pmc/articles/PMC8982879/ /pubmed/35324898 http://dx.doi.org/10.1371/journal.pcbi.1009986 Text en © 2022 Xia et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Fast protein structure comparison through effective representation learning with contrastive graph neural networks
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8982879/
https://www.ncbi.nlm.nih.gov/pubmed/35324898
http://dx.doi.org/10.1371/journal.pcbi.1009986