Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks

The topology of protein folds can be specified by the inter-residue contact-maps and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural-networks...

Bibliographic Details
Main authors: Li, Yang; Zhang, Chengxin; Bell, Eric W.; Zheng, Wei; Zhou, Xiaogen; Yu, Dong-Jun; Zhang, Yang
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8026059/
https://www.ncbi.nlm.nih.gov/pubmed/33770072
http://dx.doi.org/10.1371/journal.pcbi.1008865
author Li, Yang
Zhang, Chengxin
Bell, Eric W.
Zheng, Wei
Zhou, Xiaogen
Yu, Dong-Jun
Zhang, Yang
author_sort Li, Yang
collection PubMed
description The topology of protein folds can be specified by the inter-residue contact-maps, and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural-networks. Compared to previous approaches, the major advantage of TripletRes is its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from the whole-genome and metagenome databases, and thereby minimize the information loss during contact model training. TripletRes was tested on a large set of 245 non-homologous proteins from CASP 11&12 and CAMEO experiments and outperformed other top methods from CASP12 by at least 58.4% for the CASP 11&12 targets and 44.4% for the CAMEO targets in the top-L long-range contact precision. On the 31 FM targets from the latest CASP13 challenge, TripletRes achieved the highest precision (71.6%) for the top-L/5 long-range contact predictions. It was also shown that a simple re-training of the TripletRes model with more proteins can lead to further improvement, with precisions comparable to state-of-the-art methods developed after CASP13. These results demonstrate a novel, efficient approach to extend the power of deep convolutional networks for high-accuracy medium- and long-range protein contact-map predictions starting from primary sequences, which are critical for constructing 3D structures of proteins that lack homologous templates in the PDB library.
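The top-L/5 long-range precision quoted in the description can be illustrated with a minimal sketch. It is not taken from the TripletRes code; it assumes the convention commonly used in CASP assessments, where a long-range pair has a sequence separation of at least 24 residues and the L/5 highest-scoring long-range pairs are checked against native contacts. The function name, parameters, and toy data are hypothetical.

```python
def top_k_long_range_precision(scores, contacts, k_divisor=5, min_separation=24):
    """Fraction of the top L/k_divisor long-range predictions that are true contacts.

    scores   : L x L nested list of predicted contact scores
    contacts : L x L nested list of booleans (True = native contact)
    Assumption: "long-range" means |i - j| >= min_separation (24 by convention).
    """
    length = len(scores)
    # Collect each residue pair once (i < j) with enough sequence separation.
    pairs = [
        (scores[i][j], i, j)
        for i in range(length)
        for j in range(i + min_separation, length)
    ]
    # Rank by predicted score, highest first.
    pairs.sort(key=lambda p: p[0], reverse=True)
    n_top = max(1, length // k_divisor)
    top = pairs[:n_top]
    if not top:
        return 0.0  # protein too short to have any long-range pair
    hits = sum(1 for _, i, j in top if contacts[i][j])
    return hits / len(top)


# Toy example (hypothetical data): a 30-residue protein, so L/5 = 6 pairs.
L = 30
toy_scores = [[0.0] * L for _ in range(L)]
toy_contacts = [[False] * L for _ in range(L)]

# Six confident long-range predictions, three of which are native contacts.
predicted = [(0, 24, 0.9), (0, 25, 0.8), (1, 26, 0.7),
             (2, 27, 0.6), (3, 28, 0.5), (4, 29, 0.4)]
for i, j, s in predicted:
    toy_scores[i][j] = s
for i, j in [(0, 24), (1, 26), (3, 28)]:
    toy_contacts[i][j] = True

# A high-scoring short-range pair is ignored by the long-range metric.
toy_scores[0][5] = 0.99
toy_contacts[0][5] = True

print(top_k_long_range_precision(toy_scores, toy_contacts))  # 0.5
```

Setting `k_divisor=1` gives the top-L variant of the same metric. Only pairs with i < j are ranked, so each residue pair is counted once.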
format Online
Article
Text
id pubmed-8026059
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8026059 2021-04-15 PLoS Comput Biol Research Article
Public Library of Science 2021-03-26 /pmc/articles/PMC8026059/ /pubmed/33770072 http://dx.doi.org/10.1371/journal.pcbi.1008865 Text en © 2021 Li et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks
topic Research Article