Template-based prediction of protein structure with deep learning
BACKGROUND: Accurate prediction of protein structure is fundamentally important to understand biological function of proteins. Template-based modeling, including protein threading and homology modeling, is a popular method for protein tertiary structure prediction. However, accurate template-query a...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7771081/ https://www.ncbi.nlm.nih.gov/pubmed/33372607 http://dx.doi.org/10.1186/s12864-020-07249-8 |
_version_ | 1783629643088134144 |
---|---|
author | Zhang, Haicang Shen, Yufeng |
author_facet | Zhang, Haicang Shen, Yufeng |
author_sort | Zhang, Haicang |
collection | PubMed |
description | BACKGROUND: Accurate prediction of protein structure is fundamentally important to understand biological function of proteins. Template-based modeling, including protein threading and homology modeling, is a popular method for protein tertiary structure prediction. However, accurate template-query alignment and template selection are still very challenging, especially for the proteins with only distant homologs available. RESULTS: We propose a new template-based modelling method called ThreaderAI to improve protein tertiary structure prediction. ThreaderAI formulates the task of aligning query sequence with template as the classical pixel classification problem in computer vision and naturally applies deep residual neural network in prediction. ThreaderAI first employs deep learning to predict residue-residue aligning probability matrix by integrating sequence profile, predicted sequential structural features, and predicted residue-residue contacts, and then builds template-query alignment by applying a dynamic programming algorithm on the probability matrix. We evaluated our methods both in generating accurate template-query alignment and protein threading. Experimental results show that ThreaderAI outperforms currently popular template-based modelling methods HHpred, CNFpred, and the latest contact-assisted method CEthreader, especially on the proteins that do not have close homologs with known structures. In particular, in terms of alignment accuracy measured with TM-score, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 56, 13, and 11%, respectively, on template-query pairs at the similarity of fold level from SCOPe data. And on CASP13’s TBM-hard data, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 16, 9 and 8% in terms of TM-score, respectively. CONCLUSIONS: These results demonstrate that with the help of deep learning, ThreaderAI can significantly improve the accuracy of template-based structure prediction, especially for distant-homology proteins. |
format | Online Article Text |
id | pubmed-7771081 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-77710812020-12-30 Template-based prediction of protein structure with deep learning Zhang, Haicang Shen, Yufeng BMC Genomics Methodology BACKGROUND: Accurate prediction of protein structure is fundamentally important to understand biological function of proteins. Template-based modeling, including protein threading and homology modeling, is a popular method for protein tertiary structure prediction. However, accurate template-query alignment and template selection are still very challenging, especially for the proteins with only distant homologs available. RESULTS: We propose a new template-based modelling method called ThreaderAI to improve protein tertiary structure prediction. ThreaderAI formulates the task of aligning query sequence with template as the classical pixel classification problem in computer vision and naturally applies deep residual neural network in prediction. ThreaderAI first employs deep learning to predict residue-residue aligning probability matrix by integrating sequence profile, predicted sequential structural features, and predicted residue-residue contacts, and then builds template-query alignment by applying a dynamic programming algorithm on the probability matrix. We evaluated our methods both in generating accurate template-query alignment and protein threading. Experimental results show that ThreaderAI outperforms currently popular template-based modelling methods HHpred, CNFpred, and the latest contact-assisted method CEthreader, especially on the proteins that do not have close homologs with known structures. In particular, in terms of alignment accuracy measured with TM-score, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 56, 13, and 11%, respectively, on template-query pairs at the similarity of fold level from SCOPe data. And on CASP13’s TBM-hard data, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 16, 9 and 8% in terms of TM-score, respectively. CONCLUSIONS: These results demonstrate that with the help of deep learning, ThreaderAI can significantly improve the accuracy of template-based structure prediction, especially for distant-homology proteins. BioMed Central 2020-12-29 /pmc/articles/PMC7771081/ /pubmed/33372607 http://dx.doi.org/10.1186/s12864-020-07249-8 Text en © The Author(s) 2020 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Methodology Zhang, Haicang Shen, Yufeng Template-based prediction of protein structure with deep learning |
title | Template-based prediction of protein structure with deep learning |
title_full | Template-based prediction of protein structure with deep learning |
title_fullStr | Template-based prediction of protein structure with deep learning |
title_full_unstemmed | Template-based prediction of protein structure with deep learning |
title_short | Template-based prediction of protein structure with deep learning |
title_sort | template-based prediction of protein structure with deep learning |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7771081/ https://www.ncbi.nlm.nih.gov/pubmed/33372607 http://dx.doi.org/10.1186/s12864-020-07249-8 |
work_keys_str_mv | AT zhanghaicang templatebasedpredictionofproteinstructurewithdeeplearning AT shenyufeng templatebasedpredictionofproteinstructurewithdeeplearning |