Toward the solution of the protein structure prediction problem

Since Anfinsen demonstrated in 1973 that the information encoded in a protein’s amino acid sequence determines its structure, solving the protein structure prediction problem has been the Holy Grail of structural biology. The goal of protein structure prediction approaches is to utilize computationa...

Bibliographic Details
Main authors: Pearce, Robin; Zhang, Yang
Format: Online Article Text
Language: English
Published: American Society for Biochemistry and Molecular Biology 2021
Subjects: JBC Reviews
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8254035/
https://www.ncbi.nlm.nih.gov/pubmed/34119522
http://dx.doi.org/10.1016/j.jbc.2021.100870
author Pearce, Robin
Zhang, Yang
collection PubMed
description Since Anfinsen demonstrated in 1973 that the information encoded in a protein’s amino acid sequence determines its structure, solving the protein structure prediction problem has been the Holy Grail of structural biology. The goal of protein structure prediction is to use computational modeling to determine the spatial location of every atom in a protein molecule starting from only its amino acid sequence. Depending on whether homologous structures can be found in the Protein Data Bank (PDB), structure prediction methods have historically been categorized as template-based modeling (TBM) or template-free modeling (FM) approaches. Until recently, TBM was the most reliable approach to predicting protein structures, and modeling accuracy declined sharply in the absence of reliable templates. Nevertheless, the results of the most recent community-wide Critical Assessment of protein Structure Prediction experiment (CASP14) demonstrated that the protein structure prediction problem can be largely solved through the use of end-to-end deep machine learning techniques, where correct folds could be built for nearly all single-domain proteins without using PDB templates. Critically, model quality exhibited little correlation with either the quality of available template structures or the number of sequence homologs detected for a given target protein. Thus, the implementation of deep-learning techniques has essentially broken through the 50-year-old modeling border between TBM and FM approaches and has made the success of high-resolution structure prediction significantly less dependent on template availability in the PDB library.
format Online
Article
Text
id pubmed-8254035
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Society for Biochemistry and Molecular Biology
record_format MEDLINE/PubMed
spelling pubmed-8254035 2021-07-12 Toward the solution of the protein structure prediction problem Pearce, Robin Zhang, Yang J Biol Chem JBC Reviews American Society for Biochemistry and Molecular Biology 2021-06-11 /pmc/articles/PMC8254035/ /pubmed/34119522 http://dx.doi.org/10.1016/j.jbc.2021.100870 Text en © 2021 The Authors. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Toward the solution of the protein structure prediction problem
topic JBC Reviews
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8254035/
https://www.ncbi.nlm.nih.gov/pubmed/34119522
http://dx.doi.org/10.1016/j.jbc.2021.100870