
The whole is greater than its parts: ensembling improves protein contact prediction

The prediction of amino acid contacts from protein sequence is an important problem, as protein contacts are a vital step towards the prediction of folded protein structures. We propose that a powerful concept from deep learning, called ensembling, can increase the accuracy of protein contact predic...


Bibliographic Details
Main authors: Billings, Wendy M., Morris, Connor J., Della Corte, Dennis
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8044223/
https://www.ncbi.nlm.nih.gov/pubmed/33850214
http://dx.doi.org/10.1038/s41598-021-87524-0
collection PubMed
description The prediction of amino acid contacts from protein sequence is an important problem, as protein contacts are a vital step towards the prediction of folded protein structures. We propose that a powerful concept from deep learning, called ensembling, can increase the accuracy of protein contact predictions by combining the outputs of different neural network models. We show that ensembling the predictions made by different groups at the recent Critical Assessment of Protein Structure Prediction (CASP13) outperforms all individual groups. Further, we show that contacts derived from the distance predictions of three additional deep neural networks—AlphaFold, trRosetta, and ProSPr—can be substantially improved by ensembling all three networks. We also show that ensembling these recent deep neural networks with the best CASP13 group creates a superior contact prediction tool. Finally, we demonstrate that two ensembled networks can successfully differentiate between the folds of two highly homologous sequences. In order to build further on these findings, we propose the creation of a better protein contact benchmark set and additional open-source contact prediction methods.
format Online
Article
Text
id pubmed-8044223
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8044223 2021-04-14 Sci Rep Article Nature Publishing Group UK 2021-04-13 /pmc/articles/PMC8044223/ /pubmed/33850214 http://dx.doi.org/10.1038/s41598-021-87524-0 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.