The whole is greater than its parts: ensembling improves protein contact prediction
The prediction of amino acid contacts from protein sequence is an important problem, as protein contacts are a vital step towards the prediction of folded protein structures. We propose that a powerful concept from deep learning, called ensembling, can increase the accuracy of protein contact predic...
Main Authors: | Billings, Wendy M.; Morris, Connor J.; Della Corte, Dennis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8044223/ https://www.ncbi.nlm.nih.gov/pubmed/33850214 http://dx.doi.org/10.1038/s41598-021-87524-0 |
_version_ | 1783678442131161088 |
---|---|
author | Billings, Wendy M. Morris, Connor J. Della Corte, Dennis |
author_facet | Billings, Wendy M. Morris, Connor J. Della Corte, Dennis |
author_sort | Billings, Wendy M. |
collection | PubMed |
description | The prediction of amino acid contacts from protein sequence is an important problem, as protein contacts are a vital step towards the prediction of folded protein structures. We propose that a powerful concept from deep learning, called ensembling, can increase the accuracy of protein contact predictions by combining the outputs of different neural network models. We show that ensembling the predictions made by different groups at the recent Critical Assessment of Protein Structure Prediction (CASP13) outperforms all individual groups. Further, we show that contacts derived from the distance predictions of three additional deep neural networks—AlphaFold, trRosetta, and ProSPr—can be substantially improved by ensembling all three networks. We also show that ensembling these recent deep neural networks with the best CASP13 group creates a superior contact prediction tool. Finally, we demonstrate that two ensembled networks can successfully differentiate between the folds of two highly homologous sequences. In order to build further on these findings, we propose the creation of a better protein contact benchmark set and additional open-source contact prediction methods. |
format | Online Article Text |
id | pubmed-8044223 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-80442232021-04-14 The whole is greater than its parts: ensembling improves protein contact prediction Billings, Wendy M. Morris, Connor J. Della Corte, Dennis Sci Rep Article The prediction of amino acid contacts from protein sequence is an important problem, as protein contacts are a vital step towards the prediction of folded protein structures. We propose that a powerful concept from deep learning, called ensembling, can increase the accuracy of protein contact predictions by combining the outputs of different neural network models. We show that ensembling the predictions made by different groups at the recent Critical Assessment of Protein Structure Prediction (CASP13) outperforms all individual groups. Further, we show that contacts derived from the distance predictions of three additional deep neural networks—AlphaFold, trRosetta, and ProSPr—can be substantially improved by ensembling all three networks. We also show that ensembling these recent deep neural networks with the best CASP13 group creates a superior contact prediction tool. Finally, we demonstrate that two ensembled networks can successfully differentiate between the folds of two highly homologous sequences. In order to build further on these findings, we propose the creation of a better protein contact benchmark set and additional open-source contact prediction methods. Nature Publishing Group UK 2021-04-13 /pmc/articles/PMC8044223/ /pubmed/33850214 http://dx.doi.org/10.1038/s41598-021-87524-0 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Billings, Wendy M. Morris, Connor J. Della Corte, Dennis The whole is greater than its parts: ensembling improves protein contact prediction |
title | The whole is greater than its parts: ensembling improves protein contact prediction |
title_full | The whole is greater than its parts: ensembling improves protein contact prediction |
title_fullStr | The whole is greater than its parts: ensembling improves protein contact prediction |
title_full_unstemmed | The whole is greater than its parts: ensembling improves protein contact prediction |
title_short | The whole is greater than its parts: ensembling improves protein contact prediction |
title_sort | whole is greater than its parts: ensembling improves protein contact prediction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8044223/ https://www.ncbi.nlm.nih.gov/pubmed/33850214 http://dx.doi.org/10.1038/s41598-021-87524-0 |
work_keys_str_mv | AT billingswendym thewholeisgreaterthanitspartsensemblingimprovesproteincontactprediction AT morrisconnorj thewholeisgreaterthanitspartsensemblingimprovesproteincontactprediction AT dellacortedennis thewholeisgreaterthanitspartsensemblingimprovesproteincontactprediction AT billingswendym wholeisgreaterthanitspartsensemblingimprovesproteincontactprediction AT morrisconnorj wholeisgreaterthanitspartsensemblingimprovesproteincontactprediction AT dellacortedennis wholeisgreaterthanitspartsensemblingimprovesproteincontactprediction |
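
The description in this record says that ensembling combines the outputs of different neural network models, and that contacts can be derived from distance predictions. The record does not state how the authors combine the outputs, so the following is only a minimal sketch under stated assumptions: each model yields a square matrix of contact probabilities, the ensemble is a plain unweighted average, and a contact is taken to mean a predicted distance of at most 8 Å (the conventional CASP contact definition). The function names, the distance bin edges, and the toy numbers are all hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def contacts_from_distogram(
    distogram: Sequence[Sequence[Sequence[float]]],
    bin_upper_edges: Sequence[float],
    cutoff: float = 8.0,
) -> Matrix:
    """Turn per-pair distance-bin probabilities into contact probabilities.

    Assumption: the contact probability for a residue pair is the total
    probability of all bins whose upper edge is at or below the cutoff.
    """
    return [
        [
            sum(p for p, edge in zip(bins, bin_upper_edges) if edge <= cutoff)
            for bins in row
        ]
        for row in distogram
    ]


def ensemble_mean(maps: Sequence[Matrix]) -> Matrix:
    """Average several square contact maps element by element.

    Assumption: unweighted averaging; the source does not confirm that
    this is the combination rule used in the article.
    """
    if not maps:
        raise ValueError("need at least one contact map")
    size = len(maps[0])
    for m in maps:
        if len(m) != size or any(len(row) != size for row in m):
            raise ValueError("all contact maps must be square and equal in size")
    n = len(maps)
    return [
        [sum(m[i][j] for m in maps) / n for j in range(size)]
        for i in range(size)
    ]


# Toy two-residue example with made-up numbers.
model_a: Matrix = [[0.0, 0.25], [0.25, 0.0]]

# Hypothetical distogram with three bins ending at 4, 8, and 12 angstroms.
edges = [4.0, 8.0, 12.0]
distogram_b = [
    [[0.0, 0.0, 1.0], [0.25, 0.5, 0.25]],
    [[0.25, 0.5, 0.25], [0.0, 0.0, 1.0]],
]
model_b = contacts_from_distogram(distogram_b, edges)

combined = ensemble_mean([model_a, model_b])
```

Averaging is the simplest combination rule; a weighted average or a vote over thresholded contacts would fit the same interface and may behave differently on real predictions.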