Accurate prediction of chemical shifts for aqueous protein structure on “Real World” data

Here we report a new machine learning algorithm for protein chemical shift prediction that outperforms existing chemical shift calculators on realistic data that is not heavily curated, nor eliminates test predictions ad hoc. Our UCBShift predictor implements two modules: a transfer prediction modul...

Bibliographic Details
Main authors: Li, Jie; Bennett, Kochise C.; Liu, Yuchen; Martin, Michael V.; Head-Gordon, Teresa
Format: Online Article Text
Language: English
Published: The Royal Society of Chemistry 2020
Subjects: Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8152569/
https://www.ncbi.nlm.nih.gov/pubmed/34122823
http://dx.doi.org/10.1039/c9sc06561j
_version_ 1783698625089503232
author Li, Jie
Bennett, Kochise C.
Liu, Yuchen
Martin, Michael V.
Head-Gordon, Teresa
author_sort Li, Jie
collection PubMed
description Here we report a new machine learning algorithm for protein chemical shift prediction that outperforms existing chemical shift calculators on realistic data that is neither heavily curated nor pruned of test predictions ad hoc. Our UCBShift predictor implements two modules: a transfer prediction module that employs both sequence and structural alignment to select reference candidates for experimental chemical shift replication, and a redesigned machine learning module based on random forest regression that draws on a larger and more carefully curated set of extracted features. When combined, the two modules achieve state-of-the-art accuracy for predicting chemical shifts on a randomly selected dataset without careful curation, with root-mean-square errors of 0.31 ppm for amide hydrogens, 0.19 ppm for Hα, 0.84 ppm for C′, 0.81 ppm for Cα, 1.00 ppm for Cβ, and 1.81 ppm for N. When similar sequences or structurally related proteins are available, UCBShift shows superior native state selection from misfolded decoy sets compared to SPARTA+ and SHIFTX2, and even without homology it exceeds the current prediction accuracy of all other popular chemical shift predictors.
format Online
Article
Text
id pubmed-8152569
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher The Royal Society of Chemistry
record_format MEDLINE/PubMed
spelling pubmed-81525692021-06-11 Accurate prediction of chemical shifts for aqueous protein structure on “Real World” data Li, Jie Bennett, Kochise C. Liu, Yuchen Martin, Michael V. Head-Gordon, Teresa Chem Sci Chemistry Here we report a new machine learning algorithm for protein chemical shift prediction that outperforms existing chemical shift calculators on realistic data that is not heavily curated, nor eliminates test predictions ad hoc. Our UCBShift predictor implements two modules: a transfer prediction module that employs both sequence and structural alignment to select reference candidates for experimental chemical shift replication, and a redesigned machine learning module based on random forest regression which utilizes more, and more carefully curated, feature extracted data. When combined together, this new predictor achieves state-of-the-art accuracy for predicting chemical shifts on a randomly selected dataset without careful curation, with root-mean-square errors of 0.31 ppm for amide hydrogens, 0.19 ppm for Hα, 0.84 ppm for C′, 0.81 ppm for Cα, 1.00 ppm for Cβ, and 1.81 ppm for N. When similar sequences or structurally related proteins are available, UCBShift shows superior native state selection from misfolded decoy sets compared to SPARTA+ and SHIFTX2, and even without homology we exceed current prediction accuracy of all other popular chemical shift predictors. The Royal Society of Chemistry 2020-03-03 /pmc/articles/PMC8152569/ /pubmed/34122823 http://dx.doi.org/10.1039/c9sc06561j Text en This journal is © The Royal Society of Chemistry https://creativecommons.org/licenses/by-nc/3.0/
title Accurate prediction of chemical shifts for aqueous protein structure on “Real World” data
topic Chemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8152569/
https://www.ncbi.nlm.nih.gov/pubmed/34122823
http://dx.doi.org/10.1039/c9sc06561j