Evaluation of protein descriptors in computer-aided rational protein engineering tasks and its application in property prediction in SARS-CoV-2 spike glycoprotein

The importance of protein engineering in the research and development of biopharmaceuticals and biomaterials has increased. Machine learning in computer-aided protein engineering can markedly reduce the experimental effort in identifying optimal sequences that satisfy the desired properties from a l...

Bibliographic Details
Main Authors: Lim, Hocheol; Jeon, Hyeon-Nae; Lim, Seungcheol; Jang, Yuil; Kim, Taehee; Cho, Hyein; Pan, Jae-Gu; No, Kyoung Tai
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8841378/
https://www.ncbi.nlm.nih.gov/pubmed/35222841
http://dx.doi.org/10.1016/j.csbj.2022.01.027
_version_ 1784650824647966720
author Lim, Hocheol
Jeon, Hyeon-Nae
Lim, Seungcheol
Jang, Yuil
Kim, Taehee
Cho, Hyein
Pan, Jae-Gu
No, Kyoung Tai
author_facet Lim, Hocheol
Jeon, Hyeon-Nae
Lim, Seungcheol
Jang, Yuil
Kim, Taehee
Cho, Hyein
Pan, Jae-Gu
No, Kyoung Tai
author_sort Lim, Hocheol
collection PubMed
description The importance of protein engineering in the research and development of biopharmaceuticals and biomaterials has increased. Machine learning in computer-aided protein engineering can markedly reduce the experimental effort in identifying optimal sequences that satisfy the desired properties from a large number of possible protein sequences. To develop general protein descriptors for computer-aided protein engineering tasks, we devised new protein descriptors, one sequence-based descriptor (PCgrades), and three structure-based descriptors (PCspairs, 3D-SPIEs_5.4 Å, and 3D-SPIEs_8Å). While the PCgrades and PCspairs include general and statistical information in physicochemical properties in single and pairwise amino acids respectively, the 3D-SPIEs include specific and quantum–mechanical information with parameterized quantum mechanical calculations (FMO2-DFTB3/D/PCM). To evaluate the protein descriptors, we made prediction models with the new descriptors and previously developed descriptors for diverse protein datasets including protein expression and binding affinity change in SARS-CoV-2 spike glycoprotein. As a result, the newly devised descriptors showed a good performance in diverse datasets, in which the PCspairs showed the best performance ([Formula: see text] for protein expression and [Formula: see text] for binding affinity). Similar approaches with those descriptors would be promising and useful if the prediction models are trained with sufficient quantitative experimental data from high-throughput assays for industrial enzymes or protein drugs.
format Online
Article
Text
id pubmed-8841378
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-88413782022-02-25 Evaluation of protein descriptors in computer-aided rational protein engineering tasks and its application in property prediction in SARS-CoV-2 spike glycoprotein Lim, Hocheol Jeon, Hyeon-Nae Lim, Seungcheol Jang, Yuil Kim, Taehee Cho, Hyein Pan, Jae-Gu No, Kyoung Tai Comput Struct Biotechnol J Research Article The importance of protein engineering in the research and development of biopharmaceuticals and biomaterials has increased. Machine learning in computer-aided protein engineering can markedly reduce the experimental effort in identifying optimal sequences that satisfy the desired properties from a large number of possible protein sequences. To develop general protein descriptors for computer-aided protein engineering tasks, we devised new protein descriptors, one sequence-based descriptor (PCgrades), and three structure-based descriptors (PCspairs, 3D-SPIEs_5.4 Å, and 3D-SPIEs_8Å). While the PCgrades and PCspairs include general and statistical information in physicochemical properties in single and pairwise amino acids respectively, the 3D-SPIEs include specific and quantum–mechanical information with parameterized quantum mechanical calculations (FMO2-DFTB3/D/PCM). To evaluate the protein descriptors, we made prediction models with the new descriptors and previously developed descriptors for diverse protein datasets including protein expression and binding affinity change in SARS-CoV-2 spike glycoprotein. As a result, the newly devised descriptors showed a good performance in diverse datasets, in which the PCspairs showed the best performance ([Formula: see text] for protein expression and [Formula: see text] for binding affinity).
Similar approaches with those descriptors would be promising and useful if the prediction models are trained with sufficient quantitative experimental data from high-throughput assays for industrial enzymes or protein drugs. Research Network of Computational and Structural Biotechnology 2022-01-31 /pmc/articles/PMC8841378/ /pubmed/35222841 http://dx.doi.org/10.1016/j.csbj.2022.01.027 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Research Article
Lim, Hocheol
Jeon, Hyeon-Nae
Lim, Seungcheol
Jang, Yuil
Kim, Taehee
Cho, Hyein
Pan, Jae-Gu
No, Kyoung Tai
Evaluation of protein descriptors in computer-aided rational protein engineering tasks and its application in property prediction in SARS-CoV-2 spike glycoprotein
title Evaluation of protein descriptors in computer-aided rational protein engineering tasks and its application in property prediction in SARS-CoV-2 spike glycoprotein
title_full Evaluation of protein descriptors in computer-aided rational protein engineering tasks and its application in property prediction in SARS-CoV-2 spike glycoprotein
title_fullStr Evaluation of protein descriptors in computer-aided rational protein engineering tasks and its application in property prediction in SARS-CoV-2 spike glycoprotein
title_full_unstemmed Evaluation of protein descriptors in computer-aided rational protein engineering tasks and its application in property prediction in SARS-CoV-2 spike glycoprotein
title_short Evaluation of protein descriptors in computer-aided rational protein engineering tasks and its application in property prediction in SARS-CoV-2 spike glycoprotein
title_sort evaluation of protein descriptors in computer-aided rational protein engineering tasks and its application in property prediction in sars-cov-2 spike glycoprotein
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8841378/
https://www.ncbi.nlm.nih.gov/pubmed/35222841
http://dx.doi.org/10.1016/j.csbj.2022.01.027