Toward the Prediction of Multi-Spin State Charges of a Heme Model by Random Forest Regression

The random forest regression (RFR) model was introduced to predict the multiple spin state charges of a heme model, which is important for the molecular dynamics simulation of the spin crossover phenomenon. In this work, a multiple spin state structure data set with 39,368 structures of the simplifie...

Bibliographic Details
Main Authors: Zhao, Wei; Li, Qing; Huang, Xian-Hui; Bie, Li-Hua; Gao, Jun
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Chemistry
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7136535/
https://www.ncbi.nlm.nih.gov/pubmed/32296675
http://dx.doi.org/10.3389/fchem.2020.00162
author Zhao, Wei
Li, Qing
Huang, Xian-Hui
Bie, Li-Hua
Gao, Jun
collection PubMed
description A random forest regression (RFR) model was introduced to predict the atomic charges of a heme model in multiple spin states, which is important for molecular dynamics simulation of the spin crossover phenomenon. In this work, a multiple spin state structure data set with 39,368 structures of the simplified heme–oxygen binding model was built from non-adiabatic dynamics simulation trajectories. The ESP charge of each atom was calculated and used as the real-valued response. The conformational adapted charge (CAC) model of three spin states was constructed by an RFR model using symmetry functions. The results show that the RFR model can effectively predict on-the-fly atomic charges across varying conformations, as well as the atomic charges of different spin states in the same conformation, thus balancing accuracy and efficiency. The average mean absolute error of the predicted charges of each spin state is <0.02 e. Comparison studies on descriptors showed a maximum improvement of 0.06 e in the prediction of the charge of Fe²⁺ when 11 manually selected structural parameters were used. We hope that this model can not only provide variable parameters for developing a multi-spin state force field but also facilitate automation, thus enabling large-scale simulations of atomistic systems.
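The abstract names symmetry functions as the structural descriptor and the mean absolute error (MAE) as the accuracy measure, but this record does not give their exact form. The sketch below is therefore only an illustration under stated assumptions: it uses a Behler-Parrinello-style radial symmetry function with a cosine cutoff, and the function names, the eta values, and the 6.0 cutoff radius are hypothetical choices, not the authors' settings. Feature vectors of this kind, one per atom, are what a regression model such as a random forest would take as input, with the ESP charge as the target.

```python
import math


def cutoff(r, r_c):
    """Cosine cutoff: 1 at r = 0, smoothly decaying to 0 for r >= r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def radial_symmetry_functions(coords, center, etas, r_s=0.0, r_c=6.0):
    """Return one radial descriptor value per eta for the atom at index `center`.

    Assumed form (not taken from the record):
        G(eta) = sum over neighbours j of exp(-eta * (r_ij - r_s)^2) * cutoff(r_ij)
    The value depends only on interatomic distances, so it does not change
    when the whole structure is translated or rotated.
    """
    features = []
    for eta in etas:
        total = 0.0
        for j, pos in enumerate(coords):
            if j == center:
                continue  # an atom is not its own neighbour
            r = math.dist(coords[center], pos)
            total += math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_c)
        features.append(total)
    return features


def mean_absolute_error(reference, predicted):
    """Average absolute difference between reference and predicted charges (in e)."""
    return sum(abs(a - b) for a, b in zip(reference, predicted)) / len(reference)


# Toy geometry (hypothetical coordinates, not a heme structure).
toy_coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
toy_features = radial_symmetry_functions(toy_coords, center=0, etas=[0.5, 1.0])
```

Because the descriptor is built from distances alone, the same conformation yields the same feature vector regardless of how it is placed in space, which is the property that makes such functions usable as regression inputs along a trajectory.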
format Online
Article
Text
id pubmed-7136535
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7136535 2020-04-15 Toward the Prediction of Multi-Spin State Charges of a Heme Model by Random Forest Regression Zhao, Wei; Li, Qing; Huang, Xian-Hui; Bie, Li-Hua; Gao, Jun Front Chem Chemistry Frontiers Media S.A. 2020-03-31 /pmc/articles/PMC7136535/ /pubmed/32296675 http://dx.doi.org/10.3389/fchem.2020.00162 Text en Copyright © 2020 Zhao, Li, Huang, Bie and Gao. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Toward the Prediction of Multi-Spin State Charges of a Heme Model by Random Forest Regression
topic Chemistry