
Machine-learning model selection and parameter estimation from kinetic data of complex first-order reaction systems

Dealing with a system of first-order reactions is a recurrent issue in chemometrics, especially in the analysis of data obtained by spectroscopic methods applied on complex biological systems. We argue that global multiexponential fitting, the still common way to solve such problems, has serious wea...


Bibliographic Details
Main authors:	Zimányi, László; Sipos, Áron; Sarlós, Ferenc; Nagypál, Rita; Groma, Géza I.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2021
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8352076/ https://www.ncbi.nlm.nih.gov/pubmed/34370771 http://dx.doi.org/10.1371/journal.pone.0255675

author	Zimányi, László; Sipos, Áron; Sarlós, Ferenc; Nagypál, Rita; Groma, Géza I.
collection	PubMed
description	Dealing with a system of first-order reactions is a recurrent issue in chemometrics, especially in the analysis of data obtained by spectroscopic methods applied on complex biological systems. We argue that global multiexponential fitting, the still common way to solve such problems, has serious weaknesses compared to contemporary methods of sparse modeling. Combining the advantages of group lasso and elastic net—the statistical methods proven to be very powerful in other areas—we created an optimization problem tunable from very sparse to very dense distribution over a large pre-defined grid of time constants, fitting both simulated and experimental multiwavelength spectroscopic data with high computational efficiency. We found that the optimal values of the tuning hyperparameters can be selected by a machine-learning algorithm based on a Bayesian optimization procedure, utilizing widely used or novel versions of cross-validation. The derived algorithm accurately recovered the true sparse kinetic parameters of an extremely complex simulated model of the bacteriorhodopsin photocycle, as well as the wide peak of hypothetical distributed kinetics in the presence of different noise levels. It also performed well in the analysis of the ultrafast experimental fluorescence kinetics data detected on the coenzyme FAD in a very wide logarithmic time window. We conclude that the primary application of the presented algorithms—implemented in available software—covers a wide area of studies on light-induced physical, chemical, and biological processes carried out with different spectroscopic methods. The demand for this kind of analysis is expected to soar due to the emerging ultrafast multidimensional infrared and electronic spectroscopic techniques that provide very large and complex datasets. In addition, simulations based on our methods could help in designing the technical parameters of future experiments for the verification of particular hypothetical models.
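The optimization problem outlined in the description above can be illustrated with a minimal sketch. It is not the authors' software: the function names `exp_design` and `group_enet_fit`, the penalty parameterization (`lam`, `alpha`), the proximal-gradient solver, and the synthetic data are all assumptions made for illustration. The hyperparameters are fixed by hand here, whereas the description states that they are selected by Bayesian optimization with cross-validation.

```python
import numpy as np

def exp_design(t, taus):
    """Design matrix whose columns are exp(-t/tau), one per grid time constant."""
    return np.exp(-np.outer(t, 1.0 / taus))

def group_enet_fit(D, Y, lam, alpha, n_iter=2000):
    """Proximal-gradient sketch (assumed formulation) minimizing
        0.5*||Y - D X||_F^2
        + lam*(alpha*sum_i ||X[i, :]||_2 + 0.5*(1 - alpha)*||X||_F^2).
    Row i of X holds the amplitudes of time constant i at all wavelengths,
    so the group penalty switches whole time constants on or off."""
    ridge = lam * (1.0 - alpha)
    # Step size = 1 / Lipschitz constant of the smooth part of the objective.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + ridge)
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = D.T @ (D @ X - Y) + ridge * X
        Z = X - step * grad
        # Group soft-thresholding: shrink each row by its Euclidean norm.
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam * alpha / np.maximum(norms, 1e-12))
        X = shrink * Z
    return X

# Synthetic two-component, three-wavelength data (illustrative values only).
rng = np.random.default_rng(0)
t = np.logspace(-2, 2, 200)        # logarithmic time axis
taus = np.logspace(-2, 2, 40)      # pre-defined grid of time constants
D = exp_design(t, taus)
true_amp = np.array([[1.0, 0.5, -0.3], [-0.4, 0.8, 0.6]])
Y = exp_design(t, np.array([0.1, 10.0])) @ true_amp
Y += 0.01 * rng.standard_normal(Y.shape)

X = group_enet_fit(D, Y, lam=0.05, alpha=0.9)
active = np.flatnonzero(np.linalg.norm(X, axis=1) > 0)
print("time constants with nonzero amplitude:", taus[active])
```

In this parameterization, `alpha` close to 1 favors a sparse set of time constants (group lasso limit), while `alpha` close to 0 favors a dense, smooth distribution (ridge limit), which mirrors the sparse-to-dense tunability mentioned in the description.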
format	Online Article Text
id	pubmed-8352076
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-8352076 2021-08-10 Machine-learning model selection and parameter estimation from kinetic data of complex first-order reaction systems Zimányi, László; Sipos, Áron; Sarlós, Ferenc; Nagypál, Rita; Groma, Géza I. PLoS One Research Article Public Library of Science 2021-08-09 /pmc/articles/PMC8352076/ /pubmed/34370771 http://dx.doi.org/10.1371/journal.pone.0255675 Text en © 2021 Zimányi et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Machine-learning model selection and parameter estimation from kinetic data of complex first-order reaction systems
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8352076/ https://www.ncbi.nlm.nih.gov/pubmed/34370771 http://dx.doi.org/10.1371/journal.pone.0255675


Similar items