
Machine-learning model selection and parameter estimation from kinetic data of complex first-order reaction systems

Dealing with a system of first-order reactions is a recurrent issue in chemometrics, especially in the analysis of data obtained by spectroscopic methods applied on complex biological systems. We argue that global multiexponential fitting, the still common way to solve such problems, has serious wea...


Bibliographic Details
Main authors: Zimányi, László; Sipos, Áron; Sarlós, Ferenc; Nagypál, Rita; Groma, Géza I.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8352076/
https://www.ncbi.nlm.nih.gov/pubmed/34370771
http://dx.doi.org/10.1371/journal.pone.0255675
_version_ 1783736105702522880
author Zimányi, László
Sipos, Áron
Sarlós, Ferenc
Nagypál, Rita
Groma, Géza I.
collection PubMed
description Dealing with a system of first-order reactions is a recurrent issue in chemometrics, especially in the analysis of data obtained by spectroscopic methods applied on complex biological systems. We argue that global multiexponential fitting, the still common way to solve such problems, has serious weaknesses compared to contemporary methods of sparse modeling. Combining the advantages of group lasso and elastic net—the statistical methods proven to be very powerful in other areas—we created an optimization problem tunable from very sparse to very dense distribution over a large pre-defined grid of time constants, fitting both simulated and experimental multiwavelength spectroscopic data with high computational efficiency. We found that the optimal values of the tuning hyperparameters can be selected by a machine-learning algorithm based on a Bayesian optimization procedure, utilizing widely used or novel versions of cross-validation. The derived algorithm accurately recovered the true sparse kinetic parameters of an extremely complex simulated model of the bacteriorhodopsin photocycle, as well as the wide peak of hypothetical distributed kinetics in the presence of different noise levels. It also performed well in the analysis of the ultrafast experimental fluorescence kinetics data detected on the coenzyme FAD in a very wide logarithmic time window. We conclude that the primary application of the presented algorithms—implemented in available software—covers a wide area of studies on light-induced physical, chemical, and biological processes carried out with different spectroscopic methods. The demand for this kind of analysis is expected to soar due to the emerging ultrafast multidimensional infrared and electronic spectroscopic techniques that provide very large and complex datasets. In addition, simulations based on our methods could help in designing the technical parameters of future experiments for the verification of particular hypothetical models.
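The optimization problem outlined in the description can be illustrated with a small sketch. This is not the authors' software: the objective form (a least-squares fit plus a group-lasso term over time constants and a ridge term, mixed by `alpha` and scaled by `lam`), the proximal-gradient solver, the grid sizes, and every function and variable name are assumptions chosen for illustration. The Bayesian optimization and cross-validation used to choose the hyperparameters are not shown.

```python
import math

def design_matrix(times, taus):
    # Column k holds the decay exp(-t / tau_k) sampled at every time point.
    return [[math.exp(-t / tau) for tau in taus] for t in times]

def matmul(A, B):
    # Plain nested-list matrix product, enough for this small example.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def group_soft_threshold(row, thresh):
    # Shrink one group (one time constant, all wavelengths) toward zero.
    norm = math.sqrt(sum(v * v for v in row))
    if norm <= thresh:
        return [0.0] * len(row)
    return [(1.0 - thresh / norm) * v for v in row]

def objective(E, D, X, lam, alpha):
    # Assumed objective: 0.5*||E X - D||^2
    #   + lam * (alpha * sum of row norms + 0.5 * (1 - alpha) * ||X||^2)
    fit = 0.5 * sum((p - d) ** 2
                    for prow, drow in zip(matmul(E, X), D)
                    for p, d in zip(prow, drow))
    group = sum(math.sqrt(sum(v * v for v in row)) for row in X)
    ridge = 0.5 * sum(v * v for row in X for v in row)
    return fit + lam * (alpha * group + (1.0 - alpha) * ridge)

def fit_group_elastic_net(E, D, lam, alpha, n_iter=500):
    # Proximal gradient descent started from X = 0. The step uses the squared
    # Frobenius norm of E as a conservative bound on the Lipschitz constant.
    Et = [list(col) for col in zip(*E)]
    step = 1.0 / (sum(v * v for row in E for v in row) + lam * (1.0 - alpha))
    X = [[0.0] * len(D[0]) for _ in range(len(Et))]
    for _ in range(n_iter):
        R = [[p - d for p, d in zip(prow, drow)]
             for prow, drow in zip(matmul(E, X), D)]
        G = matmul(Et, R)  # gradient of the least-squares term
        X = [group_soft_threshold(
                 [x - step * (g + lam * (1.0 - alpha) * x)
                  for x, g in zip(xrow, grow)],
                 step * lam * alpha)
             for xrow, grow in zip(X, G)]
    return X

# Hypothetical noise-free data: two decay components seen at two wavelengths.
times = [10 ** (-2 + 4 * i / 29) for i in range(30)]  # log-spaced time points
taus = [10 ** (-2 + 4 * k / 11) for k in range(12)]   # pre-defined grid
E = design_matrix(times, taus)
X_true = [[0.0, 0.0] for _ in taus]
X_true[3] = [1.0, -0.5]
X_true[8] = [0.3, 0.8]
D = matmul(E, X_true)
X_zero = [[0.0, 0.0] for _ in taus]

# A very large penalty removes every time constant; a small one keeps some.
X_sparse = fit_group_elastic_net(E, D, lam=1e6, alpha=0.9)
X_fit = fit_group_elastic_net(E, D, lam=1e-3, alpha=0.5)
```

In this sketch `lam` sets the overall penalty strength and `alpha` moves the penalty between the group-lasso term, which removes whole time constants, and the ridge term, which spreads amplitude over neighbouring ones. That is one plausible reading of "tunable from very sparse to very dense".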
format Online
Article
Text
id pubmed-8352076
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8352076 2021-08-10 Machine-learning model selection and parameter estimation from kinetic data of complex first-order reaction systems. Zimányi, László; Sipos, Áron; Sarlós, Ferenc; Nagypál, Rita; Groma, Géza I. PLoS One, Research Article. Public Library of Science, 2021-08-09. /pmc/articles/PMC8352076/ /pubmed/34370771 http://dx.doi.org/10.1371/journal.pone.0255675 Text, en. © 2021 Zimányi et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Machine-learning model selection and parameter estimation from kinetic data of complex first-order reaction systems
topic Research Article