Systematic selection of chemical fingerprint features improves the Gibbs energy prediction of biochemical reactions

Bibliographic Details
Main Authors: Alazmi, Meshari; Kuwahara, Hiroyuki; Soufan, Othman; Ding, Lizhong; Gao, Xin
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6662295/
https://www.ncbi.nlm.nih.gov/pubmed/30590445
http://dx.doi.org/10.1093/bioinformatics/bty1035
Description
Summary: MOTIVATION: Accurate and wide-ranging prediction of thermodynamic parameters for biochemical reactions can facilitate deeper insights into the workings and the design of metabolic systems. RESULTS: Here, we introduce a machine learning method with chemical fingerprint-based features for the prediction of the Gibbs free energy of biochemical reactions. From a large pool of 2D fingerprint-based features, this method systematically selects a small number of relevant ones and uses them to construct a regularized linear model. Since a manual selection of 2D structure-based features can be a tedious and time-consuming task, requiring expert knowledge about the structure-activity relationship of chemical compounds, the systematic feature selection step in our method offers a convenient means to identify relevant 2D fingerprint-based features. By comparing our method with state-of-the-art linear regression-based methods for the standard Gibbs free energy prediction, we demonstrated that its prediction accuracy and prediction coverage are most favorable. Our results show direct evidence that a number of 2D fingerprints collectively provide useful information about the Gibbs free energy of biochemical reactions and that our systematic feature selection procedure provides a convenient way to identify them. AVAILABILITY AND IMPLEMENTATION: Our software is freely available for download at http://sfb.kaust.edu.sa/Pages/Software.aspx. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
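
Illustrative note: the summary does not specify how features are selected or how the linear model is regularized, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a reaction can be described by the stoichiometry-weighted sum of its compounds' fingerprint bits, and it uses an L1 (lasso) penalty as a stand-in for the selection step. All function names, parameters, and data are hypothetical; the data are synthetic.

```python
# Illustrative sketch only: the L1 (lasso) penalty is a stand-in for the
# paper's feature selection step, and all data below are synthetic.
import random


def reaction_features(stoichiometry, fingerprints):
    """Sum compound fingerprint bits weighted by stoichiometric coefficients."""
    n_bits = len(next(iter(fingerprints.values())))
    vec = [0.0] * n_bits
    for compound, coeff in stoichiometry.items():
        for j, bit in enumerate(fingerprints[compound]):
            vec[j] += coeff * bit
    return vec


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_fit(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # feature is always zero; its weight stays at 0
            # Correlation of feature j with the residual excluding its own term
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic example: 30 hypothetical compounds with 12-bit fingerprints,
# where only bits 2 and 7 actually contribute to the reaction energy.
random.seed(0)
n_bits = 12
compounds = [f"c{k}" for k in range(30)]
fingerprints = {c: [random.randint(0, 1) for _ in range(n_bits)] for c in compounds}
true_w = [0.0] * n_bits
true_w[2], true_w[7] = -15.0, 8.0

X, y = [], []
for _ in range(80):
    substrate, product = random.sample(compounds, 2)
    x = reaction_features({substrate: -1, product: +1}, fingerprints)
    X.append(x)
    y.append(sum(wj * xj for wj, xj in zip(true_w, x)) + random.gauss(0.0, 0.5))

weights = lasso_fit(X, y, alpha=0.5)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected fingerprint bits:", selected)
```

In this sketch the L1 penalty drives the weights of uninformative bits to exactly zero, so the nonzero weights play the role of the "small number of relevant" features mentioned in the summary. A larger `alpha` keeps fewer bits at the cost of stronger shrinkage of the retained weights.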