
Genetic Algorithm-Based Partial Least-Squares with Only the First Component for Model Interpretation


Bibliographic Details
Main Author: Kaneko, Hiromasa
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8928558/
https://www.ncbi.nlm.nih.gov/pubmed/35309472
http://dx.doi.org/10.1021/acsomega.1c07379
Description
Summary: In molecular design, material design, process design, and process control, it is important not only to construct models with high predictive ability between explanatory variables X and objective variables y but also to interpret the constructed models in order to clarify the underlying phenomena and elucidate mechanisms. However, even in linear models, it is dangerous to treat regression coefficients as the contributions of X to y because of multicollinearity among X. This study therefore focuses on the partial least-squares model with only the first component (PLSFC), for which the regression coefficients can be used as contributions of X to y. In addition, a method is proposed that uses a genetic algorithm (GA) to select the combination of X that yields a predictive PLSFC model; it is called GA-based PLSFC (GA-PLSFC). The resulting model is intended to have both high predictive ability and high interpretability, with regression coefficients that can be defined as contributions of X to y. The effectiveness of the proposed PLSFC and GA-PLSFC is verified using numerically simulated data sets and real material data sets. The proposed method was found to be capable of constructing predictive models with high interpretability. The Python code for GA-PLSFC is available at https://github.com/hkaneko1985/dcekit.
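The idea in the summary can be sketched in a few lines of NumPy. What follows is a minimal illustration, not the author's implementation in dcekit: the function names (`plsfc_fit`, `plsfc_predict`, `cv_r2`, `ga_select`), the cross-validated r² used as fitness, the GA settings, and the toy data are all assumptions made for this example. The sketch relies on one property of a one-component PLS model fitted to autoscaled data: the weight vector is proportional to Xᵀy, so each regression coefficient has the same sign as the correlation between that variable and y, even when the X variables are strongly collinear.

```python
import numpy as np


def plsfc_fit(X, y):
    """One-component PLS on autoscaled data (assumes no constant columns)."""
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean, y_std = y.mean(), y.std(ddof=1)
    Xs, ys = (X - x_mean) / x_std, (y - y_mean) / y_std
    w = Xs.T @ ys                  # proportional to cov(x_j, y)
    w = w / np.linalg.norm(w)      # unit-length weight vector
    t = Xs @ w                     # the single score vector
    q = (t @ ys) / (t @ t)         # regress scaled y on the score
    return w * q, (x_mean, x_std, y_mean, y_std)


def plsfc_predict(X, coef, scaling):
    """Predict y in original units from standardized coefficients."""
    x_mean, x_std, y_mean, y_std = scaling
    return ((X - x_mean) / x_std) @ coef * y_std + y_mean


def cv_r2(X, y, n_folds=5):
    """Cross-validated r^2 with interleaved folds (an assumed fitness)."""
    idx = np.arange(len(y))
    y_pred = np.empty(len(y), dtype=float)
    for k in range(n_folds):
        test = idx[k::n_folds]
        train = np.setdiff1d(idx, test)
        coef, scaling = plsfc_fit(X[train], y[train])
        y_pred[test] = plsfc_predict(X[test], coef, scaling)
    return 1.0 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)


def ga_select(X, y, pop_size=20, n_gen=15, p_mut=0.1, seed=0):
    """Generic GA over boolean variable masks (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    n_var = X.shape[1]
    pop = rng.random((pop_size, n_var)) < 0.5

    def fitness(mask):
        # An empty variable set cannot be modeled, so rank it last.
        return cv_r2(X[:, mask], y) if mask.any() else -np.inf

    for _ in range(n_gen):
        scores = np.array([fitness(m) for m in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # keep top half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            child = np.where(rng.random(n_var) < 0.5, a, b)  # uniform crossover
            child ^= rng.random(n_var) < p_mut               # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(m) for m in pop])
    return pop[np.argmax(scores)]


# Toy data (invented for illustration): x1 and x2 are nearly collinear,
# x3 is unrelated to y.
rng = np.random.default_rng(0)
x1 = rng.normal(size=60)
X_demo = np.column_stack(
    [x1, x1 + 0.05 * rng.normal(size=60), rng.normal(size=60)]
)
y_demo = X_demo[:, 0] + X_demo[:, 1] + 0.1 * rng.normal(size=60)

coef, _ = plsfc_fit(X_demo, y_demo)
best_mask = ga_select(X_demo, y_demo)
print("standardized coefficients:", coef)
print("selected variables:", best_mask)
```

In this toy setup the two collinear variables receive coefficients of the same sign as their correlation with y, which is the interpretability property the summary attributes to PLSFC. The GA shown here only searches for a variable subset with a good cross-validated fit; the fitness function and operators actually used in GA-PLSFC should be taken from the paper or the dcekit repository.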