
A Novel Variable Selection Method Based on Binning-Normalized Mutual Information for Multivariate Calibration

Variable (wavelength) selection is essential in the multivariate analysis of near-infrared spectra to improve model performance and provide a more straightforward interpretation. This paper proposed a new variable selection method named binning-normalized mutual information (B-NMI) based on informat...


Bibliographic Details
Main authors: Zhong, Liang, Huang, Ruiqi, Gao, Lele, Yue, Jianan, Zhao, Bing, Nie, Lei, Li, Lian, Wu, Aoli, Zhang, Kefan, Meng, Zhaoqing, Cao, Guiyun, Zhang, Hui, Zang, Hengchang
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10419756/
https://www.ncbi.nlm.nih.gov/pubmed/37570642
http://dx.doi.org/10.3390/molecules28155672
Description: Variable (wavelength) selection is essential in the multivariate analysis of near-infrared spectra to improve model performance and provide a more straightforward interpretation. This paper proposes a new variable selection method, binning-normalized mutual information (B-NMI), based on information entropy theory. “Data binning” is applied to reduce the effects of minor measurement errors and to enhance the features of near-infrared spectra. “Normalized mutual information” is employed to quantify the correlation between each wavelength and the reference values. The performance of B-NMI was evaluated on two experimental datasets (an ideal ternary solvent mixture dataset and a fluidized bed granulation dataset) and two public datasets (a gasoline octane dataset and a corn protein dataset). Compared with the classic methods of backward interval PLS (BIPLS), variable importance in projection (VIP), correlation coefficient (CC), uninformative variable elimination (UVE), and competitive adaptive reweighted sampling (CARS), B-NMI not only selected the most characteristic wavelengths from the spectra of complex real-world samples but also improved the stability and robustness of the variable selection results.
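The two steps named in the abstract, binning followed by a normalized mutual information score per wavelength, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the binning scheme, the number of bins, or the normalization, so equal-width bins, `n_bins=10`, and the normalization 2·I(X;Y)/(H(X)+H(Y)) are all assumptions, and the function names (`equal_width_bins`, `bnmi_scores`, `select_top_k`) are hypothetical.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (in nats) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def equal_width_bins(values, n_bins):
    """Map continuous values to integer bin indices 0..n_bins-1 (assumed scheme)."""
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A constant variable carries no information: put everything in one bin.
        return np.zeros(values.shape, dtype=np.int64)
    edges = np.linspace(lo, hi, n_bins + 1)
    # Use interior edges only, so the maximum value lands in the last bin.
    return np.digitize(values, edges[1:-1])


def normalized_mutual_information(a, b):
    """NMI = 2 * I(a; b) / (H(a) + H(b)); one common normalization (assumed)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a + h_b == 0.0:
        return 0.0
    # Encode each (a, b) pair as one integer to get the joint entropy.
    joint = a.astype(np.int64) * (int(b.max()) + 1) + b.astype(np.int64)
    mutual_info = h_a + h_b - entropy(joint)
    return max(0.0, 2.0 * mutual_info / (h_a + h_b))


def bnmi_scores(X, y, n_bins=10):
    """Score every column (wavelength) of X against the reference values y."""
    y_binned = equal_width_bins(y, n_bins)
    return np.array([
        normalized_mutual_information(equal_width_bins(X[:, j], n_bins), y_binned)
        for j in range(X.shape[1])
    ])


def select_top_k(scores, k):
    """Indices of the k highest-scoring wavelengths, best first."""
    return np.argsort(scores)[::-1][:k]
```

Under these assumptions, `X` is a samples-by-wavelengths absorbance matrix and `y` the vector of reference values; `select_top_k(bnmi_scores(X, y), k)` returns the column indices that a subsequent calibration model (for example PLS) would be fitted on. The bin count affects the scores and would need tuning on real spectra.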
Journal: Molecules
Publication date: 2023-07-26
License: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).