A Novel Variable Selection Method Based on Binning-Normalized Mutual Information for Multivariate Calibration
Variable (wavelength) selection is essential in the multivariate analysis of near-infrared spectra to improve model performance and provide a more straightforward interpretation. This paper proposed a new variable selection method named binning-normalized mutual information (B-NMI) based on informat...
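Main authors: | Zhong, Liang; Huang, Ruiqi; Gao, Lele; Yue, Jianan; Zhao, Bing; Nie, Lei; Li, Lian; Wu, Aoli; Zhang, Kefan; Meng, Zhaoqing; Cao, Guiyun; Zhang, Hui; Zang, Hengchang |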
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10419756/ https://www.ncbi.nlm.nih.gov/pubmed/37570642 http://dx.doi.org/10.3390/molecules28155672 |
_version_ | 1785088602554761216 |
---|---|
author | Zhong, Liang Huang, Ruiqi Gao, Lele Yue, Jianan Zhao, Bing Nie, Lei Li, Lian Wu, Aoli Zhang, Kefan Meng, Zhaoqing Cao, Guiyun Zhang, Hui Zang, Hengchang |
author_sort | Zhong, Liang |
collection | PubMed |
description | Variable (wavelength) selection is essential in the multivariate analysis of near-infrared spectra to improve model performance and provide a more straightforward interpretation. This paper proposed a new variable selection method named binning-normalized mutual information (B-NMI) based on information entropy theory. “Data binning” was applied to reduce the effects of minor measurement errors and increase the features of near-infrared spectra. “Normalized mutual information” was employed to calculate the correlation between each wavelength and the reference values. The performance of B-NMI was evaluated by two experimental datasets (ideal ternary solvent mixture dataset, fluidized bed granulation dataset) and two public datasets (gasoline octane dataset, corn protein dataset). Compared with classic methods of backward and interval PLS (BIPLS), variable importance projection (VIP), correlation coefficient (CC), uninformative variables elimination (UVE), and competitive adaptive reweighted sampling (CARS), B-NMI not only selected the most featured wavelengths from the spectra of complex real-world samples but also improved the stability and robustness of variable selection results. |
format | Online Article Text |
id | pubmed-10419756 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10419756 2023-08-12 A Novel Variable Selection Method Based on Binning-Normalized Mutual Information for Multivariate Calibration Zhong, Liang Huang, Ruiqi Gao, Lele Yue, Jianan Zhao, Bing Nie, Lei Li, Lian Wu, Aoli Zhang, Kefan Meng, Zhaoqing Cao, Guiyun Zhang, Hui Zang, Hengchang Molecules Article Variable (wavelength) selection is essential in the multivariate analysis of near-infrared spectra to improve model performance and provide a more straightforward interpretation. This paper proposed a new variable selection method named binning-normalized mutual information (B-NMI) based on information entropy theory. “Data binning” was applied to reduce the effects of minor measurement errors and increase the features of near-infrared spectra. “Normalized mutual information” was employed to calculate the correlation between each wavelength and the reference values. The performance of B-NMI was evaluated by two experimental datasets (ideal ternary solvent mixture dataset, fluidized bed granulation dataset) and two public datasets (gasoline octane dataset, corn protein dataset). Compared with classic methods of backward and interval PLS (BIPLS), variable importance projection (VIP), correlation coefficient (CC), uninformative variables elimination (UVE), and competitive adaptive reweighted sampling (CARS), B-NMI not only selected the most featured wavelengths from the spectra of complex real-world samples but also improved the stability and robustness of variable selection results. MDPI 2023-07-26 /pmc/articles/PMC10419756/ /pubmed/37570642 http://dx.doi.org/10.3390/molecules28155672 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zhong, Liang Huang, Ruiqi Gao, Lele Yue, Jianan Zhao, Bing Nie, Lei Li, Lian Wu, Aoli Zhang, Kefan Meng, Zhaoqing Cao, Guiyun Zhang, Hui Zang, Hengchang A Novel Variable Selection Method Based on Binning-Normalized Mutual Information for Multivariate Calibration |
title | A Novel Variable Selection Method Based on Binning-Normalized Mutual Information for Multivariate Calibration |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10419756/ https://www.ncbi.nlm.nih.gov/pubmed/37570642 http://dx.doi.org/10.3390/molecules28155672 |
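
The description above names two steps: binning the spectral data, then scoring each wavelength by its normalized mutual information with the reference values. The record does not give the paper's binning rule, bin count, or normalization, so the following is only a minimal sketch under stated assumptions: equal-width bins, a default of five bins, and mutual information divided by the geometric mean of the two entropies. All function names are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (in nats) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def equal_width_bins(values, n_bins):
    """Map each value to a bin label in 0..n_bins-1 (assumed equal-width binning)."""
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Use interior edges only, so the maximum value falls in the last bin.
    return np.digitize(values, edges[1:-1])


def normalized_mutual_information(x_labels, y_labels):
    """NMI = I(X;Y) / sqrt(H(X) * H(Y)); this normalization is an assumption."""
    h_x = entropy(x_labels)
    h_y = entropy(y_labels)
    if h_x == 0.0 or h_y == 0.0:
        return 0.0  # a constant variable carries no information
    # Joint entropy from the paired labels.
    pairs = np.stack([x_labels, y_labels], axis=1)
    _, counts = np.unique(pairs, axis=0, return_counts=True)
    p = counts / counts.sum()
    h_xy = float(-np.sum(p * np.log(p)))
    mutual_information = h_x + h_y - h_xy
    return mutual_information / np.sqrt(h_x * h_y)


def nmi_per_wavelength(spectra, reference, n_bins=5):
    """Score every column (wavelength) of `spectra` against `reference`."""
    y_labels = equal_width_bins(reference, n_bins)
    return np.array([
        normalized_mutual_information(
            equal_width_bins(spectra[:, j], n_bins), y_labels
        )
        for j in range(spectra.shape[1])
    ])


if __name__ == "__main__":
    # Synthetic example: 40 samples, 3 "wavelengths".
    reference = np.linspace(0.0, 1.0, 40)
    spectra = np.column_stack([
        2.0 * reference + 0.1,    # tracks the reference exactly
        np.full(40, 0.5),         # constant, uninformative
        np.tile([0.0, 1.0], 20),  # alternates independently of the reference
    ])
    scores = nmi_per_wavelength(spectra, reference)
    print(scores)
    print("highest-scoring wavelength index:", int(np.argmax(scores)))
```

Wavelengths would then be ranked by score and the highest-scoring ones retained for calibration; how many to keep, and how the bin count is tuned, are choices this sketch leaves open.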