
Robust Wavelength Selection Using Filter-Wrapper Method and Input Scaling on Near Infrared Spectral Data †

Extracting relevant wavelengths from a large Near Infrared Spectroscopy (NIRS) dataset is a significant challenge in vibrational spectroscopy research. Nonetheless, this process improves chemical interpretability by emphasizing the chemical entities related to the chem...


Bibliographic Details
Main authors: Silalahi, Divo Dharma, Midi, Habshah, Arasan, Jayanthi, Mustafa, Mohd Shafie, Caliman, Jean-Pierre
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506801/
https://www.ncbi.nlm.nih.gov/pubmed/32899292
http://dx.doi.org/10.3390/s20175001
_version_ 1783585097294807040
author Silalahi, Divo Dharma
Midi, Habshah
Arasan, Jayanthi
Mustafa, Mohd Shafie
Caliman, Jean-Pierre
author_facet Silalahi, Divo Dharma
Midi, Habshah
Arasan, Jayanthi
Mustafa, Mohd Shafie
Caliman, Jean-Pierre
author_sort Silalahi, Divo Dharma
collection PubMed
description Extracting relevant wavelengths from a large Near Infrared Spectroscopy (NIRS) dataset is a significant challenge in vibrational spectroscopy research. Nonetheless, this process improves chemical interpretability by emphasizing the chemical entities related to the chemical parameters of the samples. Given the complexity of the dataset, irrelevant wavelengths may still be included in the multivariate calibration. This makes the computation unnecessarily complex and decreases the accuracy and robustness of the model. In multivariate analysis, Partial Least Squares Regression (PLSR) is commonly used to build a predictive model from NIR spectral data. However, neither the PLSR method nor common commercial chemometrics software applies a standard wavelength selection procedure to screen out irrelevant wavelengths. In this study, a new robust wavelength selection procedure, the modified VIP-MCUVE (mod-VIP-MCUVE), which uses a Filter-Wrapper method and an input scaling strategy, is introduced. The proposed method combines the modified Variable Importance in Projection (VIP) and the modified Monte Carlo Uninformative Variable Elimination (MCUVE) to calculate the scale matrix of the input variables. The modified VIP uses the orthogonal components of Partial Least Squares (PLS) to identify the informative variables in the model by applying the amount of variation in both [Formula: see text] and [Formula: see text] [Formula: see text] simultaneously. The modified MCUVE uses a robust reliability coefficient and a robust tolerance interval in the selection procedure. To assess the proposed method, the classical VIP, MCUVE, and autoscaling procedures in classical PLSR were also included in the evaluation. Using artificial data from a Monte Carlo simulation and NIR spectral data of oil palm (Elaeis guineensis Jacq.) fruit mesocarp, the study shows that the proposed method improves model interpretability, is computationally extensive, and produces better model accuracy.
format Online
Article
Text
id pubmed-7506801
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-75068012020-09-26 Robust Wavelength Selection Using Filter-Wrapper Method and Input Scaling on Near Infrared Spectral Data † Silalahi, Divo Dharma Midi, Habshah Arasan, Jayanthi Mustafa, Mohd Shafie Caliman, Jean-Pierre Sensors (Basel) Article Extracting relevant wavelengths from a large Near Infrared Spectroscopy (NIRS) dataset is a significant challenge in vibrational spectroscopy research. Nonetheless, this process improves chemical interpretability by emphasizing the chemical entities related to the chemical parameters of the samples. Given the complexity of the dataset, irrelevant wavelengths may still be included in the multivariate calibration. This makes the computation unnecessarily complex and decreases the accuracy and robustness of the model. In multivariate analysis, Partial Least Squares Regression (PLSR) is commonly used to build a predictive model from NIR spectral data. However, neither the PLSR method nor common commercial chemometrics software applies a standard wavelength selection procedure to screen out irrelevant wavelengths. In this study, a new robust wavelength selection procedure, the modified VIP-MCUVE (mod-VIP-MCUVE), which uses a Filter-Wrapper method and an input scaling strategy, is introduced. The proposed method combines the modified Variable Importance in Projection (VIP) and the modified Monte Carlo Uninformative Variable Elimination (MCUVE) to calculate the scale matrix of the input variables. The modified VIP uses the orthogonal components of Partial Least Squares (PLS) to identify the informative variables in the model by applying the amount of variation in both [Formula: see text] and [Formula: see text] [Formula: see text] simultaneously. The modified MCUVE uses a robust reliability coefficient and a robust tolerance interval in the selection procedure. To assess the proposed method, the classical VIP, MCUVE, and autoscaling procedures in classical PLSR were also included in the evaluation. Using artificial data from a Monte Carlo simulation and NIR spectral data of oil palm (Elaeis guineensis Jacq.) fruit mesocarp, the study shows that the proposed method improves model interpretability, is computationally extensive, and produces better model accuracy. MDPI 2020-09-03 /pmc/articles/PMC7506801/ /pubmed/32899292 http://dx.doi.org/10.3390/s20175001 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Silalahi, Divo Dharma
Midi, Habshah
Arasan, Jayanthi
Mustafa, Mohd Shafie
Caliman, Jean-Pierre
Robust Wavelength Selection Using Filter-Wrapper Method and Input Scaling on Near Infrared Spectral Data †
title Robust Wavelength Selection Using Filter-Wrapper Method and Input Scaling on Near Infrared Spectral Data †
title_full Robust Wavelength Selection Using Filter-Wrapper Method and Input Scaling on Near Infrared Spectral Data †
title_fullStr Robust Wavelength Selection Using Filter-Wrapper Method and Input Scaling on Near Infrared Spectral Data †
title_full_unstemmed Robust Wavelength Selection Using Filter-Wrapper Method and Input Scaling on Near Infrared Spectral Data †
title_short Robust Wavelength Selection Using Filter-Wrapper Method and Input Scaling on Near Infrared Spectral Data †
title_sort robust wavelength selection using filter-wrapper method and input scaling on near infrared spectral data †
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506801/
https://www.ncbi.nlm.nih.gov/pubmed/32899292
http://dx.doi.org/10.3390/s20175001
work_keys_str_mv AT silalahidivodharma robustwavelengthselectionusingfilterwrappermethodandinputscalingonnearinfraredspectraldata
AT midihabshah robustwavelengthselectionusingfilterwrappermethodandinputscalingonnearinfraredspectraldata
AT arasanjayanthi robustwavelengthselectionusingfilterwrappermethodandinputscalingonnearinfraredspectraldata
AT mustafamohdshafie robustwavelengthselectionusingfilterwrappermethodandinputscalingonnearinfraredspectraldata
AT calimanjeanpierre robustwavelengthselectionusingfilterwrappermethodandinputscalingonnearinfraredspectraldata
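
The description in this record names the classical VIP filter as one of the baselines compared in the article. As a rough illustration only, the sketch below computes classical VIP scores from a single-response PLS model fitted with NIPALS. It is not the authors' modified VIP or their mod-VIP-MCUVE procedure. The function names (`pls1_nipals`, `vip_scores`, `make_toy_data`), the toy data, and the commonly used "VIP greater than 1" cutoff are all assumptions made for this example and do not come from the record.

```python
import math
import random


def pls1_nipals(X, y, n_comp):
    """Classical PLS1 (NIPALS) on mean-centered data.

    Returns unit-norm weight vectors W, score vectors T and y-loadings Q,
    one entry per component. Hypothetical helper for this sketch.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                # y residual
    W, T, Q = [], [], []
    for _ in range(n_comp):
        # Weight vector: direction of maximal covariance between E and f.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Scores, y-loading and X-loadings for this component.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        q = sum(t[i] * f[i] for i in range(n)) / tt
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        # Deflate both blocks before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w)
        T.append(t)
        Q.append(q)
    return W, T, Q


def vip_scores(W, T, Q):
    """Classical VIP: squared weights averaged over components, each
    component weighted by the y-variation it explains (q^2 * t't)."""
    p = len(W[0])
    ss = [q * q * sum(v * v for v in t) for q, t in zip(Q, T)]
    total = sum(ss)
    return [
        math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(ss, W)) / total)
        for j in range(p)
    ]


def make_toy_data(n=60, p=5, seed=0):
    """Assumed toy data: only variable 0 carries information about y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] + rng.gauss(0, 0.1) for row in X]
    return X, y


X, y = make_toy_data()
vip = vip_scores(*pls1_nipals(X, y, n_comp=2))
# Assumed cutoff: keep variables whose VIP exceeds 1.
selected = [j for j, v in enumerate(vip) if v > 1.0]
print(selected)
```

Because each weight vector has unit norm, the squared VIP scores sum to the number of variables, which is why 1 is a natural reference level for the cutoff.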