A dimension reduction technique applied to regression on high dimension, low sample size neurophysiological data sets

BACKGROUND: A common problem in neurophysiological signal processing is the extraction of meaningful information from high dimension, low sample size (HDLSS) data. We present RoLDSIS (regression on low-dimension spanned input space), a regression technique based on dimensionality reduction that cons...

Bibliographic Details
Main authors: Santana, Adrielle C., Barbosa, Adriano V., Yehia, Hani C., Laboissière, Rafael
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7780417/
https://www.ncbi.nlm.nih.gov/pubmed/33397293
http://dx.doi.org/10.1186/s12868-020-00605-0
author Santana, Adrielle C.
Barbosa, Adriano V.
Yehia, Hani C.
Laboissière, Rafael
collection PubMed
description BACKGROUND: A common problem in neurophysiological signal processing is the extraction of meaningful information from high dimension, low sample size (HDLSS) data. We present RoLDSIS (regression on low-dimension spanned input space), a regression technique based on dimensionality reduction that constrains the solution to the subspace spanned by the available observations. This avoids the regularization parameters that shrinkage regression methods require. RESULTS: We applied RoLDSIS to EEG data collected in a phonemic identification experiment. In the experiment, morphed syllables in the continuum /da/–/ta/ were presented as acoustic stimuli to the participants, and the event-related potentials (ERP) were recorded and then represented as a set of features in the time-frequency domain via the discrete wavelet transform. Each set of stimuli was chosen from a preliminary identification task executed by the participant. Physical and psychophysical attributes were associated with each stimulus. RoLDSIS was then used to infer the neurophysiological axes, in the feature space, associated with each attribute. We show that these axes can be reliably estimated and that their separation is correlated with the individual strength of phonemic categorization. The results provided by RoLDSIS are interpretable in the time-frequency domain and may be used to infer the neurophysiological correlates of phonemic categorization. A comparison with commonly used regularized regression techniques was carried out by cross-validation. CONCLUSION: The prediction errors obtained by RoLDSIS are comparable to those obtained with Ridge Regression and smaller than those obtained with LASSO and SPLS. However, RoLDSIS achieves this without the need for cross-validation, a procedure that requires the extraction of a large number of observations from the data and, consequently, decreases the signal-to-noise ratio when averaging trials. We show that, even though RoLDSIS is a simple technique, it is suitable for the processing and interpretation of neurophysiological signals.
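The abstract describes a regression whose solution is constrained to the subspace spanned by the available observations. The sketch below illustrates that general idea with NumPy and is not the authors' implementation: it is the generic minimum-norm least-squares fit, and the function name `fit_in_spanned_subspace`, the tolerance `tol`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_in_spanned_subspace(X, y, tol=1e-10):
    """Least-squares fit whose weight vector lies in the span of the centered rows of X.

    Hypothetical helper for illustration; not taken from the RoLDSIS paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean  # centered observations, shape (n, p) with n << p

    # Thin SVD: the rows of Vt are an orthonormal basis of the row space of Xc.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[s > tol * s.max()]  # keep numerically non-zero directions, shape (r, p)

    # Regress in the low-dimensional coordinates, then map back to feature space.
    Z = Xc @ basis.T  # shape (n, r)
    coords, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
    w = basis.T @ coords  # shape (p,), confined to the spanned subspace
    intercept = y_mean - x_mean @ w
    return w, intercept


# Synthetic HDLSS example (assumed sizes): 5 observations, 200 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))
y = rng.normal(size=5)
w, b0 = fit_in_spanned_subspace(X, y)
print(np.abs(X @ w + b0 - y).max())  # largest absolute residual on the training data
```

With fewer observations than features, the fit in the reduced coordinates needs no penalty term, which is the sense in which no regularization parameter has to be tuned in this sketch.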
format Online
Article
Text
id pubmed-7780417
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7780417 2021-01-05 A dimension reduction technique applied to regression on high dimension, low sample size neurophysiological data sets Santana, Adrielle C. Barbosa, Adriano V. Yehia, Hani C. Laboissière, Rafael BMC Neurosci Methodology Article BioMed Central 2021-01-04 /pmc/articles/PMC7780417/ /pubmed/33397293 http://dx.doi.org/10.1186/s12868-020-00605-0 Text en © The Author(s) 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title A dimension reduction technique applied to regression on high dimension, low sample size neurophysiological data sets
topic Methodology Article