Can We Ditch Feature Engineering? End-to-End Deep Learning for Affect Recognition from Physiological Sensor Data

To further extend the applicability of wearable sensors in various domains such as mobile health systems and the automotive industry, new methods for accurately extracting subtle physiological information from these wearable sensors are required. However, the extraction of valuable information from...

Descripción completa

Detalles Bibliográficos
Autores principales: Dzieżyc, Maciej, Gjoreski, Martin, Kazienko, Przemysław, Saganowski, Stanisław, Gams, Matjaž
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7697590/
https://www.ncbi.nlm.nih.gov/pubmed/33207564
http://dx.doi.org/10.3390/s20226535
Descripción
Sumario:To further extend the applicability of wearable sensors in various domains such as mobile health systems and the automotive industry, new methods for accurately extracting subtle physiological information from these wearable sensors are required. However, the extraction of valuable information from physiological signals is still challenging—smartphones can count steps and compute heart rate, but they cannot recognize emotions and related affective states. This study analyzes the possibility of using end-to-end multimodal deep learning (DL) methods for affect recognition. Ten end-to-end DL architectures are compared on four different datasets with diverse raw physiological signals used for affect recognition, including emotional and stress states. The DL architectures specialized for time-series classification were enhanced to simultaneously facilitate learning from multiple sensors, each having their own sampling frequency. To enable fair comparison among the different DL architectures, Bayesian optimization was used for hyperparameter tuning. The experimental results showed that the performance of the models depends on the intensity of the physiological response induced by the affective stimuli, i.e., the DL models recognize stress induced by the Trier Social Stress Test more successfully than they recognize emotional changes induced by watching affective content, e.g., funny videos. Additionally, the results showed that the CNN-based architectures might be more suitable than LSTM-based architectures for affect recognition from physiological sensors.