
Elder emotion classification through multimodal fusion of intermediate layers and cross-modal transfer learning

The objective of this work is to develop an automated emotion recognition system targeted specifically at elderly people. A multimodal system is developed that integrates information from the audio and video modalities. The database selected for the experiments is ElderReact, which contains 1323 video...


Bibliographic Details
Main authors: Sreevidya, P., Veni, S., Ramana Murthy, O. V.
Format: Online Article Text
Language: English
Published: Springer London 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8763433/
https://www.ncbi.nlm.nih.gov/pubmed/35069919
http://dx.doi.org/10.1007/s11760-021-02079-x
author Sreevidya, P.
Veni, S.
Ramana Murthy, O. V.
collection PubMed
description The objective of this work is to develop an automated emotion recognition system targeted specifically at elderly people. A multimodal system is developed that integrates information from the audio and video modalities. The database selected for the experiments is ElderReact, which contains 1323 video clips, each 3 to 8 s long, of people above the age of 50. All six available emotions (Disgust, Anger, Fear, Happiness, Sadness and Surprise) are considered. Several modeling techniques were attempted. Features are extracted, and neural network models trained with backpropagation are built on them. For the raw video model, transfer learning from pretrained networks is attempted. Convolutional neural network and long short-term memory-based models were used to maintain continuity in time between frames while capturing the emotions. For the audio model, cross-modal transfer learning is applied. The two models are combined by fusion of intermediate layers, with the layers selected through a grid-based search algorithm. The accuracy and F1-score show that the proposed approach outperforms the state-of-the-art results. Classification of all the images shows a relative improvement over the baseline results ranging from a minimum of 6.5% for happiness to a maximum of 46% for sadness.
format Online
Article
Text
id pubmed-8763433
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer London
record_format MEDLINE/PubMed
spelling pubmed-8763433, 2022-01-18. Signal Image Video Process. Original Paper. Springer London, 2022. Text, en. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Elder emotion classification through multimodal fusion of intermediate layers and cross-modal transfer learning
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8763433/
https://www.ncbi.nlm.nih.gov/pubmed/35069919
http://dx.doi.org/10.1007/s11760-021-02079-x