Facial expression recognition in videos using hybrid CNN & ConvLSTM

The three-dimensional convolutional neural network (3D-CNN) and long short-term memory (LSTM) have consistently outperformed many approaches in video-based facial expression recognition (VFER). The image is unrolled to a one-dimensional vector by the vanilla version of the fully-connected LSTM (FC-L...

Bibliographic Details
Main authors: Singh, Rajesh; Saurav, Sumeet; Kumar, Tarun; Saini, Ravi; Vohra, Anil; Singh, Sanjay
Format: Online Article Text
Language: English
Published: Springer Nature Singapore 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028317/
https://www.ncbi.nlm.nih.gov/pubmed/37256027
http://dx.doi.org/10.1007/s41870-023-01183-0
_version_ 1784909923274981376
author Singh, Rajesh
Saurav, Sumeet
Kumar, Tarun
Saini, Ravi
Vohra, Anil
Singh, Sanjay
collection PubMed
description Three-dimensional convolutional neural networks (3D-CNN) and long short-term memory (LSTM) networks have consistently outperformed many approaches in video-based facial expression recognition (VFER). The vanilla fully-connected LSTM (FC-LSTM) unrolls each image into a one-dimensional vector, which discards crucial spatial information. Convolutional LSTM (ConvLSTM) overcomes this limitation by performing the LSTM operations as convolutions, so the input is never unrolled and useful spatial information is retained. Motivated by this, we propose a neural network architecture that combines 3D-CNN and ConvLSTM for VFER. The proposed hybrid architecture captures spatiotemporal information from video sequences of emotions and attains competitive accuracy on three publicly available FER datasets: SAVEE, CK+, and AFEW. The experimental results demonstrate excellent performance without external emotional data, with the added advantage of a simple model with fewer parameters. Moreover, compared with state-of-the-art deep learning models, our FER pipeline improves execution speed many-fold while achieving competitive recognition accuracy. Hence, the proposed FER pipeline is a suitable candidate for real-time facial expression recognition on resource-limited embedded platforms.
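The contrast the abstract draws between FC-LSTM and ConvLSTM can be illustrated with a small parameter count. The sketch below is not taken from the paper: the frame size, hidden size, filter count, and kernel size are hypothetical, and both formulas assume a standard four-gate cell with biases and no peephole connections.

```python
def fc_lstm_params(height, width, channels, hidden_units):
    """Weight count of a fully-connected LSTM fed one flattened frame per step.

    Assumes 4 gates, one bias per unit, no peephole connections.
    """
    # The frame is unrolled into a 1-D vector, so the 2-D layout is lost.
    input_dim = height * width * channels
    per_gate = (input_dim + hidden_units) * hidden_units + hidden_units
    return 4 * per_gate


def conv_lstm_params(channels, filters, kernel_size):
    """Weight count of a ConvLSTM cell with square kernels.

    Assumes 4 gates, one bias per filter, no peephole connections.
    The count does not depend on the frame's height or width.
    """
    per_gate = kernel_size * kernel_size * (channels + filters) * filters + filters
    return 4 * per_gate


def conv_lstm_state_shape(height, width, filters):
    """Hidden-state shape, assuming 'same' padding and stride 1."""
    return (height, width, filters)


# Hypothetical sizes chosen only for illustration (not the paper's settings).
H, W, C = 64, 64, 1
fc = fc_lstm_params(H, W, C, hidden_units=64)
conv = conv_lstm_params(C, filters=64, kernel_size=3)
state = conv_lstm_state_shape(H, W, filters=64)

print("FC-LSTM parameters: ", fc)
print("ConvLSTM parameters:", conv)
print("ConvLSTM hidden-state shape:", state)
```

Under these assumed sizes the ConvLSTM cell needs far fewer weights than the FC-LSTM, and its hidden state keeps the frame's height-by-width grid instead of collapsing it to a vector.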
format Online
Article
Text
id pubmed-10028317
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer Nature Singapore
record_format MEDLINE/PubMed
spelling pubmed-10028317 2023-03-21 Facial expression recognition in videos using hybrid CNN & ConvLSTM Singh, Rajesh Saurav, Sumeet Kumar, Tarun Saini, Ravi Vohra, Anil Singh, Sanjay Int J Inf Technol Original Research The three-dimensional convolutional neural network (3D-CNN) and long short-term memory (LSTM) have consistently outperformed many approaches in video-based facial expression recognition (VFER). The image is unrolled to a one-dimensional vector by the vanilla version of the fully-connected LSTM (FC-LSTM), which leads to the loss of crucial spatial information. Convolutional LSTM (ConvLSTM) overcomes this limitation by performing LSTM operations in convolutions without unrolling, thus retaining useful spatial information. Motivated by this, in this paper, we propose a neural network architecture that consists of a blend of 3D-CNN and ConvLSTM for VFER. The proposed hybrid architecture captures spatiotemporal information from the video sequences of emotions and attains competitive accuracy on three FER datasets open to the public, namely the SAVEE, CK+, and AFEW. The experimental results demonstrate excellent performance without external emotional data with the added advantage of having a simple model with fewer parameters. Moreover, unlike the state-of-the-art deep learning models, our designed FER pipeline improves execution speed by many factors while achieving competitive recognition accuracy. Hence, the proposed FER pipeline is an appropriate candidate for recognizing facial expressions on resource-limited embedded platforms for real-time applications. Springer Nature Singapore 2023-03-21 2023 /pmc/articles/PMC10028317/ /pubmed/37256027 http://dx.doi.org/10.1007/s41870-023-01183-0 Text en © The Author(s), under exclusive licence to Bharati Vidyapeeth's Institute of Computer Applications and Management 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Facial expression recognition in videos using hybrid CNN & ConvLSTM
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028317/
https://www.ncbi.nlm.nih.gov/pubmed/37256027
http://dx.doi.org/10.1007/s41870-023-01183-0