
Hybrid Attention Cascade Network for Facial Expression Recognition

Bibliographic Details
Main Authors: Zhu, Xiaoliang, Ye, Shihao, Zhao, Liang, Dai, Zhicheng
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8002145/
https://www.ncbi.nlm.nih.gov/pubmed/33809038
http://dx.doi.org/10.3390/s21062003
_version_ 1783671395232776192
author Zhu, Xiaoliang
Ye, Shihao
Zhao, Liang
Dai, Zhicheng
author_facet Zhu, Xiaoliang
Ye, Shihao
Zhao, Liang
Dai, Zhicheng
author_sort Zhu, Xiaoliang
collection PubMed
description The AFEW (Acted Facial Expressions in the Wild) dataset, part of the EmotiW (Emotion Recognition in the Wild) challenge, is a popular benchmark for emotion recognition under various in-the-wild constraints, including uneven illumination, head deflection, and varied facial posture. In this paper, we propose a convenient facial expression recognition cascade network comprising spatial feature extraction, hybrid attention, and temporal feature extraction. First, faces are detected in each frame of a video sequence, and the corresponding face ROI (region of interest) is extracted to obtain the face images. The face images in each frame are then aligned based on the positions of the facial feature points. Second, the aligned face images are fed into a residual neural network to extract the spatial features of the facial expressions, and these spatial features are passed to the hybrid attention module to obtain fused facial expression features. Finally, the fused features are fed into a gated recurrent unit to extract the temporal features of the facial expressions, and the temporal features are passed to a fully connected layer to classify and recognize the expressions. Experiments on the CK+ (Extended Cohn-Kanade), Oulu-CASIA (Oulu University and Institute of Automation, Chinese Academy of Sciences), and AFEW datasets yielded recognition accuracies of 98.46%, 87.31%, and 53.44%, respectively. This demonstrates that the proposed method not only achieves performance competitive with state-of-the-art methods but also improves accuracy on the AFEW dataset by more than 2%, confirming its effectiveness for facial expression recognition in natural environments.
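The cascade described above (per-frame spatial features → hybrid attention fusion → recurrent temporal modeling → classification) can be sketched in a few lines of PyTorch. This is a minimal illustrative sketch, not the authors' implementation: the small CNN stands in for the residual network, and `HybridAttention` is a hypothetical combination of channel-wise and frame-wise attention, since the abstract does not specify the module's internals.

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Hypothetical hybrid attention: a channel gate re-weights each frame's
    feature vector, and a softmax over frames re-weights the sequence."""
    def __init__(self, dim):
        super().__init__()
        self.channel = nn.Sequential(
            nn.Linear(dim, dim // 4), nn.ReLU(),
            nn.Linear(dim // 4, dim), nn.Sigmoid())
        self.frame = nn.Linear(dim, 1)

    def forward(self, x):                 # x: (batch, frames, dim)
        x = x * self.channel(x)           # channel attention per frame
        w = torch.softmax(self.frame(x), dim=1)
        return x * w                      # frame-level attention weights

class CascadeNet(nn.Module):
    def __init__(self, feat_dim=128, hidden=64, classes=7):
        super().__init__()
        # Stand-in for the ResNet spatial extractor: frame image -> feature vector.
        self.spatial = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, feat_dim))
        self.attn = HybridAttention(feat_dim)
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, classes)

    def forward(self, clips):             # clips: (batch, frames, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.spatial(clips.flatten(0, 1)).view(b, t, -1)
        fused = self.attn(feats)          # hybrid-attention fusion features
        _, h = self.gru(fused)            # temporal features over the sequence
        return self.fc(h[-1])             # logits over expression classes
```

For example, a batch of two 8-frame aligned face clips, `torch.randn(2, 8, 3, 64, 64)`, produces a `(2, 7)` logit tensor for the seven basic expression classes.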
format Online
Article
Text
id pubmed-8002145
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8002145 2021-03-28 Hybrid Attention Cascade Network for Facial Expression Recognition Zhu, Xiaoliang Ye, Shihao Zhao, Liang Dai, Zhicheng Sensors (Basel) Article
MDPI 2021-03-12 /pmc/articles/PMC8002145/ /pubmed/33809038 http://dx.doi.org/10.3390/s21062003 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zhu, Xiaoliang
Ye, Shihao
Zhao, Liang
Dai, Zhicheng
Hybrid Attention Cascade Network for Facial Expression Recognition
title Hybrid Attention Cascade Network for Facial Expression Recognition
title_full Hybrid Attention Cascade Network for Facial Expression Recognition
title_fullStr Hybrid Attention Cascade Network for Facial Expression Recognition
title_full_unstemmed Hybrid Attention Cascade Network for Facial Expression Recognition
title_short Hybrid Attention Cascade Network for Facial Expression Recognition
title_sort hybrid attention cascade network for facial expression recognition
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8002145/
https://www.ncbi.nlm.nih.gov/pubmed/33809038
http://dx.doi.org/10.3390/s21062003
work_keys_str_mv AT zhuxiaoliang hybridattentioncascadenetworkforfacialexpressionrecognition
AT yeshihao hybridattentioncascadenetworkforfacialexpressionrecognition
AT zhaoliang hybridattentioncascadenetworkforfacialexpressionrecognition
AT daizhicheng hybridattentioncascadenetworkforfacialexpressionrecognition