Facial Expression Recognition Based on Fine-Tuned Channel–Spatial Attention Transformer

Facial expressions help individuals convey their emotions. In recent years, thanks to the development of computer vision technology, facial expression recognition (FER) has become a research hotspot and made remarkable progress. However, human faces in real-world environments are affected by various...

Bibliographic Details
Main Authors:	Yao, Huang; Yang, Xiaomeng; Chen, Di; Wang, Zhao; Tian, Yuan
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422316/ https://www.ncbi.nlm.nih.gov/pubmed/37571582 http://dx.doi.org/10.3390/s23156799

_version_	1785089179138392064
author	Yao, Huang; Yang, Xiaomeng; Chen, Di; Wang, Zhao; Tian, Yuan
author_facet	Yao, Huang; Yang, Xiaomeng; Chen, Di; Wang, Zhao; Tian, Yuan
author_sort	Yao, Huang
collection	PubMed
description	Facial expressions help individuals convey their emotions. In recent years, thanks to the development of computer vision technology, facial expression recognition (FER) has become a research hotspot and made remarkable progress. However, human faces in real-world environments are affected by various unfavorable factors, such as facial occlusion and head pose changes, which are seldom encountered in controlled laboratory settings. These factors often reduce expression recognition accuracy. Inspired by the recent success of transformers in many computer vision tasks, we propose a model called the fine-tuned channel–spatial attention transformer (FT-CSAT) to improve the accuracy of FER in the wild. FT-CSAT consists of two crucial components: a channel–spatial attention module and a fine-tuning module. In the channel–spatial attention module, the feature map is fed into the channel attention module and then into the spatial attention module, so that the final output feature map incorporates both channel information and spatial information. Consequently, the network becomes adept at focusing on relevant and meaningful features associated with facial expressions. To further improve the model’s performance while keeping the number of parameters under control, we employ a fine-tuning method. Extensive experimental results demonstrate that FT-CSAT outperforms the state-of-the-art methods on two benchmark datasets, RAF-DB and FERPlus, achieving recognition accuracies of 88.61% and 89.26%, respectively. Furthermore, to evaluate the robustness of FT-CSAT under facial occlusion and head pose changes, we run tests on the Occlusion-RAF-DB and Pose-RAF-DB datasets; the results show that the proposed method also achieves superior recognition performance under such conditions.
format	Online Article Text
id	pubmed-10422316
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-10422316 2023-08-13 Facial Expression Recognition Based on Fine-Tuned Channel–Spatial Attention Transformer Yao, Huang; Yang, Xiaomeng; Chen, Di; Wang, Zhao; Tian, Yuan Sensors (Basel) Article Facial expressions help individuals convey their emotions. In recent years, thanks to the development of computer vision technology, facial expression recognition (FER) has become a research hotspot and made remarkable progress. However, human faces in real-world environments are affected by various unfavorable factors, such as facial occlusion and head pose changes, which are seldom encountered in controlled laboratory settings. These factors often reduce expression recognition accuracy. Inspired by the recent success of transformers in many computer vision tasks, we propose a model called the fine-tuned channel–spatial attention transformer (FT-CSAT) to improve the accuracy of FER in the wild. FT-CSAT consists of two crucial components: a channel–spatial attention module and a fine-tuning module. In the channel–spatial attention module, the feature map is fed into the channel attention module and then into the spatial attention module, so that the final output feature map incorporates both channel information and spatial information. Consequently, the network becomes adept at focusing on relevant and meaningful features associated with facial expressions. To further improve the model’s performance while keeping the number of parameters under control, we employ a fine-tuning method. Extensive experimental results demonstrate that FT-CSAT outperforms the state-of-the-art methods on two benchmark datasets, RAF-DB and FERPlus, achieving recognition accuracies of 88.61% and 89.26%, respectively. Furthermore, to evaluate the robustness of FT-CSAT under facial occlusion and head pose changes, we run tests on the Occlusion-RAF-DB and Pose-RAF-DB datasets; the results show that the proposed method also achieves superior recognition performance under such conditions. MDPI 2023-07-30 /pmc/articles/PMC10422316/ /pubmed/37571582 http://dx.doi.org/10.3390/s23156799 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Yao, Huang Yang, Xiaomeng Chen, Di Wang, Zhao Tian, Yuan Facial Expression Recognition Based on Fine-Tuned Channel–Spatial Attention Transformer
title	Facial Expression Recognition Based on Fine-Tuned Channel–Spatial Attention Transformer
title_full	Facial Expression Recognition Based on Fine-Tuned Channel–Spatial Attention Transformer
title_fullStr	Facial Expression Recognition Based on Fine-Tuned Channel–Spatial Attention Transformer
title_full_unstemmed	Facial Expression Recognition Based on Fine-Tuned Channel–Spatial Attention Transformer
title_short	Facial Expression Recognition Based on Fine-Tuned Channel–Spatial Attention Transformer
title_sort	facial expression recognition based on fine-tuned channel–spatial attention transformer
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422316/ https://www.ncbi.nlm.nih.gov/pubmed/37571582 http://dx.doi.org/10.3390/s23156799
work_keys_str_mv	AT yaohuang facialexpressionrecognitionbasedonfinetunedchannelspatialattentiontransformer AT yangxiaomeng facialexpressionrecognitionbasedonfinetunedchannelspatialattentiontransformer AT chendi facialexpressionrecognitionbasedonfinetunedchannelspatialattentiontransformer AT wangzhao facialexpressionrecognitionbasedonfinetunedchannelspatialattentiontransformer AT tianyuan facialexpressionrecognitionbasedonfinetunedchannelspatialattentiontransformer
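
Illustrative note: the description above says that the feature map passes through a channel attention module and then a spatial attention module. The record does not give the internals of either module, so the following is only a minimal sketch of that sequential ordering under stated assumptions: the gates are computed from plain averages followed by a sigmoid, with no learned weights, and the function names (`channel_attention`, `spatial_attention`, `channel_spatial_attention`) are hypothetical. It should not be read as the authors' FT-CSAT implementation.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Rescale each channel by a gate derived from its global average.

    fmap is a nested list of shape [channels][height][width].
    Assumption: a real module would use learned layers instead of a bare sigmoid.
    """
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Rescale each spatial location by a gate derived from its mean over channels."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    gates = [
        [sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]
    return [
        [[fmap[k][i][j] * gates[i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


def channel_spatial_attention(fmap):
    """Apply channel attention first, then spatial attention (the order in the description)."""
    return spatial_attention(channel_attention(fmap))


if __name__ == "__main__":
    # Toy 2-channel, 2x2 feature map: one active channel, one silent channel.
    toy = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
    print(channel_spatial_attention(toy))
```

Because both gates lie strictly between 0 and 1, the output keeps the input's shape and never exceeds the input in magnitude; a trained module would learn which channels and locations to emphasize rather than deriving the gates from raw averages.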
