Face-based age estimation using improved Swin Transformer with attention-based convolution
Recently, Transformer models have become a new direction in the computer vision field, based on the multihead self-attention mechanism. Compared with convolutional neural networks, the Transformer uses the self-attention mechanism to capture global contextual information and extract stronger features...
Main Authors: | Shi, Chaojun, Zhao, Shiwei, Zhang, Ke, Wang, Yibo, Liang, Longping |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Frontiers Media S.A.
2023
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10130448/ https://www.ncbi.nlm.nih.gov/pubmed/37123378 http://dx.doi.org/10.3389/fnins.2023.1136934 |
_version_ | 1785030960128983040 |
---|---|
author | Shi, Chaojun Zhao, Shiwei Zhang, Ke Wang, Yibo Liang, Longping |
author_facet | Shi, Chaojun Zhao, Shiwei Zhang, Ke Wang, Yibo Liang, Longping |
author_sort | Shi, Chaojun |
collection | PubMed |
description | Recently, Transformer models have become a new direction in the computer vision field, based on the multihead self-attention mechanism. Compared with convolutional neural networks, the Transformer uses the self-attention mechanism to capture global contextual information and extract stronger features by learning the relationships between different features, and it has achieved good results in many vision tasks. In face-based age estimation, certain facial patches that contain rich age-specific information are critical to the task. The present study proposed an attention-based convolution (ABC) age estimation framework, called improved Swin Transformer with ABC, in which two separate modules were implemented, namely ABC and the Swin Transformer. ABC extracted facial patches containing rich age-specific information using a shallow convolutional network and a multiheaded attention mechanism. Subsequently, the features obtained by ABC were spliced with the flattened image in the Swin Transformer, and the result was input to the Swin Transformer to predict the age of the image. The ABC framework spliced the important regions that contained rich age-specific information into the original image, which could fully exploit the long-range dependency modeling of the Swin Transformer, that is, extracting stronger features by learning the dependency relationships between different features. ABC also introduced a diversity loss to guide the training of the self-attention mechanism, reducing overlap between patches so that diverse and important patches were discovered. Through extensive experiments, this study showed that the proposed framework outperformed several state-of-the-art methods on age estimation benchmark datasets. |
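The diversity loss described above penalizes overlap between the attention maps so that different heads discover distinct facial patches. The paper's exact formulation is not reproduced in this record; the sketch below is only an illustrative assumption, scoring overlap as the mean pairwise inner product of L2-normalized attention maps.

```python
# Hypothetical sketch of a diversity loss over attention maps (assumption:
# the paper's actual formula is not given in this record). Each map is
# L2-normalized, then the loss is the average inner product over all
# distinct pairs: near 0 when maps attend to disjoint patches, near 1
# when they overlap completely.

def diversity_loss(attention_maps):
    """attention_maps: list of flattened attention maps (lists of floats)."""
    def normalize(v):
        norm = sum(x * x for x in v) ** 0.5
        return [x / norm for x in v] if norm > 0 else v

    maps = [normalize(m) for m in attention_maps]
    total, pairs = 0.0, 0
    for i in range(len(maps)):
        for j in range(i + 1, len(maps)):
            total += sum(a * b for a, b in zip(maps[i], maps[j]))
            pairs += 1
    return total / pairs if pairs else 0.0

# Disjoint maps give zero loss; identical maps give maximal loss.
print(diversity_loss([[1.0, 0.0], [0.0, 1.0]]))  # 0.0
print(diversity_loss([[1.0, 0.0], [1.0, 0.0]]))  # 1.0
```

Minimizing such a term during training pushes the attention heads apart, which matches the stated goal of discovering diverse, important patches.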
format | Online Article Text |
id | pubmed-10130448 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-101304482023-04-27 Face-based age estimation using improved Swin Transformer with attention-based convolution Shi, Chaojun Zhao, Shiwei Zhang, Ke Wang, Yibo Liang, Longping Front Neurosci Neuroscience Recently, Transformer models have become a new direction in the computer vision field, based on the multihead self-attention mechanism. Compared with convolutional neural networks, the Transformer uses the self-attention mechanism to capture global contextual information and extract stronger features by learning the relationships between different features, and it has achieved good results in many vision tasks. In face-based age estimation, certain facial patches that contain rich age-specific information are critical to the task. The present study proposed an attention-based convolution (ABC) age estimation framework, called improved Swin Transformer with ABC, in which two separate modules were implemented, namely ABC and the Swin Transformer. ABC extracted facial patches containing rich age-specific information using a shallow convolutional network and a multiheaded attention mechanism. Subsequently, the features obtained by ABC were spliced with the flattened image in the Swin Transformer, and the result was input to the Swin Transformer to predict the age of the image. The ABC framework spliced the important regions that contained rich age-specific information into the original image, which could fully exploit the long-range dependency modeling of the Swin Transformer, that is, extracting stronger features by learning the dependency relationships between different features. ABC also introduced a diversity loss to guide the training of the self-attention mechanism, reducing overlap between patches so that diverse and important patches were discovered. Through extensive experiments, this study showed that the proposed framework outperformed several state-of-the-art methods on age estimation benchmark datasets. Frontiers Media S.A. 
2023-04-12 /pmc/articles/PMC10130448/ /pubmed/37123378 http://dx.doi.org/10.3389/fnins.2023.1136934 Text en Copyright © 2023 Shi, Zhao, Zhang, Wang and Liang. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Shi, Chaojun Zhao, Shiwei Zhang, Ke Wang, Yibo Liang, Longping Face-based age estimation using improved Swin Transformer with attention-based convolution |
title | Face-based age estimation using improved Swin Transformer with attention-based convolution |
title_full | Face-based age estimation using improved Swin Transformer with attention-based convolution |
title_fullStr | Face-based age estimation using improved Swin Transformer with attention-based convolution |
title_full_unstemmed | Face-based age estimation using improved Swin Transformer with attention-based convolution |
title_short | Face-based age estimation using improved Swin Transformer with attention-based convolution |
title_sort | face-based age estimation using improved swin transformer with attention-based convolution |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10130448/ https://www.ncbi.nlm.nih.gov/pubmed/37123378 http://dx.doi.org/10.3389/fnins.2023.1136934 |
work_keys_str_mv | AT shichaojun facebasedageestimationusingimprovedswintransformerwithattentionbasedconvolution AT zhaoshiwei facebasedageestimationusingimprovedswintransformerwithattentionbasedconvolution AT zhangke facebasedageestimationusingimprovedswintransformerwithattentionbasedconvolution AT wangyibo facebasedageestimationusingimprovedswintransformerwithattentionbasedconvolution AT lianglongping facebasedageestimationusingimprovedswintransformerwithattentionbasedconvolution |