
Hybrid Granularities Transformer for Fine-Grained Image Recognition

Many current approaches for image classification concentrate solely on the most prominent features within an image, but in fine-grained image recognition, even subtle features can play a significant role in model classification. In addition, the large variations in the same class and small differenc...


Bibliographic Details
Main Authors: Yu, Ying, Wang, Jinghui
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10137422/
https://www.ncbi.nlm.nih.gov/pubmed/37190389
http://dx.doi.org/10.3390/e25040601
author Yu, Ying
Wang, Jinghui
author_facet Yu, Ying
Wang, Jinghui
author_sort Yu, Ying
collection PubMed
description Many current approaches to image classification concentrate solely on the most prominent features within an image, but in fine-grained image recognition even subtle features can play a significant role in classification. In addition, the large intra-class variations and small inter-class differences that are characteristic of fine-grained image recognition make it challenging for a model to extract discriminative features. Therefore, in this paper we present two lightweight modules that help the network discover more detailed information. (1) The Patches Hidden Integrator (PHI) module randomly selects patches from an image and replaces them with patches from other images of the same class. This lets the network glean information from diverse discriminative regions and prevents over-reliance on a single feature, which can lead to misclassification; it also does not increase training time. (2) Consistency Feature Learning (CFL) aggregates the patch tokens from the last layer, mining local feature information and fusing it with the class token for classification. CFL also employs an inconsistency loss that forces the network to learn features common to both tokens, thereby guiding it to focus on salient regions. We conducted experiments on three datasets, CUB-200-2011, Stanford Dogs, and Oxford 102 Flowers, and achieved accuracies of 91.6%, 92.7%, and 99.5%, respectively, which is competitive with other works.
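The two mechanisms described above can be illustrated with a minimal NumPy sketch; this is not the authors' implementation. `phi_mix` replaces a random subset of patches with the corresponding patches from a same-class image (the PHI idea), and `cfl_fuse` mean-pools the patch tokens, fuses them with the class token, and returns a simple mean-squared penalty as a stand-in for the paper's inconsistency loss. The patch size, replacement probability, mean pooling, and penalty form are all assumptions for illustration.

```python
import numpy as np

def phi_mix(img_a, img_b, patch_size=16, replace_prob=0.3, rng=None):
    """PHI-style augmentation sketch: replace random patches of img_a with
    the corresponding patches of img_b (an image of the same class).
    Images are (H, W, C) arrays; H and W must be multiples of patch_size."""
    rng = np.random.default_rng(rng)
    out = img_a.copy()
    h, w = img_a.shape[:2]
    for y in range(0, h, patch_size):
        for x in range(0, w, patch_size):
            if rng.random() < replace_prob:
                out[y:y + patch_size, x:x + patch_size] = \
                    img_b[y:y + patch_size, x:x + patch_size]
    return out

def cfl_fuse(class_tok, patch_toks):
    """CFL-style sketch: aggregate patch tokens (mean pooling here) and fuse
    them with the class token. Also return a penalty that is small when the
    two representations share common features (a stand-in for the paper's
    inconsistency loss)."""
    local = patch_toks.mean(axis=0)          # (D,) pooled local feature
    fused = np.concatenate([class_tok, local])  # (2D,) feature for the classifier
    penalty = float(np.mean((class_tok - local) ** 2))
    return fused, penalty
```

With `replace_prob=1.0` every patch comes from the same-class partner image, and with `0.0` the input is unchanged, so the augmentation strength is a single tunable knob; because the swap is a plain array copy it adds essentially no training-time cost, consistent with the claim above.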
format Online
Article
Text
id pubmed-10137422
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10137422 2023-04-28 Hybrid Granularities Transformer for Fine-Grained Image Recognition Yu, Ying Wang, Jinghui Entropy (Basel) Article MDPI 2023-04-01 /pmc/articles/PMC10137422/ /pubmed/37190389 http://dx.doi.org/10.3390/e25040601 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. https://creativecommons.org/licenses/by/4.0/
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Yu, Ying
Wang, Jinghui
Hybrid Granularities Transformer for Fine-Grained Image Recognition
title Hybrid Granularities Transformer for Fine-Grained Image Recognition
title_full Hybrid Granularities Transformer for Fine-Grained Image Recognition
title_fullStr Hybrid Granularities Transformer for Fine-Grained Image Recognition
title_full_unstemmed Hybrid Granularities Transformer for Fine-Grained Image Recognition
title_short Hybrid Granularities Transformer for Fine-Grained Image Recognition
title_sort hybrid granularities transformer for fine-grained image recognition
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10137422/
https://www.ncbi.nlm.nih.gov/pubmed/37190389
http://dx.doi.org/10.3390/e25040601
work_keys_str_mv AT yuying hybridgranularitiestransformerforfinegrainedimagerecognition
AT wangjinghui hybridgranularitiestransformerforfinegrainedimagerecognition