A Long-Tailed Image Classification Method Based on Enhanced Contrastive Visual Language

To solve the problem that common long-tailed classification methods do not use the semantic features of the original label text of the image, and that the difference in classification accuracy between majority and minority classes is large, the long-tailed image classification method based...

Bibliographic Details
Main Authors: Song, Ying, Li, Mengxing, Wang, Bo
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422492/
https://www.ncbi.nlm.nih.gov/pubmed/37571481
http://dx.doi.org/10.3390/s23156694
_version_ 1785089223785709568
author Song, Ying
Li, Mengxing
Wang, Bo
author_facet Song, Ying
Li, Mengxing
Wang, Bo
author_sort Song, Ying
collection PubMed
description Common long-tailed classification methods do not use the semantic features of an image's original label text, and they show a large gap in classification accuracy between majority and minority classes. To address these problems, the long-tailed image classification method based on enhanced contrastive visual language trains head-class and tail-class samples separately, uses information from text-image pre-training, and applies an enhanced momentum contrastive loss function together with RandAugment augmentation to improve the learning of tail-class samples. On the ImageNet-LT long-tailed dataset, the method improves all-class accuracy, tail-class accuracy, middle-class accuracy, and the F1 value by 3.4%, 7.6%, 3.5%, and 11.2%, respectively, compared with the BALLAD method. The accuracy difference between the head classes and tail classes is reduced by 1.6% compared with the BALLAD method. The results of three comparative experiments indicate that the method improves tail-class performance and reduces the accuracy difference between majority and minority classes.
format Online
Article
Text
id pubmed-10422492
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10422492 2023-08-13 A Long-Tailed Image Classification Method Based on Enhanced Contrastive Visual Language Song, Ying Li, Mengxing Wang, Bo Sensors (Basel) Article MDPI 2023-07-26 /pmc/articles/PMC10422492/ /pubmed/37571481 http://dx.doi.org/10.3390/s23156694 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Song, Ying
Li, Mengxing
Wang, Bo
A Long-Tailed Image Classification Method Based on Enhanced Contrastive Visual Language
title A Long-Tailed Image Classification Method Based on Enhanced Contrastive Visual Language
title_full A Long-Tailed Image Classification Method Based on Enhanced Contrastive Visual Language
title_fullStr A Long-Tailed Image Classification Method Based on Enhanced Contrastive Visual Language
title_full_unstemmed A Long-Tailed Image Classification Method Based on Enhanced Contrastive Visual Language
title_short A Long-Tailed Image Classification Method Based on Enhanced Contrastive Visual Language
title_sort long-tailed image classification method based on enhanced contrastive visual language
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422492/
https://www.ncbi.nlm.nih.gov/pubmed/37571481
http://dx.doi.org/10.3390/s23156694