Cargando…

A Long-Tailed Image Classification Method Based on Enhanced Contrastive Visual Language

To solve the problem that the common long-tailed classification method does not use the semantic features of the original label text of the image, and the difference between the classification accuracy of most classes and minority classes are large, the long-tailed image classification method based...

Descripción completa

Detalles Bibliográficos
Autores principales: Song, Ying, Li, Mengxing, Wang, Bo
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422492/
https://www.ncbi.nlm.nih.gov/pubmed/37571481
http://dx.doi.org/10.3390/s23156694
Descripción
Sumario:To solve the problem that the common long-tailed classification method does not use the semantic features of the original label text of the image, and the difference between the classification accuracy of most classes and minority classes are large, the long-tailed image classification method based on enhanced contrast visual language trains the head class and tail class samples separately, uses text image to pre-train the information, and uses the enhanced momentum contrastive loss function and RandAugment enhancement to improve the learning of tail class samples. On the ImageNet-LT long-tailed dataset, the enhanced contrasting visual language-based long-tailed image classification method has improved all class accuracy, tail class accuracy, middle class accuracy, and the F(1) value by 3.4%, 7.6%, 3.5%, and 11.2%, respectively, compared to the BALLAD method. The difference in accuracy between the head class and tail class is reduced by 1.6% compared to the BALLAD method. The results of three comparative experiments indicate that the long-tailed image classification method based on enhanced contrastive visual language has improved the performance of tail classes and reduced the accuracy difference between the majority and minority classes.