Efficient Fine Tuning for Fashion Object Detection

Pre-trained models have achieved success in object detection. However, challenges remain due to dataset noise and lack of domain-specific data, resulting in weaker zero-shot capabilities in specialized fields such as fashion imaging. We addressed this by constructing a novel clothing object detectio...

Bibliographic Details
Main authors: Ma, Benjiang; Xu, Wenjin
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10346465/
https://www.ncbi.nlm.nih.gov/pubmed/37447935
http://dx.doi.org/10.3390/s23136083
_version_ 1785073319830093824
author Ma, Benjiang
Xu, Wenjin
author_facet Ma, Benjiang
Xu, Wenjin
author_sort Ma, Benjiang
collection PubMed
description Pre-trained models have achieved success in object detection. However, challenges remain due to dataset noise and lack of domain-specific data, resulting in weaker zero-shot capabilities in specialized fields such as fashion imaging. We addressed this by constructing a novel clothing object detection benchmark, Garment40K, which includes more than 140,000 human images with bounding boxes and over 40,000 clothing images. Each clothing item within this dataset is accompanied by its corresponding category and textual description. The dataset covers 2 major categories, pants and tops, which are further divided into 15 fine-grained subclasses, providing a rich and high-quality clothing resource. Leveraging this dataset, we propose an efficient fine-tuning method based on the Grounding DINO framework to tackle the issue of missed and false detections of clothing targets. This method incorporates additional similarity loss constraints and adapter modules, leading to a significantly enhanced model named Improved Grounding DINO. By fine-tuning only a small number of additional adapter module parameters, we considerably reduced computational costs while achieving performance comparable to full parameter fine tuning. This allows our model to be conveniently deployed on a variety of low-cost visual sensors. Our Improved Grounding DINO demonstrates considerable performance improvements in computer vision applications in the clothing domain.
format Online
Article
Text
id pubmed-10346465
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10346465 2023-07-15 Efficient Fine Tuning for Fashion Object Detection Ma, Benjiang Xu, Wenjin Sensors (Basel) Article Pre-trained models have achieved success in object detection. However, challenges remain due to dataset noise and lack of domain-specific data, resulting in weaker zero-shot capabilities in specialized fields such as fashion imaging. We addressed this by constructing a novel clothing object detection benchmark, Garment40K, which includes more than 140,000 human images with bounding boxes and over 40,000 clothing images. Each clothing item within this dataset is accompanied by its corresponding category and textual description. The dataset covers 2 major categories, pants and tops, which are further divided into 15 fine-grained subclasses, providing a rich and high-quality clothing resource. Leveraging this dataset, we propose an efficient fine-tuning method based on the Grounding DINO framework to tackle the issue of missed and false detections of clothing targets. This method incorporates additional similarity loss constraints and adapter modules, leading to a significantly enhanced model named Improved Grounding DINO. By fine-tuning only a small number of additional adapter module parameters, we considerably reduced computational costs while achieving performance comparable to full parameter fine tuning. This allows our model to be conveniently deployed on a variety of low-cost visual sensors. Our Improved Grounding DINO demonstrates considerable performance improvements in computer vision applications in the clothing domain. MDPI 2023-07-01 /pmc/articles/PMC10346465/ /pubmed/37447935 http://dx.doi.org/10.3390/s23136083 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Ma, Benjiang
Xu, Wenjin
Efficient Fine Tuning for Fashion Object Detection
title Efficient Fine Tuning for Fashion Object Detection
title_full Efficient Fine Tuning for Fashion Object Detection
title_fullStr Efficient Fine Tuning for Fashion Object Detection
title_full_unstemmed Efficient Fine Tuning for Fashion Object Detection
title_short Efficient Fine Tuning for Fashion Object Detection
title_sort efficient fine tuning for fashion object detection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10346465/
https://www.ncbi.nlm.nih.gov/pubmed/37447935
http://dx.doi.org/10.3390/s23136083
work_keys_str_mv AT mabenjiang efficientfinetuningforfashionobjectdetection
AT xuwenjin efficientfinetuningforfashionobjectdetection