
Large Scale Near-Duplicate Celebrity Web Images Retrieval Using Visual and Textual Features

Near-duplicate image retrieval is a classical research problem in computer vision with many applications, such as image annotation and content-based image retrieval. On the web, near-duplication is more prevalent in queries for celebrities and historical figures, which are of particular interest to...

Bibliographic Details
Main Authors: Qiao, Fengcai, Wang, Cheng, Zhang, Xin, Wang, Hui
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3791809/
https://www.ncbi.nlm.nih.gov/pubmed/24163631
http://dx.doi.org/10.1155/2013/795408
author Qiao, Fengcai
Wang, Cheng
Zhang, Xin
Wang, Hui
collection PubMed
description Near-duplicate image retrieval is a classical research problem in computer vision with many applications, such as image annotation and content-based image retrieval. On the web, near-duplication is more prevalent in queries for celebrities and historical figures, which are of particular interest to end users. Existing methods such as bag-of-visual-words (BoVW) address this problem mainly by exploiting purely visual features. To overcome this limitation, this paper proposes a novel text-based, data-driven reranking framework that utilizes textual features and is combined with state-of-the-art BoVW schemes. Under this framework, the input to the retrieval procedure is still only a query image. To verify the proposed approach, a dataset of 2 million images of 1089 different celebrities, together with their accompanying texts, is constructed. In addition, we comprehensively analyze the different categories of near-duplication observed in our constructed dataset. Experimental results on this dataset show that the proposed framework achieves a higher mean average precision (mAP), with an improvement of 21% on average over approaches based only on visual features, without notably prolonging the retrieval time.
format Online
Article
Text
id pubmed-3791809
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-3791809
2013-10-27
Large Scale Near-Duplicate Celebrity Web Images Retrieval Using Visual and Textual Features
Qiao, Fengcai
Wang, Cheng
Zhang, Xin
Wang, Hui
ScientificWorldJournal
Research Article
Hindawi Publishing Corporation
2013-09-14
/pmc/articles/PMC3791809/
/pubmed/24163631
http://dx.doi.org/10.1155/2013/795408
Text
en
Copyright © 2013 Fengcai Qiao et al.
https://creativecommons.org/licenses/by/3.0/
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Large Scale Near-Duplicate Celebrity Web Images Retrieval Using Visual and Textual Features
topic Research Article