
Pooling region learning of visual word for image classification using bag-of-visual-words model

When there is not enough data to use deep learning, the Bag-of-Visual-Words (BoVW) model is still a good alternative for image classification. In the BoVW model, many pooling methods have been proposed to incorporate the spatial information of local features into the image representation vector, but none...


Bibliographic Details
Main authors: Xu, Ye; Yu, Xiaodong; Wang, Tian; Xu, Zezhong
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274423/
https://www.ncbi.nlm.nih.gov/pubmed/32502224
http://dx.doi.org/10.1371/journal.pone.0234144
author Xu, Ye
Yu, Xiaodong
Wang, Tian
Xu, Zezhong
collection PubMed
description When there is not enough data to use deep learning, the Bag-of-Visual-Words (BoVW) model is still a good alternative for image classification. In the BoVW model, many pooling methods have been proposed to incorporate the spatial information of local features into the image representation vector, but none of them gives each visual word its own pooling regions. Designing the same pooling regions for all words restrains the discriminability of the image representation, since the spatial distributions of the local features indexed by different visual words are not the same. In this paper, we propose to give each visual word its own pooling regions, and present a simple yet effective method for learning them. Concretely, a small window, called the observation window, is used to obtain its responses to each word over the whole image region. The pooling regions of each word are organized in a tree structure, in which each node indicates a pooling region. For each word, its pooling regions are learned by constructing a tree from its labelled coordinate data, which consist of the coordinates of the responses and the image class labels. The effectiveness of our method is validated by observing whether classification accuracy improves noticeably after applying it. Our experimental results on four small datasets (Scene-15, Caltech-101, Caltech-256 and Corel-10) show that classification accuracy improves by about 1% to 2.5%. We experimentally demonstrate that giving each word its own pooling regions is beneficial to the image classification task, which is the significance of our work.
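The idea of word-specific pooling regions described above can be illustrated with a minimal sketch. It is not the authors' implementation: the rectangles are hand-picked placeholders rather than regions learned from a tree, sum pooling (counting) is an assumption, and the names `pool_per_word` and `regions_per_word` are hypothetical.

```python
import numpy as np

def pool_per_word(coords, words, regions_per_word, n_words):
    """Count occurrences of each visual word inside that word's own regions.

    coords: (N, 2) array of (x, y) feature positions, normalised to [0, 1).
    words: (N,) array of visual-word indices, one per local feature.
    regions_per_word: dict mapping a word index to a list of rectangles
        (x0, y0, x1, y1); hypothetical stand-ins for learned pooling regions.
    Returns the per-region counts concatenated word by word.
    """
    parts = []
    for w in range(n_words):
        pts = coords[words == w]  # features assigned to word w
        for (x0, y0, x1, y1) in regions_per_word[w]:
            inside = ((pts[:, 0] >= x0) & (pts[:, 0] < x1) &
                      (pts[:, 1] >= y0) & (pts[:, 1] < y1))
            parts.append(float(inside.sum()))  # sum pooling (assumed)
    return np.array(parts)

# Toy image: four local features, two visual words.
coords = np.array([[0.1, 0.1], [0.6, 0.2], [0.4, 0.9], [0.8, 0.8]])
words = np.array([0, 0, 1, 1])
# Word 0 is pooled over left/right halves, word 1 over top/bottom halves.
regions_per_word = {
    0: [(0, 0, 1, 1), (0, 0, 0.5, 1), (0.5, 0, 1, 1)],
    1: [(0, 0, 1, 1), (0, 0, 1, 0.5), (0, 0.5, 1, 1)],
}
vec = pool_per_word(coords, words, regions_per_word, n_words=2)
print(vec)
```

In this toy case each word contributes three entries to the representation vector, but the entries of word 0 and word 1 describe different spatial partitions, which is the property the abstract argues for.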
format Online
Article
Text
id pubmed-7274423
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7274423 2020-06-09 Pooling region learning of visual word for image classification using bag-of-visual-words model Xu, Ye Yu, Xiaodong Wang, Tian Xu, Zezhong PLoS One Research Article
Public Library of Science 2020-06-05 /pmc/articles/PMC7274423/ /pubmed/32502224 http://dx.doi.org/10.1371/journal.pone.0234144 Text en © 2020 Xu et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Pooling region learning of visual word for image classification using bag-of-visual-words model
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274423/
https://www.ncbi.nlm.nih.gov/pubmed/32502224
http://dx.doi.org/10.1371/journal.pone.0234144