Pooling region learning of visual word for image classification using bag-of-visual-words model
When there is not enough data to train a deep learning model, Bag-of-Visual-Words (BoVW) remains a good alternative for image classification. Many pooling methods have been proposed for the BoVW model to incorporate the spatial information of local features into the image representation vector, but none...
Main authors: | Xu, Ye; Yu, Xiaodong; Wang, Tian; Xu, Zezhong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274423/ https://www.ncbi.nlm.nih.gov/pubmed/32502224 http://dx.doi.org/10.1371/journal.pone.0234144 |
_version_ | 1783542579389792256 |
---|---|
author | Xu, Ye Yu, Xiaodong Wang, Tian Xu, Zezhong |
author_facet | Xu, Ye Yu, Xiaodong Wang, Tian Xu, Zezhong |
author_sort | Xu, Ye |
collection | PubMed |
description | When there is not enough data to train a deep learning model, Bag-of-Visual-Words (BoVW) remains a good alternative for image classification. Many pooling methods have been proposed for the BoVW model to incorporate the spatial information of local features into the image representation vector, but none of them gives each visual word its own pooling regions. Designing the same pooling regions for all words limits the discriminability of the image representation, because the local features indexed by different visual words have different spatial distributions. In this paper, we propose giving each visual word its own pooling regions and present a simple yet effective method for learning them. Concretely, a small window, called the observation window, is used to obtain its responses to each word over the whole image region. The pooling regions of each word are organized in a tree structure in which each node represents a pooling region. For each word, the pooling regions are learned by constructing a tree from its labelled coordinate data, which consist of the coordinates of the responses and the image class labels. We validate the method by checking whether classification accuracy improves noticeably after applying it. Experimental results on four small datasets (Scene-15, Caltech-101, Caltech-256 and Corel-10) show that classification accuracy improves by about 1% to 2.5%. These experiments demonstrate that giving each word its own pooling regions benefits image classification, which is the significance of our work. |
format | Online Article Text |
id | pubmed-7274423 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7274423 2020-06-09 Pooling region learning of visual word for image classification using bag-of-visual-words model Xu, Ye Yu, Xiaodong Wang, Tian Xu, Zezhong PLoS One Research Article Public Library of Science 2020-06-05 /pmc/articles/PMC7274423/ /pubmed/32502224 http://dx.doi.org/10.1371/journal.pone.0234144 Text en © 2020 Xu et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Xu, Ye Yu, Xiaodong Wang, Tian Xu, Zezhong Pooling region learning of visual word for image classification using bag-of-visual-words model |
title | Pooling region learning of visual word for image classification using bag-of-visual-words model |
title_full | Pooling region learning of visual word for image classification using bag-of-visual-words model |
title_fullStr | Pooling region learning of visual word for image classification using bag-of-visual-words model |
title_full_unstemmed | Pooling region learning of visual word for image classification using bag-of-visual-words model |
title_short | Pooling region learning of visual word for image classification using bag-of-visual-words model |
title_sort | pooling region learning of visual word for image classification using bag-of-visual-words model |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274423/ https://www.ncbi.nlm.nih.gov/pubmed/32502224 http://dx.doi.org/10.1371/journal.pone.0234144 |
work_keys_str_mv | AT xuye poolingregionlearningofvisualwordforimageclassificationusingbagofvisualwordsmodel AT yuxiaodong poolingregionlearningofvisualwordforimageclassificationusingbagofvisualwordsmodel AT wangtian poolingregionlearningofvisualwordforimageclassificationusingbagofvisualwordsmodel AT xuzezhong poolingregionlearningofvisualwordforimageclassificationusingbagofvisualwordsmodel |
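
The description above contrasts pooling regions shared by all visual words with regions assigned per word. A minimal Python sketch of that contrast follows, assuming sum pooling over axis-aligned rectangles and image coordinates normalized to [0, 1). The region layouts, the helper names (`pool_shared`, `pool_per_word`) and the sample features are hypothetical. In particular, the per-word regions are written by hand here, whereas the article learns them by building a tree from labelled coordinate data, which this sketch does not attempt.

```python
from typing import Dict, List, Tuple

Feature = Tuple[int, float, float]          # (visual word id, x, y), x and y in [0, 1)
Region = Tuple[float, float, float, float]  # (x0, y0, x1, y1), half-open rectangle

# Hypothetical region layouts used only for illustration.
TOP: Region = (0.0, 0.0, 1.0, 0.5)
BOTTOM: Region = (0.0, 0.5, 1.0, 1.0)
LEFT: Region = (0.0, 0.0, 0.5, 1.0)
RIGHT: Region = (0.5, 0.0, 1.0, 1.0)


def in_region(x: float, y: float, region: Region) -> bool:
    """Return True if (x, y) lies inside the half-open rectangle."""
    x0, y0, x1, y1 = region
    return x0 <= x < x1 and y0 <= y < y1


def pool_shared(features: List[Feature], num_words: int,
                regions: List[Region]) -> List[int]:
    """Sum pooling where every visual word uses the same regions."""
    vec = [0] * (num_words * len(regions))
    for word, x, y in features:
        for r, region in enumerate(regions):
            if in_region(x, y, region):
                vec[word * len(regions) + r] += 1
    return vec


def pool_per_word(features: List[Feature],
                  regions_by_word: Dict[int, List[Region]]) -> List[int]:
    """Sum pooling where each visual word has its own list of regions."""
    # Lay out the output vector word by word, in ascending word id.
    offsets: Dict[int, int] = {}
    total = 0
    for word in sorted(regions_by_word):
        offsets[word] = total
        total += len(regions_by_word[word])

    vec = [0] * total
    for word, x, y in features:
        for r, region in enumerate(regions_by_word.get(word, [])):
            if in_region(x, y, region):
                vec[offsets[word] + r] += 1
    return vec


# Toy image: two occurrences of word 0 and two of word 1.
FEATURES: List[Feature] = [
    (0, 0.1, 0.1), (0, 0.2, 0.8),
    (1, 0.3, 0.7), (1, 0.9, 0.9),
]

shared_vector = pool_shared(FEATURES, num_words=2, regions=[TOP, BOTTOM])
per_word_vector = pool_per_word(FEATURES, {0: [TOP, BOTTOM], 1: [LEFT, RIGHT]})
```

In this toy input both occurrences of word 1 fall in the bottom half of the image, so a shared top/bottom split cannot tell them apart, while a left/right split chosen for word 1 separates them. This only illustrates why per-word regions can change the representation; it says nothing about the accuracy gains reported in the article.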