Cargando…
A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters
In high-dimensional spaces, accuracy and similarity search by low computing and storage costs are always difficult research topics, and there is a balance between efficiency and accuracy. In this paper, we propose a new structure Similar-PBF-PHT to represent items of a set with high dimensions and r...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Hindawi Publishing Corporation
2016
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5174752/ https://www.ncbi.nlm.nih.gov/pubmed/28053603 http://dx.doi.org/10.1155/2016/4075257 |
_version_ | 1782484552961228800 |
---|---|
author | Shuai, Chunyan Yang, Hengcheng Ouyang, Xin Li, Siqi Chen, Zheng |
author_facet | Shuai, Chunyan Yang, Hengcheng Ouyang, Xin Li, Siqi Chen, Zheng |
author_sort | Shuai, Chunyan |
collection | PubMed |
description | In high-dimensional spaces, accuracy and similarity search by low computing and storage costs are always difficult research topics, and there is a balance between efficiency and accuracy. In this paper, we propose a new structure Similar-PBF-PHT to represent items of a set with high dimensions and retrieve accurate and similar items. The Similar-PBF-PHT contains three parts: parallel bloom filters (PBFs), parallel hash tables (PHTs), and a bitmatrix. Experiments show that the Similar-PBF-PHT is effective in membership query and K-nearest neighbors (K-NN) search. With accurate querying, the Similar-PBF-PHT owns low hit false positive probability (FPP) and acceptable memory costs. With K-NN querying, the average overall ratio and rank-i ratio of the Hamming distance are accurate and ratios of the Euclidean distance are acceptable. It takes CPU time not I/O times to retrieve accurate and similar items and can deal with different data formats not only numerical values. |
format | Online Article Text |
id | pubmed-5174752 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-51747522017-01-04 A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters Shuai, Chunyan Yang, Hengcheng Ouyang, Xin Li, Siqi Chen, Zheng Comput Intell Neurosci Research Article In high-dimensional spaces, accuracy and similarity search by low computing and storage costs are always difficult research topics, and there is a balance between efficiency and accuracy. In this paper, we propose a new structure Similar-PBF-PHT to represent items of a set with high dimensions and retrieve accurate and similar items. The Similar-PBF-PHT contains three parts: parallel bloom filters (PBFs), parallel hash tables (PHTs), and a bitmatrix. Experiments show that the Similar-PBF-PHT is effective in membership query and K-nearest neighbors (K-NN) search. With accurate querying, the Similar-PBF-PHT owns low hit false positive probability (FPP) and acceptable memory costs. With K-NN querying, the average overall ratio and rank-i ratio of the Hamming distance are accurate and ratios of the Euclidean distance are acceptable. It takes CPU time not I/O times to retrieve accurate and similar items and can deal with different data formats not only numerical values. Hindawi Publishing Corporation 2016 2016-12-07 /pmc/articles/PMC5174752/ /pubmed/28053603 http://dx.doi.org/10.1155/2016/4075257 Text en Copyright © 2016 Chunyan Shuai et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Shuai, Chunyan Yang, Hengcheng Ouyang, Xin Li, Siqi Chen, Zheng A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters |
title | A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters |
title_full | A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters |
title_fullStr | A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters |
title_full_unstemmed | A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters |
title_short | A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters |
title_sort | novel accuracy and similarity search structure based on parallel bloom filters |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5174752/ https://www.ncbi.nlm.nih.gov/pubmed/28053603 http://dx.doi.org/10.1155/2016/4075257 |
work_keys_str_mv | AT shuaichunyan anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters AT yanghengcheng anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters AT ouyangxin anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters AT lisiqi anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters AT chenzheng anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters AT shuaichunyan novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters AT yanghengcheng novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters AT ouyangxin novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters AT lisiqi novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters AT chenzheng novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters |