Cargando…

A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters

In high-dimensional spaces, accuracy and similarity search by low computing and storage costs are always difficult research topics, and there is a balance between efficiency and accuracy. In this paper, we propose a new structure Similar-PBF-PHT to represent items of a set with high dimensions and r...

Descripción completa

Detalles Bibliográficos
Autores principales: Shuai, Chunyan, Yang, Hengcheng, Ouyang, Xin, Li, Siqi, Chen, Zheng
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Hindawi Publishing Corporation 2016
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5174752/
https://www.ncbi.nlm.nih.gov/pubmed/28053603
http://dx.doi.org/10.1155/2016/4075257
_version_ 1782484552961228800
author Shuai, Chunyan
Yang, Hengcheng
Ouyang, Xin
Li, Siqi
Chen, Zheng
author_facet Shuai, Chunyan
Yang, Hengcheng
Ouyang, Xin
Li, Siqi
Chen, Zheng
author_sort Shuai, Chunyan
collection PubMed
description In high-dimensional spaces, accuracy and similarity search by low computing and storage costs are always difficult research topics, and there is a balance between efficiency and accuracy. In this paper, we propose a new structure Similar-PBF-PHT to represent items of a set with high dimensions and retrieve accurate and similar items. The Similar-PBF-PHT contains three parts: parallel bloom filters (PBFs), parallel hash tables (PHTs), and a bitmatrix. Experiments show that the Similar-PBF-PHT is effective in membership query and K-nearest neighbors (K-NN) search. With accurate querying, the Similar-PBF-PHT owns low hit false positive probability (FPP) and acceptable memory costs. With K-NN querying, the average overall ratio and rank-i ratio of the Hamming distance are accurate and ratios of the Euclidean distance are acceptable. It takes CPU time not I/O times to retrieve accurate and similar items and can deal with different data formats not only numerical values.
format Online
Article
Text
id pubmed-5174752
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-51747522017-01-04 A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters Shuai, Chunyan Yang, Hengcheng Ouyang, Xin Li, Siqi Chen, Zheng Comput Intell Neurosci Research Article In high-dimensional spaces, accuracy and similarity search by low computing and storage costs are always difficult research topics, and there is a balance between efficiency and accuracy. In this paper, we propose a new structure Similar-PBF-PHT to represent items of a set with high dimensions and retrieve accurate and similar items. The Similar-PBF-PHT contains three parts: parallel bloom filters (PBFs), parallel hash tables (PHTs), and a bitmatrix. Experiments show that the Similar-PBF-PHT is effective in membership query and K-nearest neighbors (K-NN) search. With accurate querying, the Similar-PBF-PHT owns low hit false positive probability (FPP) and acceptable memory costs. With K-NN querying, the average overall ratio and rank-i ratio of the Hamming distance are accurate and ratios of the Euclidean distance are acceptable. It takes CPU time not I/O times to retrieve accurate and similar items and can deal with different data formats not only numerical values. Hindawi Publishing Corporation 2016 2016-12-07 /pmc/articles/PMC5174752/ /pubmed/28053603 http://dx.doi.org/10.1155/2016/4075257 Text en Copyright © 2016 Chunyan Shuai et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Shuai, Chunyan
Yang, Hengcheng
Ouyang, Xin
Li, Siqi
Chen, Zheng
A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters
title A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters
title_full A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters
title_fullStr A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters
title_full_unstemmed A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters
title_short A Novel Accuracy and Similarity Search Structure Based on Parallel Bloom Filters
title_sort novel accuracy and similarity search structure based on parallel bloom filters
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5174752/
https://www.ncbi.nlm.nih.gov/pubmed/28053603
http://dx.doi.org/10.1155/2016/4075257
work_keys_str_mv AT shuaichunyan anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters
AT yanghengcheng anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters
AT ouyangxin anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters
AT lisiqi anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters
AT chenzheng anovelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters
AT shuaichunyan novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters
AT yanghengcheng novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters
AT ouyangxin novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters
AT lisiqi novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters
AT chenzheng novelaccuracyandsimilaritysearchstructurebasedonparallelbloomfilters