Cargando…
Data structure set-trie for storing and querying sets: Theoretical and empirical analysis
Set containment operations form an important tool in various fields such as information retrieval, AI systems, object-relational databases, and Internet applications. In the paper, a set-trie data structure for storing sets is considered, along with the efficient algorithms for the corresponding set...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2021
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875400/ https://www.ncbi.nlm.nih.gov/pubmed/33566827 http://dx.doi.org/10.1371/journal.pone.0245122 |
_version_ | 1783649765884428288 |
---|---|
author | Savnik, Iztok Akulich, Mikita Krnc, Matjaž Škrekovski, Riste |
author_facet | Savnik, Iztok Akulich, Mikita Krnc, Matjaž Škrekovski, Riste |
author_sort | Savnik, Iztok |
collection | PubMed |
description | Set containment operations form an important tool in various fields such as information retrieval, AI systems, object-relational databases, and Internet applications. In the paper, a set-trie data structure for storing sets is considered, along with the efficient algorithms for the corresponding set containment operations. We present the mathematical and empirical study of the set-trie. In the mathematical study, the relevant upper-bounds on the efficiency of its expected performance are established by utilizing a natural probabilistic model. In the empirical study, we give insight into how different distributions of input data impact the efficiency of set-trie. Using the correct parameters for those randomly generated datasets, we expose the key sources of the input sensitivity of set-trie. Finally, the empirical comparison of set-trie with the inverted index is based on the real-world datasets containing sets of low cardinality. The comparison shows that the running time of set-trie consistently outperforms the inverted index by orders of magnitude. |
format | Online Article Text |
id | pubmed-7875400 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-78754002021-02-19 Data structure set-trie for storing and querying sets: Theoretical and empirical analysis Savnik, Iztok Akulich, Mikita Krnc, Matjaž Škrekovski, Riste PLoS One Research Article Set containment operations form an important tool in various fields such as information retrieval, AI systems, object-relational databases, and Internet applications. In the paper, a set-trie data structure for storing sets is considered, along with the efficient algorithms for the corresponding set containment operations. We present the mathematical and empirical study of the set-trie. In the mathematical study, the relevant upper-bounds on the efficiency of its expected performance are established by utilizing a natural probabilistic model. In the empirical study, we give insight into how different distributions of input data impact the efficiency of set-trie. Using the correct parameters for those randomly generated datasets, we expose the key sources of the input sensitivity of set-trie. Finally, the empirical comparison of set-trie with the inverted index is based on the real-world datasets containing sets of low cardinality. The comparison shows that the running time of set-trie consistently outperforms the inverted index by orders of magnitude. Public Library of Science 2021-02-10 /pmc/articles/PMC7875400/ /pubmed/33566827 http://dx.doi.org/10.1371/journal.pone.0245122 Text en © 2021 Savnik et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Savnik, Iztok Akulich, Mikita Krnc, Matjaž Škrekovski, Riste Data structure set-trie for storing and querying sets: Theoretical and empirical analysis |
title | Data structure set-trie for storing and querying sets: Theoretical and empirical analysis |
title_full | Data structure set-trie for storing and querying sets: Theoretical and empirical analysis |
title_fullStr | Data structure set-trie for storing and querying sets: Theoretical and empirical analysis |
title_full_unstemmed | Data structure set-trie for storing and querying sets: Theoretical and empirical analysis |
title_short | Data structure set-trie for storing and querying sets: Theoretical and empirical analysis |
title_sort | data structure set-trie for storing and querying sets: theoretical and empirical analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875400/ https://www.ncbi.nlm.nih.gov/pubmed/33566827 http://dx.doi.org/10.1371/journal.pone.0245122 |
work_keys_str_mv | AT savnikiztok datastructuresettrieforstoringandqueryingsetstheoreticalandempiricalanalysis AT akulichmikita datastructuresettrieforstoringandqueryingsetstheoreticalandempiricalanalysis AT krncmatjaz datastructuresettrieforstoringandqueryingsetstheoreticalandempiricalanalysis AT skrekovskiriste datastructuresettrieforstoringandqueryingsetstheoreticalandempiricalanalysis |