Cargando…

Data structure set-trie for storing and querying sets: Theoretical and empirical analysis

Set containment operations form an important tool in various fields such as information retrieval, AI systems, object-relational databases, and Internet applications. In the paper, a set-trie data structure for storing sets is considered, along with the efficient algorithms for the corresponding set...

Descripción completa

Detalles Bibliográficos
Autores principales: Savnik, Iztok, Akulich, Mikita, Krnc, Matjaž, Škrekovski, Riste
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875400/
https://www.ncbi.nlm.nih.gov/pubmed/33566827
http://dx.doi.org/10.1371/journal.pone.0245122
Descripción
Sumario:Set containment operations form an important tool in various fields such as information retrieval, AI systems, object-relational databases, and Internet applications. In the paper, a set-trie data structure for storing sets is considered, along with the efficient algorithms for the corresponding set containment operations. We present the mathematical and empirical study of the set-trie. In the mathematical study, the relevant upper-bounds on the efficiency of its expected performance are established by utilizing a natural probabilistic model. In the empirical study, we give insight into how different distributions of input data impact the efficiency of set-trie. Using the correct parameters for those randomly generated datasets, we expose the key sources of the input sensitivity of set-trie. Finally, the empirical comparison of set-trie with the inverted index is based on the real-world datasets containing sets of low cardinality. The comparison shows that the running time of set-trie consistently outperforms the inverted index by orders of magnitude.