Cargando…
Beyond Equi-joins: Ranking, Enumeration and Factorization
We study theta-joins in general and join predicates with conjunctions and disjunctions of inequalities in particular, focusing on ranked enumeration where the answers are returned incrementally in an order dictated by a given ranking function. Our approach achieves strong time and space complexity p...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
2021
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9106312/ https://www.ncbi.nlm.nih.gov/pubmed/35573749 http://dx.doi.org/10.14778/3476249.3476306 |
_version_ | 1784708254262099968 |
---|---|
author | Tziavelis, Nikolaos Gatterbauer, Wolfgang Riedewald, Mirek |
author_facet | Tziavelis, Nikolaos Gatterbauer, Wolfgang Riedewald, Mirek |
author_sort | Tziavelis, Nikolaos |
collection | PubMed |
description | We study theta-joins in general and join predicates with conjunctions and disjunctions of inequalities in particular, focusing on ranked enumeration where the answers are returned incrementally in an order dictated by a given ranking function. Our approach achieves strong time and space complexity properties: with n denoting the number of tuples in the database, we guarantee for acyclic full join queries with inequality conditions that for every value of k, the k top-ranked answers are returned in [Formula: see text] time. This is within a polylogarithmic factor of [Formula: see text] , i.e., the best known complexity for equi-joins, and even of [Formula: see text] , i.e., the time it takes to look at the input and return k answers in any order. Our guarantees extend to join queries with selections and many types of projections (namely those called “free-connex” queries and those that use bag semantics). Remarkably, they hold even when the number of join results is n(ℓ) for a join of ℓ relations. The key ingredient is a novel [Formula: see text]-size factorized representation of the query output, which is constructed on-the-fly for a given query and database. In addition to providing the first non-trivial theoretical guarantees beyond equi-joins, we show in an experimental study that our ranked-enumeration approach is also memory-efficient and fast in practice, beating the running time of state-of-the-art database systems by orders of magnitude. |
format | Online Article Text |
id | pubmed-9106312 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-91063122022-05-13 Beyond Equi-joins: Ranking, Enumeration and Factorization Tziavelis, Nikolaos Gatterbauer, Wolfgang Riedewald, Mirek Proceedings VLDB Endowment Article We study theta-joins in general and join predicates with conjunctions and disjunctions of inequalities in particular, focusing on ranked enumeration where the answers are returned incrementally in an order dictated by a given ranking function. Our approach achieves strong time and space complexity properties: with n denoting the number of tuples in the database, we guarantee for acyclic full join queries with inequality conditions that for every value of k, the k top-ranked answers are returned in [Formula: see text] time. This is within a polylogarithmic factor of [Formula: see text] , i.e., the best known complexity for equi-joins, and even of [Formula: see text] , i.e., the time it takes to look at the input and return k answers in any order. Our guarantees extend to join queries with selections and many types of projections (namely those called “free-connex” queries and those that use bag semantics). Remarkably, they hold even when the number of join results is n(ℓ) for a join of ℓ relations. The key ingredient is a novel [Formula: see text]-size factorized representation of the query output, which is constructed on-the-fly for a given query and database. In addition to providing the first non-trivial theoretical guarantees beyond equi-joins, we show in an experimental study that our ranked-enumeration approach is also memory-efficient and fast in practice, beating the running time of state-of-the-art database systems by orders of magnitude. 2021-07 2021-10-27 /pmc/articles/PMC9106312/ /pubmed/35573749 http://dx.doi.org/10.14778/3476249.3476306 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/This work is licensed under the Creative Commons BY-NC-ND 4.0 International License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of this license. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org. |
spellingShingle | Article Tziavelis, Nikolaos Gatterbauer, Wolfgang Riedewald, Mirek Beyond Equi-joins: Ranking, Enumeration and Factorization |
title | Beyond Equi-joins: Ranking, Enumeration and Factorization |
title_full | Beyond Equi-joins: Ranking, Enumeration and Factorization |
title_fullStr | Beyond Equi-joins: Ranking, Enumeration and Factorization |
title_full_unstemmed | Beyond Equi-joins: Ranking, Enumeration and Factorization |
title_short | Beyond Equi-joins: Ranking, Enumeration and Factorization |
title_sort | beyond equi-joins: ranking, enumeration and factorization |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9106312/ https://www.ncbi.nlm.nih.gov/pubmed/35573749 http://dx.doi.org/10.14778/3476249.3476306 |
work_keys_str_mv | AT tziavelisnikolaos beyondequijoinsrankingenumerationandfactorization AT gatterbauerwolfgang beyondequijoinsrankingenumerationandfactorization AT riedewaldmirek beyondequijoinsrankingenumerationandfactorization |