Cargando…
Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called “heterogeneous information networks” or HINs). Given a large graph and a query pattern with node...
Autores principales: | , , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
2018
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6037532/ https://www.ncbi.nlm.nih.gov/pubmed/30003197 http://dx.doi.org/10.1145/3178876.3186115 |
_version_ | 1783338339992076288 |
---|---|
author | Yang, Xiaofeng Nicholson, Patrick K. Ajwani, Deepak Riedewald, Mirek Gatterbauer, Wolfgang Sala, Alessandra |
author_facet | Yang, Xiaofeng Nicholson, Patrick K. Ajwani, Deepak Riedewald, Mirek Gatterbauer, Wolfgang Sala, Alessandra |
author_sort | Yang, Xiaofeng |
collection | PubMed |
description | Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called “heterogeneous information networks” or HINs). Given a large graph and a query pattern with node and edge label constraints, a fundamental challenge is to find the top-k matches according to a ranking function over edge and node weights. For users, it is difficult to select value k. We therefore propose the novel notion of an any-k ranking algorithm: for a given time budget, return as many of the top-ranked results as possible. Then, given additional time, produce the next lower-ranked results quickly as well. It can be stopped anytime, but may have to continue until all results are returned. This paper focuses on acyclic patterns over arbitrary labeled graphs. We are interested in practical algorithms that effectively exploit (1) properties of heterogeneous networks, in particular selective constraints on labels, and (2) that the users often explore only a fraction of the top-ranked results. Our solution, KARPET, carefully integrates aggressive pruning that leverages the acyclic nature of the query, and incremental guided search. It enables us to prove strong non-trivial time and space guarantees, which is generally considered very hard for this type of graph search problem. Through experimental studies we show that KARPET achieves running times in the order of milliseconds for tree patterns on large networks with millions of nodes and edges. |
format | Online Article Text |
id | pubmed-6037532 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
record_format | MEDLINE/PubMed |
spelling | pubmed-60375322018-10-01 Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs Yang, Xiaofeng Nicholson, Patrick K. Ajwani, Deepak Riedewald, Mirek Gatterbauer, Wolfgang Sala, Alessandra Proc Int World Wide Web Conf Article Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called “heterogeneous information networks” or HINs). Given a large graph and a query pattern with node and edge label constraints, a fundamental challenge is to find the top-k matches according to a ranking function over edge and node weights. For users, it is difficult to select value k. We therefore propose the novel notion of an any-k ranking algorithm: for a given time budget, return as many of the top-ranked results as possible. Then, given additional time, produce the next lower-ranked results quickly as well. It can be stopped anytime, but may have to continue until all results are returned. This paper focuses on acyclic patterns over arbitrary labeled graphs. We are interested in practical algorithms that effectively exploit (1) properties of heterogeneous networks, in particular selective constraints on labels, and (2) that the users often explore only a fraction of the top-ranked results. Our solution, KARPET, carefully integrates aggressive pruning that leverages the acyclic nature of the query, and incremental guided search. It enables us to prove strong non-trivial time and space guarantees, which is generally considered very hard for this type of graph search problem. Through experimental studies we show that KARPET achieves running times in the order of milliseconds for tree patterns on large networks with millions of nodes and edges. 2018-04 /pmc/articles/PMC6037532/ /pubmed/30003197 http://dx.doi.org/10.1145/3178876.3186115 Text en This paper is published under the Creative Commons Attribution 4.0 International (CC BY 4.0) license (http://creativecommons.org/licenses/by-nc/4.0/) . Authors reserve their rights to disseminate the work on their personal and corporate Web sites with the appropriate attribution. |
spellingShingle | Article Yang, Xiaofeng Nicholson, Patrick K. Ajwani, Deepak Riedewald, Mirek Gatterbauer, Wolfgang Sala, Alessandra Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs |
title | Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs |
title_full | Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs |
title_fullStr | Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs |
title_full_unstemmed | Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs |
title_short | Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs |
title_sort | any-k: anytime top-k tree pattern retrieval in labeled graphs |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6037532/ https://www.ncbi.nlm.nih.gov/pubmed/30003197 http://dx.doi.org/10.1145/3178876.3186115 |
work_keys_str_mv | AT yangxiaofeng anykanytimetopktreepatternretrievalinlabeledgraphs AT nicholsonpatrickk anykanytimetopktreepatternretrievalinlabeledgraphs AT ajwanideepak anykanytimetopktreepatternretrievalinlabeledgraphs AT riedewaldmirek anykanytimetopktreepatternretrievalinlabeledgraphs AT gatterbauerwolfgang anykanytimetopktreepatternretrievalinlabeledgraphs AT salaalessandra anykanytimetopktreepatternretrievalinlabeledgraphs |