Cargando…

Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs

Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called “heterogeneous information networks” or HINs). Given a large graph and a query pattern with node...

Descripción completa

Detalles Bibliográficos
Autores principales: Yang, Xiaofeng, Nicholson, Patrick K., Ajwani, Deepak, Riedewald, Mirek, Gatterbauer, Wolfgang, Sala, Alessandra
Formato: Online Artículo Texto
Lenguaje:English
Publicado: 2018
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6037532/
https://www.ncbi.nlm.nih.gov/pubmed/30003197
http://dx.doi.org/10.1145/3178876.3186115
_version_ 1783338339992076288
author Yang, Xiaofeng
Nicholson, Patrick K.
Ajwani, Deepak
Riedewald, Mirek
Gatterbauer, Wolfgang
Sala, Alessandra
author_facet Yang, Xiaofeng
Nicholson, Patrick K.
Ajwani, Deepak
Riedewald, Mirek
Gatterbauer, Wolfgang
Sala, Alessandra
author_sort Yang, Xiaofeng
collection PubMed
description Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called “heterogeneous information networks” or HINs). Given a large graph and a query pattern with node and edge label constraints, a fundamental challenge is to find the top-k matches according to a ranking function over edge and node weights. For users, it is difficult to select value k. We therefore propose the novel notion of an any-k ranking algorithm: for a given time budget, return as many of the top-ranked results as possible. Then, given additional time, produce the next lower-ranked results quickly as well. It can be stopped anytime, but may have to continue until all results are returned. This paper focuses on acyclic patterns over arbitrary labeled graphs. We are interested in practical algorithms that effectively exploit (1) properties of heterogeneous networks, in particular selective constraints on labels, and (2) that the users often explore only a fraction of the top-ranked results. Our solution, KARPET, carefully integrates aggressive pruning that leverages the acyclic nature of the query, and incremental guided search. It enables us to prove strong non-trivial time and space guarantees, which is generally considered very hard for this type of graph search problem. Through experimental studies we show that KARPET achieves running times in the order of milliseconds for tree patterns on large networks with millions of nodes and edges.
format Online
Article
Text
id pubmed-6037532
institution National Center for Biotechnology Information
language English
publishDate 2018
record_format MEDLINE/PubMed
spelling pubmed-60375322018-10-01 Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs Yang, Xiaofeng Nicholson, Patrick K. Ajwani, Deepak Riedewald, Mirek Gatterbauer, Wolfgang Sala, Alessandra Proc Int World Wide Web Conf Article Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called “heterogeneous information networks” or HINs). Given a large graph and a query pattern with node and edge label constraints, a fundamental challenge is to find the top-k matches according to a ranking function over edge and node weights. For users, it is difficult to select value k. We therefore propose the novel notion of an any-k ranking algorithm: for a given time budget, return as many of the top-ranked results as possible. Then, given additional time, produce the next lower-ranked results quickly as well. It can be stopped anytime, but may have to continue until all results are returned. This paper focuses on acyclic patterns over arbitrary labeled graphs. We are interested in practical algorithms that effectively exploit (1) properties of heterogeneous networks, in particular selective constraints on labels, and (2) that the users often explore only a fraction of the top-ranked results. Our solution, KARPET, carefully integrates aggressive pruning that leverages the acyclic nature of the query, and incremental guided search. It enables us to prove strong non-trivial time and space guarantees, which is generally considered very hard for this type of graph search problem. Through experimental studies we show that KARPET achieves running times in the order of milliseconds for tree patterns on large networks with millions of nodes and edges. 2018-04 /pmc/articles/PMC6037532/ /pubmed/30003197 http://dx.doi.org/10.1145/3178876.3186115 Text en This paper is published under the Creative Commons Attribution 4.0 International (CC BY 4.0) license (http://creativecommons.org/licenses/by-nc/4.0/) . Authors reserve their rights to disseminate the work on their personal and corporate Web sites with the appropriate attribution.
spellingShingle Article
Yang, Xiaofeng
Nicholson, Patrick K.
Ajwani, Deepak
Riedewald, Mirek
Gatterbauer, Wolfgang
Sala, Alessandra
Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
title Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
title_full Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
title_fullStr Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
title_full_unstemmed Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
title_short Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
title_sort any-k: anytime top-k tree pattern retrieval in labeled graphs
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6037532/
https://www.ncbi.nlm.nih.gov/pubmed/30003197
http://dx.doi.org/10.1145/3178876.3186115
work_keys_str_mv AT yangxiaofeng anykanytimetopktreepatternretrievalinlabeledgraphs
AT nicholsonpatrickk anykanytimetopktreepatternretrievalinlabeledgraphs
AT ajwanideepak anykanytimetopktreepatternretrievalinlabeledgraphs
AT riedewaldmirek anykanytimetopktreepatternretrievalinlabeledgraphs
AT gatterbauerwolfgang anykanytimetopktreepatternretrievalinlabeledgraphs
AT salaalessandra anykanytimetopktreepatternretrievalinlabeledgraphs