Cargando…

RINQ: Reference-based Indexing for Network Queries

We consider the problem of similarity queries in biological network databases. Given a database of networks, similarity query returns all the database networks whose similarity (i.e. alignment score) to a given query network is at least a specified similarity cutoff value. Alignment of two networks...

Descripción completa

Detalles Bibliográficos
Autores principales: Gülsoy, Günhan, Kahveci, Tamer
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2011
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117351/
https://www.ncbi.nlm.nih.gov/pubmed/21685064
http://dx.doi.org/10.1093/bioinformatics/btr203
_version_ 1782206320520200192
author Gülsoy, Günhan
Kahveci, Tamer
author_facet Gülsoy, Günhan
Kahveci, Tamer
author_sort Gülsoy, Günhan
collection PubMed
description We consider the problem of similarity queries in biological network databases. Given a database of networks, similarity query returns all the database networks whose similarity (i.e. alignment score) to a given query network is at least a specified similarity cutoff value. Alignment of two networks is a very costly operation, which makes exhaustive comparison of all the database networks with a query impractical. To tackle this problem, we develop a novel indexing method, named RINQ (Reference-based Indexing for Biological Network Queries). Our method uses a set of reference networks to eliminate a large portion of the database quickly for each query. A reference network is a small biological network. We precompute and store the alignments of all the references with all the database networks. When our database is queried, we align the query network with all the reference networks. Using these alignments, we calculate a lower bound and an approximate upper bound to the alignment score of each database network with the query network. With the help of upper and lower bounds, we eliminate the majority of the database networks without aligning them to the query network. We also quickly identify a small portion of these as guaranteed to be similar to the query. We perform pairwise alignment only for the remaining networks. We also propose a supervised method to pick references that have a large chance of filtering the unpromising database networks. Extensive experimental evaluation suggests that (i) our method reduced the running time of a single query on a database of around 300 networks from over 2 days to only 8 h; (ii) our method outperformed the state of the art method Closure Tree and SAGA by a factor of three or more; and (iii) our method successfully identified statistically and biologically significant relationships across networks and organisms. Contact: ggulsoy@cise.ufl.edu; tamer@cise.ufl.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3117351
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-31173512011-06-17 RINQ: Reference-based Indexing for Network Queries Gülsoy, Günhan Kahveci, Tamer Bioinformatics Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria We consider the problem of similarity queries in biological network databases. Given a database of networks, similarity query returns all the database networks whose similarity (i.e. alignment score) to a given query network is at least a specified similarity cutoff value. Alignment of two networks is a very costly operation, which makes exhaustive comparison of all the database networks with a query impractical. To tackle this problem, we develop a novel indexing method, named RINQ (Reference-based Indexing for Biological Network Queries). Our method uses a set of reference networks to eliminate a large portion of the database quickly for each query. A reference network is a small biological network. We precompute and store the alignments of all the references with all the database networks. When our database is queried, we align the query network with all the reference networks. Using these alignments, we calculate a lower bound and an approximate upper bound to the alignment score of each database network with the query network. With the help of upper and lower bounds, we eliminate the majority of the database networks without aligning them to the query network. We also quickly identify a small portion of these as guaranteed to be similar to the query. We perform pairwise alignment only for the remaining networks. We also propose a supervised method to pick references that have a large chance of filtering the unpromising database networks. Extensive experimental evaluation suggests that (i) our method reduced the running time of a single query on a database of around 300 networks from over 2 days to only 8 h; (ii) our method outperformed the state of the art method Closure Tree and SAGA by a factor of three or more; and (iii) our method successfully identified statistically and biologically significant relationships across networks and organisms. Contact: ggulsoy@cise.ufl.edu; tamer@cise.ufl.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2011-07-01 2011-06-14 /pmc/articles/PMC3117351/ /pubmed/21685064 http://dx.doi.org/10.1093/bioinformatics/btr203 Text en © The Author(s) 2011. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria
Gülsoy, Günhan
Kahveci, Tamer
RINQ: Reference-based Indexing for Network Queries
title RINQ: Reference-based Indexing for Network Queries
title_full RINQ: Reference-based Indexing for Network Queries
title_fullStr RINQ: Reference-based Indexing for Network Queries
title_full_unstemmed RINQ: Reference-based Indexing for Network Queries
title_short RINQ: Reference-based Indexing for Network Queries
title_sort rinq: reference-based indexing for network queries
topic Ismb/Eccb 2011 Proceedings Papers Committee July 17 to July 19, 2011, Vienna, Austria
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117351/
https://www.ncbi.nlm.nih.gov/pubmed/21685064
http://dx.doi.org/10.1093/bioinformatics/btr203
work_keys_str_mv AT gulsoygunhan rinqreferencebasedindexingfornetworkqueries
AT kahvecitamer rinqreferencebasedindexingfornetworkqueries