Cargando…

A subgraph isomorphism algorithm and its application to biochemical data

BACKGROUND: Graphs can represent biological networks at the molecular, protein, or species level. An important query is to find all matches of a pattern graph to a target graph. Accomplishing this is inherently difficult (NP-complete) and the efficiency of heuristic algorithms for the problem may de...

Descripción completa

Detalles Bibliográficos
Autores principales: Bonnici, Vincenzo, Giugno, Rosalba, Pulvirenti, Alfredo, Shasha, Dennis, Ferro, Alfredo
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2013
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3633016/
https://www.ncbi.nlm.nih.gov/pubmed/23815292
http://dx.doi.org/10.1186/1471-2105-14-S7-S13
_version_ 1782266929602363392
author Bonnici, Vincenzo
Giugno, Rosalba
Pulvirenti, Alfredo
Shasha, Dennis
Ferro, Alfredo
author_facet Bonnici, Vincenzo
Giugno, Rosalba
Pulvirenti, Alfredo
Shasha, Dennis
Ferro, Alfredo
author_sort Bonnici, Vincenzo
collection PubMed
description BACKGROUND: Graphs can represent biological networks at the molecular, protein, or species level. An important query is to find all matches of a pattern graph to a target graph. Accomplishing this is inherently difficult (NP-complete) and the efficiency of heuristic algorithms for the problem may depend upon the input graphs. The common aim of existing algorithms is to eliminate unsuccessful mappings as early as and as inexpensively as possible. RESULTS: We propose a new subgraph isomorphism algorithm which applies a search strategy to significantly reduce the search space without using any complex pruning rules or domain reduction procedures. We compare our method with the most recent and efficient subgraph isomorphism algorithms (VFlib, LAD, and our C++ implementation of FocusSearch which was originally distributed in Modula2) on synthetic, molecules, and interaction networks data. We show a significant reduction in the running time of our approach compared with these other excellent methods and show that our algorithm scales well as memory demands increase. CONCLUSIONS: Subgraph isomorphism algorithms are intensively used by biochemical tools. Our analysis gives a comprehensive comparison of different software approaches to subgraph isomorphism highlighting their weaknesses and strengths. This will help researchers make a rational choice among methods depending on their application. We also distribute an open-source package including our system and our own C++ implementation of FocusSearch together with all the used datasets (http://ferrolab.dmi.unict.it/ri.html). In future work, our findings may be extended to approximate subgraph isomorphism algorithms.
format Online
Article
Text
id pubmed-3633016
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-36330162013-04-25 A subgraph isomorphism algorithm and its application to biochemical data Bonnici, Vincenzo Giugno, Rosalba Pulvirenti, Alfredo Shasha, Dennis Ferro, Alfredo BMC Bioinformatics Research BACKGROUND: Graphs can represent biological networks at the molecular, protein, or species level. An important query is to find all matches of a pattern graph to a target graph. Accomplishing this is inherently difficult (NP-complete) and the efficiency of heuristic algorithms for the problem may depend upon the input graphs. The common aim of existing algorithms is to eliminate unsuccessful mappings as early as and as inexpensively as possible. RESULTS: We propose a new subgraph isomorphism algorithm which applies a search strategy to significantly reduce the search space without using any complex pruning rules or domain reduction procedures. We compare our method with the most recent and efficient subgraph isomorphism algorithms (VFlib, LAD, and our C++ implementation of FocusSearch which was originally distributed in Modula2) on synthetic, molecules, and interaction networks data. We show a significant reduction in the running time of our approach compared with these other excellent methods and show that our algorithm scales well as memory demands increase. CONCLUSIONS: Subgraph isomorphism algorithms are intensively used by biochemical tools. Our analysis gives a comprehensive comparison of different software approaches to subgraph isomorphism highlighting their weaknesses and strengths. This will help researchers make a rational choice among methods depending on their application. We also distribute an open-source package including our system and our own C++ implementation of FocusSearch together with all the used datasets (http://ferrolab.dmi.unict.it/ri.html). In future work, our findings may be extended to approximate subgraph isomorphism algorithms. BioMed Central 2013-04-22 /pmc/articles/PMC3633016/ /pubmed/23815292 http://dx.doi.org/10.1186/1471-2105-14-S7-S13 Text en Copyright © 2013 Bonnici et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Bonnici, Vincenzo
Giugno, Rosalba
Pulvirenti, Alfredo
Shasha, Dennis
Ferro, Alfredo
A subgraph isomorphism algorithm and its application to biochemical data
title A subgraph isomorphism algorithm and its application to biochemical data
title_full A subgraph isomorphism algorithm and its application to biochemical data
title_fullStr A subgraph isomorphism algorithm and its application to biochemical data
title_full_unstemmed A subgraph isomorphism algorithm and its application to biochemical data
title_short A subgraph isomorphism algorithm and its application to biochemical data
title_sort subgraph isomorphism algorithm and its application to biochemical data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3633016/
https://www.ncbi.nlm.nih.gov/pubmed/23815292
http://dx.doi.org/10.1186/1471-2105-14-S7-S13
work_keys_str_mv AT bonnicivincenzo asubgraphisomorphismalgorithmanditsapplicationtobiochemicaldata
AT giugnorosalba asubgraphisomorphismalgorithmanditsapplicationtobiochemicaldata
AT pulvirentialfredo asubgraphisomorphismalgorithmanditsapplicationtobiochemicaldata
AT shashadennis asubgraphisomorphismalgorithmanditsapplicationtobiochemicaldata
AT ferroalfredo asubgraphisomorphismalgorithmanditsapplicationtobiochemicaldata
AT bonnicivincenzo subgraphisomorphismalgorithmanditsapplicationtobiochemicaldata
AT giugnorosalba subgraphisomorphismalgorithmanditsapplicationtobiochemicaldata
AT pulvirentialfredo subgraphisomorphismalgorithmanditsapplicationtobiochemicaldata
AT shashadennis subgraphisomorphismalgorithmanditsapplicationtobiochemicaldata
AT ferroalfredo subgraphisomorphismalgorithmanditsapplicationtobiochemicaldata