Hubs of knowledge: using the functional link structure in Biozon to mine for biologically significant entities

BACKGROUND: Existing biological databases support a variety of queries such as keyword or definition search. However, they do not provide any measure of relevance for the instances reported, and result sets are usually sorted arbitrarily. RESULTS: We describe a system that builds upon the complex in...

Bibliographic Details
Main authors: Shafer, Paul; Isganitis, Timothy; Yona, Golan
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1421446/
https://www.ncbi.nlm.nih.gov/pubmed/16480496
http://dx.doi.org/10.1186/1471-2105-7-71
author Shafer, Paul
Isganitis, Timothy
Yona, Golan
collection PubMed
description BACKGROUND: Existing biological databases support a variety of queries such as keyword or definition search. However, they do not provide any measure of relevance for the instances reported, and result sets are usually sorted arbitrarily. RESULTS: We describe a system that builds upon the complex infrastructure of the Biozon database and applies methods similar to those of Google to rank documents that match queries. We explore different prominence models and study the spectral properties of the corresponding data graphs. We evaluate the information content of principal and non-principal eigenspaces, and test various scoring functions which combine contributions from multiple eigenspaces. We also test the effect of similarity data and other variations which are unique to the biological knowledge domain on the quality of the results. Query result sets are assessed using a probabilistic approach that measures the significance of coherence between directly connected nodes in the data graph. This model allows us, for the first time, to compare different prominence models quantitatively and effectively and to observe unique trends. CONCLUSION: Our tests show that the ranked query results outperform unsorted results with respect to our significance measure and the top ranked entities are typically linked to many other biological entities. Our study resulted in a working ranking system of biological entities that was integrated into Biozon at .
format Text
id pubmed-1421446
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1421446 2006-04-21 BMC Bioinformatics. Methodology Article. BioMed Central 2006-02-15. Text en. Copyright © 2006 Shafer et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Hubs of knowledge: using the functional link structure in Biozon to mine for biologically significant entities
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1421446/
https://www.ncbi.nlm.nih.gov/pubmed/16480496
http://dx.doi.org/10.1186/1471-2105-7-71