Making Statistical Data More Easily Accessible on the Web: Results of the StatSearch Case Study
In this paper we present the results of the StatSearch case study that aimed at providing an enhanced access to statistical data available on the Web. In the scope of this case study we developed a prototype of an information access tool combining a query-based search engine with semi-automated navi...
Main authors: | Rajman, M; Vesely, M; Boynton, I M; Fridlund, B; Fyhrlund, A; Sundgren, B; Lundquist, P; Thelander, H; Wänerskär, M |
---|---|
Language: | eng |
Published: | 2005 |
Subjects: | Information Transfer and Management |
Online access: | http://cds.cern.ch/record/896654 |
_version_ | 1780908615639498752 |
---|---|
author | Rajman, M Vesely, M Boynton, I M Fridlund, B Fyhrlund, A Sundgren, B Lundquist, P Thelander, H Wänerskär, M |
author_facet | Rajman, M Vesely, M Boynton, I M Fridlund, B Fyhrlund, A Sundgren, B Lundquist, P Thelander, H Wänerskär, M |
author_sort | Rajman, M |
collection | CERN |
description | In this paper we present the results of the StatSearch case study that aimed at providing an enhanced access to statistical data available on the Web. In the scope of this case study we developed a prototype of an information access tool combining a query-based search engine with semi-automated navigation techniques exploiting the hierarchical structuring of the available data. This tool enables a better control of the information retrieval, improving the quality and ease of the access to statistical information. The central part of the presented StatSearch tool consists in the design of an algorithm for automated navigation through a tree-like hierarchical document structure. The algorithm relies on the computation of query related relevance score distributions over the available database to identify the most relevant clusters in the data structure. These most relevant clusters are then proposed to the user for navigation, or, alternatively, are the support for the automated navigation process. Several approaches to the automation of the navigation are compared and Natural Language Processing techniques allowing more precise and coherent computation of textual similarities are briefly described. The resulting StatSearch prototype was evaluated by the Swedish Statistical Office (SCB) on a sample of over 5000 English documents accessible through the SCB web site. The evaluation was based on supervised on-site usability testing and aimed at identification of the prototype's potentials and its added value with respect to information access as objectively perceived by users. The evaluation method and obtained results are presented. The case study was carried out in the framework of the NEMIS Network of Excellence in Text Mining and Its Applications in Statistics (IST-2001-37574). |
id | cern-896654 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2005 |
record_format | invenio |
spelling | cern-8966542019-09-30T06:29:59Zhttp://cds.cern.ch/record/896654engRajman, MVesely, MBoynton, I MFridlund, BFyhrlund, ASundgren, BLundquist, PThelander, HWänerskär, MMaking Statistical Data More Easily Accessible on the Web: Results of the StatSearch Case StudyInformation Transfer and ManagementIn this paper we present the results of the StatSearch case study that aimed at providing an enhanced access to statistical data available on the Web. In the scope of this case study we developed a prototype of an information access tool combining a query-based search engine with semi-automated navigation techniques exploiting the hierarchical structuring of the available data. This tool enables a better control of the information retrieval, improving the quality and ease of the access to statistical information. The central part of the presented StatSearch tool consists in the design of an algorithm for automated navigation through a tree-like hierarchical document structure. The algorithm relies on the computation of query related relevance score distributions over the available database to identify the most relevant clusters in the data structure. These most relevant clusters are then proposed to the user for navigation, or, alternatively, are the support for the automated navigation process. Several approaches to the automation of the navigation are compared and Natural Language Processing techniques allowing more precise and coherent computation of textual similarities are briefly described. The resulting StatSearch prototype was evaluated by the Swedish Statistical Office (SCB) on a sample of over 5000 English documents accessible through the SCB web site. The evaluation was based on supervised on-site usability testing and aimed at identification of the prototype's potentials and its added value with respect to information access as objectively perceived by users. The evaluation method and obtained results are presented. 
The case study was carried out in the framework of the NEMIS Network of Excellence in Text Mining and Its Applications in Statistics (IST-2001-37574).CERN-OPEN-2005-022oai:cds.cern.ch:8966542005 |
spellingShingle | Information Transfer and Management Rajman, M Vesely, M Boynton, I M Fridlund, B Fyhrlund, A Sundgren, B Lundquist, P Thelander, H Wänerskär, M Making Statistical Data More Easily Accessible on the Web: Results of the StatSearch Case Study |
title | Making Statistical Data More Easily Accessible on the Web: Results of the StatSearch Case Study |
title_full | Making Statistical Data More Easily Accessible on the Web: Results of the StatSearch Case Study |
title_fullStr | Making Statistical Data More Easily Accessible on the Web: Results of the StatSearch Case Study |
title_full_unstemmed | Making Statistical Data More Easily Accessible on the Web: Results of the StatSearch Case Study |
title_short | Making Statistical Data More Easily Accessible on the Web: Results of the StatSearch Case Study |
title_sort | making statistical data more easily accessible on the web: results of the statsearch case study |
topic | Information Transfer and Management |
url | http://cds.cern.ch/record/896654 |
work_keys_str_mv | AT rajmanm makingstatisticaldatamoreeasilyaccessibleonthewebresultsofthestatsearchcasestudy AT veselym makingstatisticaldatamoreeasilyaccessibleonthewebresultsofthestatsearchcasestudy AT boyntonim makingstatisticaldatamoreeasilyaccessibleonthewebresultsofthestatsearchcasestudy AT fridlundb makingstatisticaldatamoreeasilyaccessibleonthewebresultsofthestatsearchcasestudy AT fyhrlunda makingstatisticaldatamoreeasilyaccessibleonthewebresultsofthestatsearchcasestudy AT sundgrenb makingstatisticaldatamoreeasilyaccessibleonthewebresultsofthestatsearchcasestudy AT lundquistp makingstatisticaldatamoreeasilyaccessibleonthewebresultsofthestatsearchcasestudy AT thelanderh makingstatisticaldatamoreeasilyaccessibleonthewebresultsofthestatsearchcasestudy AT wanerskarm makingstatisticaldatamoreeasilyaccessibleonthewebresultsofthestatsearchcasestudy |