An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection

OBJECTIVES: The purpose of this study is to validate a method that uses multiple queries to create a set of relevance judgments used to indicate which documents are pertinent to each query when forming a biomedical test collection. METHODS: The aspect query is the major concept of this research; it...

Bibliographic Details
Main authors: Ryu, Borim, Choi, Jinwook
Format: Online Article Text
Language: English
Published: Korean Society of Medical Informatics 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3324757/
https://www.ncbi.nlm.nih.gov/pubmed/22509475
http://dx.doi.org/10.4258/hir.2012.18.1.65
_version_ 1782229351671005184
author Ryu, Borim
Choi, Jinwook
author_facet Ryu, Borim
Choi, Jinwook
author_sort Ryu, Borim
collection PubMed
description OBJECTIVES: The purpose of this study is to validate a method that uses multiple queries to create a set of relevance judgments used to indicate which documents are pertinent to each query when forming a biomedical test collection. METHODS: The aspect query is the major concept of this research; it can represent every aspect of the original query with the same informational need. Manually generated aspect queries created by 15 recruited participants were run using the BM25 retrieval model in order to create aspect-query-based relevance sets (QRELS). In order to demonstrate the feasibility of these QRELS, the results from a 2004 genomics track run supported by the National Institute of Standards and Technology (NIST) were used to compute the mean average precision (MAP) based on Text Retrieval Conference (TREC) QRELS and aspect-QRELS. The rank correlation was calculated using both Kendall's and Spearman's rank correlation methods. RESULTS: We experimentally verified the utility of the aspect query method by combining the top-ranked documents retrieved by multiple queries, which ranked the order of the information. The retrieval system correlated highly with rankings based on human relevance judgments. CONCLUSIONS: Substantial results were shown with high correlations of up to 0.863 (p < 0.01) between the judgment-free gold standard based on the aspect queries and the human-judged gold standard supported by NIST. The results also demonstrate that the aspect query method can contribute to building test collections used for medical literature retrieval.
format Online
Article
Text
id pubmed-3324757
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Korean Society of Medical Informatics
record_format MEDLINE/PubMed
spelling pubmed-33247572012-04-16 An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection Ryu, Borim Choi, Jinwook Healthc Inform Res Original Article OBJECTIVES: The purpose of this study is to validate a method that uses multiple queries to create a set of relevance judgments used to indicate which documents are pertinent to each query when forming a biomedical test collection. METHODS: The aspect query is the major concept of this research; it can represent every aspect of the original query with the same informational need. Manually generated aspect queries created by 15 recruited participants were run using the BM25 retrieval model in order to create aspect-query-based relevance sets (QRELS). In order to demonstrate the feasibility of these QRELS, the results from a 2004 genomics track run supported by the National Institute of Standards and Technology (NIST) were used to compute the mean average precision (MAP) based on Text Retrieval Conference (TREC) QRELS and aspect-QRELS. The rank correlation was calculated using both Kendall's and Spearman's rank correlation methods. RESULTS: We experimentally verified the utility of the aspect query method by combining the top-ranked documents retrieved by multiple queries, which ranked the order of the information. The retrieval system correlated highly with rankings based on human relevance judgments. CONCLUSIONS: Substantial results were shown with high correlations of up to 0.863 (p < 0.01) between the judgment-free gold standard based on the aspect queries and the human-judged gold standard supported by NIST. The results also demonstrate that the aspect query method can contribute to building test collections used for medical literature retrieval.
Korean Society of Medical Informatics 2012-03 2012-03-31 /pmc/articles/PMC3324757/ /pubmed/22509475 http://dx.doi.org/10.4258/hir.2012.18.1.65 Text en © 2012 The Korean Society of Medical Informatics http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Article
Ryu, Borim
Choi, Jinwook
An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection
title An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection
title_full An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection
title_fullStr An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection
title_full_unstemmed An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection
title_short An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection
title_sort evaluation of multiple query representations for the relevance judgments used to build a biomedical test collection
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3324757/
https://www.ncbi.nlm.nih.gov/pubmed/22509475
http://dx.doi.org/10.4258/hir.2012.18.1.65
work_keys_str_mv AT ryuborim anevaluationofmultiplequeryrepresentationsfortherelevancejudgmentsusedtobuildabiomedicaltestcollection
AT choijinwook anevaluationofmultiplequeryrepresentationsfortherelevancejudgmentsusedtobuildabiomedicaltestcollection
AT ryuborim evaluationofmultiplequeryrepresentationsfortherelevancejudgmentsusedtobuildabiomedicaltestcollection
AT choijinwook evaluationofmultiplequeryrepresentationsfortherelevancejudgmentsusedtobuildabiomedicaltestcollection
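
The evaluation described in the abstract (score each retrieval run by MAP under two different QRELS sets, then compare the two resulting system orderings with Kendall's and Spearman's rank correlation) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names and data layout are assumptions made here, tie handling is omitted for brevity, and no real TREC data or results are reproduced.

```python
from itertools import combinations


def average_precision(ranked_docs, relevant):
    """Average precision of one ranked list against a set of relevant doc ids."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def mean_average_precision(run, qrels):
    """MAP over topics. Assumed layout: run = {topic: [doc, ...]},
    qrels = {topic: {doc, ...}} (hypothetical, not the TREC file format)."""
    scores = [average_precision(run.get(t, []), rel) for t, rel in qrels.items()]
    return sum(scores) / len(scores)


def kendall_tau(x, y):
    """Kendall's tau-a between two score lists; assumes no tied scores."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs


def _ranks(values):
    """Rank positions (1 = highest score); assumes no tied scores."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_rho(x, y):
    """Spearman's rho via the squared rank-difference formula; assumes no ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(_ranks(x), _ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

In use, one would compute `mean_average_precision` for every submitted run once against the human-judged QRELS and once against the aspect-query QRELS, then pass the two lists of per-run MAP scores to `kendall_tau` and `spearman_rho`. A library implementation that handles ties and reports p-values would be preferable for real experiments.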