An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection
OBJECTIVES: The purpose of this study is to validate a method that uses multiple queries to create a set of relevance judgments used to indicate which documents are pertinent to each query when forming a biomedical test collection. METHODS: The aspect query is the major concept of this research; it...
Main Authors: | Ryu, Borim; Choi, Jinwook |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Korean Society of Medical Informatics
2012
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3324757/ https://www.ncbi.nlm.nih.gov/pubmed/22509475 http://dx.doi.org/10.4258/hir.2012.18.1.65 |
_version_ | 1782229351671005184 |
---|---|
author | Ryu, Borim Choi, Jinwook |
author_facet | Ryu, Borim Choi, Jinwook |
author_sort | Ryu, Borim |
collection | PubMed |
description | OBJECTIVES: The purpose of this study is to validate a method that uses multiple queries to create a set of relevance judgments indicating which documents are pertinent to each query when forming a biomedical test collection. METHODS: The aspect query is the central concept of this research; it can represent every aspect of the original query with the same information need. Aspect queries created manually by 15 recruited participants were run using the BM25 retrieval model in order to create aspect-query-based relevance sets (QRELS). To demonstrate the feasibility of these QRELSs, the results from a 2004 genomics track run supported by the National Institute of Standards and Technology (NIST) were used to compute the mean average precision (MAP) based on Text Retrieval Conference (TREC) QRELSs and aspect-QRELSs. The rank correlation was calculated using both Kendall's and Spearman's rank correlation methods. RESULTS: We experimentally verified the utility of the aspect query method by combining the top-ranked documents retrieved by multiple queries, which ranked the order of the information. The retrieval system correlated highly with rankings based on human relevance judgments. CONCLUSIONS: Substantial results were shown, with high correlations of up to 0.863 (p < 0.01) between the judgment-free gold standard based on the aspect queries and the human-judged gold standard supported by NIST. The results also demonstrate that the aspect query method can contribute to building test collections used for medical literature retrieval. |
format | Online Article Text |
id | pubmed-3324757 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Korean Society of Medical Informatics |
record_format | MEDLINE/PubMed |
spelling | pubmed-33247572012-04-16 An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection Ryu, Borim Choi, Jinwook Healthc Inform Res Original Article OBJECTIVES: The purpose of this study is to validate a method that uses multiple queries to create a set of relevance judgments indicating which documents are pertinent to each query when forming a biomedical test collection. METHODS: The aspect query is the central concept of this research; it can represent every aspect of the original query with the same information need. Aspect queries created manually by 15 recruited participants were run using the BM25 retrieval model in order to create aspect-query-based relevance sets (QRELS). To demonstrate the feasibility of these QRELSs, the results from a 2004 genomics track run supported by the National Institute of Standards and Technology (NIST) were used to compute the mean average precision (MAP) based on Text Retrieval Conference (TREC) QRELSs and aspect-QRELSs. The rank correlation was calculated using both Kendall's and Spearman's rank correlation methods. RESULTS: We experimentally verified the utility of the aspect query method by combining the top-ranked documents retrieved by multiple queries, which ranked the order of the information. The retrieval system correlated highly with rankings based on human relevance judgments. CONCLUSIONS: Substantial results were shown, with high correlations of up to 0.863 (p < 0.01) between the judgment-free gold standard based on the aspect queries and the human-judged gold standard supported by NIST. The results also demonstrate that the aspect query method can contribute to building test collections used for medical literature retrieval. 
Korean Society of Medical Informatics 2012-03 2012-03-31 /pmc/articles/PMC3324757/ /pubmed/22509475 http://dx.doi.org/10.4258/hir.2012.18.1.65 Text en © 2012 The Korean Society of Medical Informatics http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Ryu, Borim Choi, Jinwook An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection |
title | An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection |
title_full | An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection |
title_fullStr | An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection |
title_full_unstemmed | An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection |
title_short | An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection |
title_sort | evaluation of multiple query representations for the relevance judgments used to build a biomedical test collection |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3324757/ https://www.ncbi.nlm.nih.gov/pubmed/22509475 http://dx.doi.org/10.4258/hir.2012.18.1.65 |
work_keys_str_mv | AT ryuborim anevaluationofmultiplequeryrepresentationsfortherelevancejudgmentsusedtobuildabiomedicaltestcollection AT choijinwook anevaluationofmultiplequeryrepresentationsfortherelevancejudgmentsusedtobuildabiomedicaltestcollection AT ryuborim evaluationofmultiplequeryrepresentationsfortherelevancejudgmentsusedtobuildabiomedicaltestcollection AT choijinwook evaluationofmultiplequeryrepresentationsfortherelevancejudgmentsusedtobuildabiomedicaltestcollection |
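
The evaluation outlined in the description (mean average precision for each run under two relevance sets, followed by a rank correlation between the two resulting system orderings) can be sketched as below. This is a minimal illustration and not the authors' code: the toy runs, document ids, topic ids, and function names are all hypothetical, and Kendall's tau is computed in its simple tau-a form, which assumes no tied scores.

```python
from itertools import combinations


def average_precision(ranked_docs, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant document
    return precision_sum / len(relevant)


def mean_average_precision(run, qrels):
    """Mean of per-topic average precision; run maps topic -> ranked doc ids."""
    scores = [average_precision(run.get(topic, []), rel)
              for topic, rel in qrels.items()]
    return sum(scores) / len(scores)


def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length score lists (assumes no ties)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy data: two topics, two relevance sets, three system runs.
human_qrels = {"t1": {"d1", "d2"}, "t2": {"d3"}}
aspect_qrels = {"t1": {"d1", "d4"}, "t2": {"d3", "d5"}}
runs = {
    "sysA": {"t1": ["d1", "d2", "d4"], "t2": ["d3", "d5", "d6"]},
    "sysB": {"t1": ["d4", "d1", "d2"], "t2": ["d5", "d3", "d6"]},
    "sysC": {"t1": ["d6", "d5", "d2"], "t2": ["d6", "d5", "d3"]},
}

systems = sorted(runs)
human_map = {s: mean_average_precision(runs[s], human_qrels) for s in systems}
aspect_map = {s: mean_average_precision(runs[s], aspect_qrels) for s in systems}

# Correlate the two system orderings induced by the two relevance sets.
tau = kendall_tau([human_map[s] for s in systems],
                  [aspect_map[s] for s in systems])

for s in systems:
    print(f"{s}: human MAP={human_map[s]:.4f}, aspect MAP={aspect_map[s]:.4f}")
print(f"Kendall tau between system rankings: {tau:.4f}")
```

A tau close to 1 would indicate that the two relevance sets order the systems almost identically. With real data, tied MAP values are possible, so a tie-aware variant of the coefficient would be the safer choice.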