Evaluation of Term Ranking Algorithms for Pseudo-Relevance Feedback in MEDLINE Retrieval
OBJECTIVES: The purpose of this study was to investigate the effects of query expansion algorithms for MEDLINE retrieval within a pseudo-relevance feedback framework. METHODS: A number of query expansion algorithms were tested using various term ranking formulas, focusing on query expansion based on...
Main Authors: | Yoo, Sooyoung; Choi, Jinwook |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Korean Society of Medical Informatics 2011 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3155169/ https://www.ncbi.nlm.nih.gov/pubmed/21886873 http://dx.doi.org/10.4258/hir.2011.17.2.120 |
_version_ | 1782210089974759424 |
---|---|
author | Yoo, Sooyoung Choi, Jinwook |
collection | PubMed |
description | OBJECTIVES: The purpose of this study was to investigate the effects of query expansion algorithms for MEDLINE retrieval within a pseudo-relevance feedback framework. METHODS: A number of query expansion algorithms were tested using various term ranking formulas, focusing on query expansion based on pseudo-relevance feedback. The OHSUMED test collection, which is a subset of the MEDLINE database, was used as a test corpus. Various ranking algorithms were tested in combination with different term re-weighting algorithms. RESULTS: Our comprehensive evaluation showed that the local context analysis ranking algorithm, when used in combination with one of the reweighting algorithms - Rocchio, the probabilistic model, and our variants - significantly outperformed other algorithm combinations by up to 12% (paired t-test; p < 0.05). In a pseudo-relevance feedback framework, effective query expansion would be achieved by the careful consideration of term ranking and re-weighting algorithm pairs, at least in the context of the OHSUMED corpus. CONCLUSIONS: Comparative experiments on term ranking algorithms were performed in the context of a subset of MEDLINE documents. With medical documents, local context analysis, which uses co-occurrence with all query terms, significantly outperformed various term ranking methods based on both frequency and distribution analyses. Furthermore, the results of the experiments demonstrated that the term rank-based re-weighting method contributed to a remarkable improvement in mean average precision. |
format | Online Article Text |
id | pubmed-3155169 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Korean Society of Medical Informatics |
record_format | MEDLINE/PubMed |
spelling | pubmed-3155169 2011-08-31 Evaluation of Term Ranking Algorithms for Pseudo-Relevance Feedback in MEDLINE Retrieval Yoo, Sooyoung Choi, Jinwook Healthc Inform Res Original Article Korean Society of Medical Informatics 2011-06 2011-06-30 /pmc/articles/PMC3155169/ /pubmed/21886873 http://dx.doi.org/10.4258/hir.2011.17.2.120 Text en © 2011 The Korean Society of Medical Informatics http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Evaluation of Term Ranking Algorithms for Pseudo-Relevance Feedback in MEDLINE Retrieval |
topic | Original Article |
work_keys_str_mv | AT yoosooyoung evaluationoftermrankingalgorithmsforpseudorelevancefeedbackinmedlineretrieval AT choijinwook evaluationoftermrankingalgorithmsforpseudorelevancefeedbackinmedlineretrieval |
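
The abstract above describes query expansion in a pseudo-relevance feedback framework, where candidate terms are ranked and then re-weighted (Rocchio being one of the re-weighting schemes named). The following is a minimal sketch of that general idea, not the authors' implementation. It ranks candidate terms by their average frequency in the top-retrieved documents, which is a simple stand-in and not the local context analysis ranking evaluated in the study. The function name `rocchio_expand`, the parameters `n_terms`, `alpha`, and `beta`, their default values, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter


def rocchio_expand(query_weights, feedback_docs, n_terms=2, alpha=1.0, beta=0.5):
    """Expand a query from pseudo-relevant documents (positive-only Rocchio).

    query_weights: dict mapping original query terms to weights.
    feedback_docs: list of tokenized documents assumed to be relevant
                   (for example, the top-k results of an initial search).
    """
    if not feedback_docs:
        return dict(query_weights)

    # Centroid of the pseudo-relevant documents, using raw term frequencies.
    centroid = Counter()
    for doc in feedback_docs:
        for term, tf in Counter(doc).items():
            centroid[term] += tf / len(feedback_docs)

    # Term ranking step (assumed, simplified): order new terms by centroid
    # weight, breaking ties alphabetically, and keep the best n_terms.
    candidates = sorted(
        (t for t in centroid if t not in query_weights),
        key=lambda t: (-centroid[t], t),
    )[:n_terms]

    # Re-weighting step: alpha * original weight + beta * centroid weight.
    expanded = {}
    for term in list(query_weights) + candidates:
        expanded[term] = (
            alpha * query_weights.get(term, 0.0) + beta * centroid.get(term, 0.0)
        )
    return expanded


# Toy data, invented for illustration only.
query = {"heart": 1.0, "attack": 1.0}
top_docs = [
    ["heart", "attack", "myocardial", "infarction"],
    ["myocardial", "infarction", "aspirin", "heart"],
]
expanded = rocchio_expand(query, top_docs, n_terms=2)
print(expanded)
```

In this sketch the ranking step and the re-weighting step are kept separate, which mirrors the abstract's point that the two are chosen as a pair; swapping in a different ranking function would only change how `candidates` is built.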