Learning to rank diversified results for biomedical information retrieval from multiple features

BACKGROUND: Unlike traditional information retrieval (IR), diversity-promoting IR considers the relationships between documents in order to promote novelty and reduce redundancy, and thus provides diversified results that satisfy various user intents. Diversity IR in biomedical domain...

Bibliographic Details
Main Authors: Wu, Jiajin, Huang, Jimmy Xiangji, Ye, Zheng
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4304246/
https://www.ncbi.nlm.nih.gov/pubmed/25560088
http://dx.doi.org/10.1186/1475-925X-13-S2-S3
_version_ 1782354064395206656
author Wu, Jiajin
Huang, Jimmy Xiangji
Ye, Zheng
author_facet Wu, Jiajin
Huang, Jimmy Xiangji
Ye, Zheng
author_sort Wu, Jiajin
collection PubMed
description BACKGROUND: Unlike traditional information retrieval (IR), diversity-promoting IR considers the relationships between documents in order to promote novelty and reduce redundancy, and thus provides diversified results that satisfy various user intents. Diversity IR in the biomedical domain is especially important because biologists sometimes want diversified results pertinent to their query. METHODS: A combined learning-to-rank (LTR) framework is built from a general ranking model (gLTR) and a diversity-biased model. The former is learned from general ranking features by a conventional learning-to-rank approach; the latter adds diversity-indicating features, which are extracted from the topics of the retrieved passages (detected using Wikipedia) and from the ranking order produced by the general learning-to-rank model. The final ranking is given by a combination of both models. RESULTS: Compared with the baselines BM25 and DirKL on the 2006 and 2007 collections, gLTR achieves an aspect-level mean average precision (Aspect MAP) of 0.2292 (+16.23% and +44.1% over BM25 and DirKL, respectively) and 0.1873 (+15.78% and +39.0% over BM25 and DirKL, respectively). The LTR method outperforms gLTR on the 2006 and 2007 collections by 4.7% and 2.4% in Aspect MAP. CONCLUSIONS: The learning-to-rank method is an efficient approach to biomedical information retrieval, and the diversity-biased features are beneficial for promoting diversity in ranking results.
format Online
Article
Text
id pubmed-4304246
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4304246 2015-02-12 Learning to rank diversified results for biomedical information retrieval from multiple features Wu, Jiajin Huang, Jimmy Xiangji Ye, Zheng Biomed Eng Online Research BACKGROUND: Unlike traditional information retrieval (IR), diversity-promoting IR considers the relationships between documents in order to promote novelty and reduce redundancy, and thus provides diversified results that satisfy various user intents. Diversity IR in the biomedical domain is especially important because biologists sometimes want diversified results pertinent to their query. METHODS: A combined learning-to-rank (LTR) framework is built from a general ranking model (gLTR) and a diversity-biased model. The former is learned from general ranking features by a conventional learning-to-rank approach; the latter adds diversity-indicating features, which are extracted from the topics of the retrieved passages (detected using Wikipedia) and from the ranking order produced by the general learning-to-rank model. The final ranking is given by a combination of both models. RESULTS: Compared with the baselines BM25 and DirKL on the 2006 and 2007 collections, gLTR achieves an aspect-level mean average precision (Aspect MAP) of 0.2292 (+16.23% and +44.1% over BM25 and DirKL, respectively) and 0.1873 (+15.78% and +39.0% over BM25 and DirKL, respectively). The LTR method outperforms gLTR on the 2006 and 2007 collections by 4.7% and 2.4% in Aspect MAP. CONCLUSIONS: The learning-to-rank method is an efficient approach to biomedical information retrieval, and the diversity-biased features are beneficial for promoting diversity in ranking results. BioMed Central 2014-12-11 /pmc/articles/PMC4304246/ /pubmed/25560088 http://dx.doi.org/10.1186/1475-925X-13-S2-S3 Text en Copyright © 2014 Wu et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Wu, Jiajin
Huang, Jimmy Xiangji
Ye, Zheng
Learning to rank diversified results for biomedical information retrieval from multiple features
title Learning to rank diversified results for biomedical information retrieval from multiple features
title_full Learning to rank diversified results for biomedical information retrieval from multiple features
title_fullStr Learning to rank diversified results for biomedical information retrieval from multiple features
title_full_unstemmed Learning to rank diversified results for biomedical information retrieval from multiple features
title_short Learning to rank diversified results for biomedical information retrieval from multiple features
title_sort learning to rank diversified results for biomedical information retrieval from multiple features
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4304246/
https://www.ncbi.nlm.nih.gov/pubmed/25560088
http://dx.doi.org/10.1186/1475-925X-13-S2-S3
work_keys_str_mv AT wujiajin learningtorankdiversifiedresultsforbiomedicalinformationretrievalfrommultiplefeatures
AT huangjimmyxiangji learningtorankdiversifiedresultsforbiomedicalinformationretrievalfrommultiplefeatures
AT yezheng learningtorankdiversifiedresultsforbiomedicalinformationretrievalfrommultiplefeatures
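
The METHODS summary in this record says the final ranking comes from a combination of the general ranking model (gLTR) and the diversity-biased model, but it does not say how the two are combined. The Python sketch below shows one possible reading, assuming a simple linear interpolation of per-passage scores. The function name `combine_rankings`, the `weight` parameter, and all score values are hypothetical and are not taken from the article.

```python
from typing import Dict, List


def combine_rankings(
    general_scores: Dict[str, float],
    diversity_scores: Dict[str, float],
    weight: float = 0.5,
) -> List[str]:
    """Rank passages by a weighted sum of two model scores.

    ASSUMPTION: linear interpolation is an illustrative choice only; the
    record does not state how the two models are actually combined.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    # Passages missing from the diversity model get a diversity score of 0.
    combined = {
        pid: (1.0 - weight) * score + weight * diversity_scores.get(pid, 0.0)
        for pid, score in general_scores.items()
    }
    # Highest combined score first; ties are broken by passage id.
    return sorted(combined, key=lambda pid: (-combined[pid], pid))


# Hypothetical scores for three passages (made-up numbers, not from the paper).
general = {"p1": 0.9, "p2": 0.8, "p3": 0.4}
diversity = {"p1": 0.1, "p2": 0.6, "p3": 0.9}
ranking = combine_rankings(general, diversity, weight=0.5)
print(ranking)
```

With `weight=0.0` the ordering is that of the general model alone, and with `weight=1.0` it is that of the diversity-biased scores alone; intermediate values trade relevance against diversity.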