Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols
Main author: Campagne, Fabien
Format: Text
Language: English
Published: BioMed Central, 2008
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2292696/ https://www.ncbi.nlm.nih.gov/pubmed/18312673 http://dx.doi.org/10.1186/1471-2105-9-132
Field | Value
---|---
_version_ | 1782152511842418688
author | Campagne, Fabien |
collection | PubMed |
description | BACKGROUND: The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text Retrieval Evaluation Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications. RESULTS: We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents per query (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Track and observe significant correlations between performance rankings generated by our approach and TREC. Spearman's correlation coefficients in the range of 0.79–0.92 are observed comparing bpref measured with NT Evaluation or with TREC evaluations. For comparison, coefficients in the range 0.86–0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and the data fusion evaluation protocols introduced recently. CONCLUSION: Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute for human evaluations. |
format | Text |
id | pubmed-2292696 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2292696 2008-04-14 Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols Campagne, Fabien BMC Bioinformatics Methodology Article BioMed Central 2008-02-29 /pmc/articles/PMC2292696/ /pubmed/18312673 http://dx.doi.org/10.1186/1471-2105-9-132 Text en Copyright © 2008 Campagne; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols |
title_sort | objective and automated protocols for the evaluation of biomedical search engines using no title evaluation protocols |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2292696/ https://www.ncbi.nlm.nih.gov/pubmed/18312673 http://dx.doi.org/10.1186/1471-2105-9-132 |
work_keys_str_mv | AT campagnefabien objectiveandautomatedprotocolsfortheevaluationofbiomedicalsearchenginesusingnotitleevaluationprotocols |
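The abstract above scores each retrieval method with bpref and then compares the method rankings produced by NT Evaluation against those produced with TREC relevance judgments using Spearman's correlation coefficient. The sketch below illustrates both statistics. It is not the paper's code; all document identifiers and scores are invented for illustration, and the Spearman implementation assumes no tied scores.

```python
# Illustrative sketch (not the paper's code) of the two statistics the
# abstract relies on: bpref to score a retrieval run, and Spearman's rho
# to compare method rankings from two evaluation protocols.

def bpref(ranked_docs, relevant, judged_nonrelevant):
    """Buckley & Voorhees bpref:
    (1/R) * sum over relevant retrieved r of 1 - (nonrel above r) / min(R, N),
    where R = #relevant and N = #judged-nonrelevant documents."""
    R, N = len(relevant), len(judged_nonrelevant)
    if R == 0:
        return 0.0
    denom = min(R, N)
    score, nonrel_seen = 0.0, 0
    for doc in ranked_docs:
        if doc in judged_nonrelevant:
            nonrel_seen += 1
        elif doc in relevant:
            penalty = min(nonrel_seen, denom) / denom if denom else 0.0
            score += 1.0 - penalty
    return score / R

def spearman(xs, ys):
    """Spearman's rho between two score vectors (no ties assumed):
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# One run, four judged documents: bpref rewards relevant documents that
# appear above judged-nonrelevant ones.
print(bpref(["d1", "d2", "d3", "d4", "d5"], {"d1", "d4"}, {"d2", "d3"}))  # → 0.5

# Hypothetical bpref scores for five engine configurations under each
# protocol; rho near 1 means the two protocols rank the methods alike.
nt_scores = [0.42, 0.55, 0.31, 0.60, 0.48]
trec_scores = [0.40, 0.58, 0.35, 0.62, 0.59]
print(spearman(nt_scores, trec_scores))  # → 0.9
```

A correlation in this range mirrors the 0.79–0.92 coefficients the abstract reports, which is the basis for its claim that NT Evaluation can pre-screen search engine parameters before a human evaluation.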