Is searching full text more effective than searching abstracts?

BACKGROUND: With the growing availability of full-text articles online, scientists and other consumers of the life sciences literature now have the ability to go beyond searching bibliographic records (title, abstract, metadata) to directly access full-text content. Motivated by this emerging trend,...

Bibliographic Details
Main Author:	Lin, Jimmy
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2695361/ https://www.ncbi.nlm.nih.gov/pubmed/19192280 http://dx.doi.org/10.1186/1471-2105-10-46

_version_	1782168184875384832
author	Lin, Jimmy
author_facet	Lin, Jimmy
author_sort	Lin, Jimmy
collection	PubMed
description	BACKGROUND: With the growing availability of full-text articles online, scientists and other consumers of the life sciences literature now have the ability to go beyond searching bibliographic records (title, abstract, metadata) to directly access full-text content. Motivated by this emerging trend, I posed the following question: is searching full text more effective than searching abstracts? This question is answered by comparing text retrieval algorithms on MEDLINE® abstracts, full-text articles, and spans (paragraphs) within full-text articles using data from the TREC 2007 genomics track evaluation. Two retrieval models are examined: bm25 and the ranking algorithm implemented in the open-source Lucene search engine. RESULTS: Experiments show that treating an entire article as an indexing unit does not consistently yield higher effectiveness compared to abstract-only search. However, retrieval based on spans, or paragraph-sized segments of full-text articles, consistently outperforms abstract-only search. Results suggest that highest overall effectiveness may be achieved by combining evidence from spans and full articles. CONCLUSION: Users searching full text are more likely to find relevant articles than those searching only abstracts. This finding affirms the value of full-text collections for text retrieval and provides a starting point for future work in exploring algorithms that take advantage of rapidly growing digital archives. Experimental results also highlight the need to develop distributed text retrieval algorithms, since full-text articles are significantly longer than abstracts and may require the computational resources of multiple machines in a cluster. The MapReduce programming model provides a convenient framework for organizing such computations.
format	Text
id	pubmed-2695361
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2695361 2009-06-12 Is searching full text more effective than searching abstracts? Lin, Jimmy BMC Bioinformatics Research Article
BioMed Central 2009-02-03 /pmc/articles/PMC2695361/ /pubmed/19192280 http://dx.doi.org/10.1186/1471-2105-10-46 Text en Copyright © 2009 Lin; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Lin, Jimmy Is searching full text more effective than searching abstracts?
title	Is searching full text more effective than searching abstracts?
title_full	Is searching full text more effective than searching abstracts?
title_fullStr	Is searching full text more effective than searching abstracts?
title_full_unstemmed	Is searching full text more effective than searching abstracts?
title_short	Is searching full text more effective than searching abstracts?
title_sort	is searching full text more effective than searching abstracts?
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2695361/ https://www.ncbi.nlm.nih.gov/pubmed/19192280 http://dx.doi.org/10.1186/1471-2105-10-46
work_keys_str_mv	AT linjimmy issearchingfulltextmoreeffectivethansearchingabstracts

Similar Items