
Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles

BACKGROUND: Receiving extraneous articles in response to a query submitted to MEDLINE/PubMed is common. When submitting a multi-word query (which is the majority of queries submitted), the presence of all query words within each article may be a necessary condition for retrieving relevant articles,...


Bibliographic Details
Main authors: Siadaty, Mir S; Shu, Jianfen; Knaus, William A
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1780044/
https://www.ncbi.nlm.nih.gov/pubmed/17214888
http://dx.doi.org/10.1186/1472-6947-7-1
_version_ 1782131840527630336
author Siadaty, Mir S
Shu, Jianfen
Knaus, William A
author_facet Siadaty, Mir S
Shu, Jianfen
Knaus, William A
author_sort Siadaty, Mir S
collection PubMed
description BACKGROUND: Receiving extraneous articles in response to a query submitted to MEDLINE/PubMed is common. When submitting a multi-word query (which is the majority of queries submitted), the presence of all query words within each article may be a necessary condition for retrieving relevant articles, but it is not sufficient. Ideally, a relationship between the query words in the article is also required. We propose that if two words occur within an article, the probability that a relation between them is explained is higher when the words occur within adjacent sentences than when they occur in remote sentences. Therefore, sentence-level concurrence can be used as a surrogate for the existence of a relationship between the words. To avoid irrelevant articles, one solution would be to increase the search specificity. Another solution is to estimate a relevance score to sort the retrieved articles. However, among the >30 retrieval services available for MEDLINE, only a few estimate a relevance score, and none detects and incorporates the relation between the query words as part of the relevance score.
RESULTS: We have developed "Relemed", a search engine for MEDLINE. Relemed increases the specificity and precision of retrieval by searching for query words within sentences rather than within the whole article. It uses sentence-level concurrence as a statistical surrogate for the existence of a relationship between the words. It also estimates a relevance score and sorts the results on this basis, thus shifting irrelevant articles lower down the list. In two case studies, we demonstrate that the most relevant articles appear at the top of the Relemed results, while this is not necessarily the case with a PubMed search. We have also shown that a Relemed search includes not only all the articles retrieved by PubMed, but potentially additional relevant articles, due to the extended 'automatic term mapping' and text-word searching features implemented in Relemed.
CONCLUSION: By using sentence-level matching, Relemed can deliver higher specificity, thus eliminating more false-positive articles. By introducing an appropriate relevance metric, the most relevant articles on which the user wishes to focus are listed first. Relemed also shrinks the displayed text, and hence the time spent scanning the articles.
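The sentence-level idea in the abstract can be illustrated with a small sketch. It is not Relemed's actual algorithm or scoring formula, which this record does not give. The sentence splitter, the function names, and the score (1 divided by the smallest number of consecutive sentences that together contain every query word) are all assumptions chosen for illustration.

```python
import re

def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokens(sentence):
    # Lowercased alphanumeric word set for one sentence.
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))

def min_sentence_window(text, query_words):
    """Smallest run of consecutive sentences containing every query word, or None."""
    sents = [tokens(s) for s in split_sentences(text)]
    wanted = {w.lower() for w in query_words}
    best = None
    for start in range(len(sents)):
        seen = set()
        for end in range(start, len(sents)):
            seen |= sents[end] & wanted
            if seen == wanted:
                width = end - start + 1
                if best is None or width < best:
                    best = width
                break  # a longer window from this start cannot be smaller
    return best

def relevance(text, query_words):
    # Hypothetical score: 1.0 for same sentence, 0.5 for adjacent sentences, and so on.
    window = min_sentence_window(text, query_words)
    return 0.0 if window is None else 1.0 / window

def rank(articles, query_words):
    # Drop articles missing a query word, then sort by descending score.
    scored = [(relevance(text, query_words), pmid) for pmid, text in articles.items()]
    scored = [(s, p) for s, p in scored if s > 0]
    return [p for s, p in sorted(scored, key=lambda sp: -sp[0])]

# Made-up example texts, not real MEDLINE records.
articles = {
    "A": "Aspirin reduces fever. Stroke is a leading cause of death.",
    "B": "Aspirin lowers the risk of stroke in some patients.",
    "C": "Aspirin is old. It is cheap. It is common. Stroke is not.",
    "D": "Aspirin is an analgesic.",
}
ranked = rank(articles, ["aspirin", "stroke"])
print(ranked)  # ['B', 'A', 'C']
```

Article B ranks first because both query words share one sentence, A follows because they sit in adjacent sentences, C comes last because they are far apart, and D is dropped because it lacks one query word. A real system would need a proper sentence tokenizer and term mapping.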
format Text
id pubmed-1780044
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1780044 2007-01-23 Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles Siadaty, Mir S Shu, Jianfen Knaus, William A BMC Med Inform Decis Mak Software BioMed Central 2007-01-10 /pmc/articles/PMC1780044/ /pubmed/17214888 http://dx.doi.org/10.1186/1472-6947-7-1 Text en Copyright © 2007 Siadaty et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software
Siadaty, Mir S
Shu, Jianfen
Knaus, William A
Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles
title Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles
title_full Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles
title_fullStr Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles
title_full_unstemmed Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles
title_short Relemed: sentence-level search engine with relevance score for the MEDLINE database of biomedical articles
title_sort relemed: sentence-level search engine with relevance score for the medline database of biomedical articles
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1780044/
https://www.ncbi.nlm.nih.gov/pubmed/17214888
http://dx.doi.org/10.1186/1472-6947-7-1
work_keys_str_mv AT siadatymirs relemedsentencelevelsearchenginewithrelevancescoreforthemedlinedatabaseofbiomedicalarticles
AT shujianfen relemedsentencelevelsearchenginewithrelevancescoreforthemedlinedatabaseofbiomedicalarticles
AT knauswilliama relemedsentencelevelsearchenginewithrelevancescoreforthemedlinedatabaseofbiomedicalarticles