Cargando…

Using Noun Phrases for Navigating Biomedical Literature on Pubmed: How Many Updates Are We Losing Track of?

Author-supplied citations are a fraction of the related literature for a paper. The “related citations” on PubMed is typically dozens or hundreds of results long, and does not offer hints why these results are related. Using noun phrases derived from the sentences of the paper, we show it is possibl...

Descripción completa

Detalles Bibliográficos
Autores principales: Srikrishna, Devabhaktuni, Coram, Marc A.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2011
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3173492/
https://www.ncbi.nlm.nih.gov/pubmed/21935487
http://dx.doi.org/10.1371/journal.pone.0024920
_version_ 1782211971135832064
author Srikrishna, Devabhaktuni
Coram, Marc A.
author_facet Srikrishna, Devabhaktuni
Coram, Marc A.
author_sort Srikrishna, Devabhaktuni
collection PubMed
description Author-supplied citations are a fraction of the related literature for a paper. The “related citations” on PubMed is typically dozens or hundreds of results long, and does not offer hints why these results are related. Using noun phrases derived from the sentences of the paper, we show it is possible to more transparently navigate to PubMed updates through search terms that can associate a paper with its citations. The algorithm to generate these search terms involved automatically extracting noun phrases from the paper using natural language processing tools, and ranking them by the number of occurrences in the paper compared to the number of occurrences on the web. We define search queries having at least one instance of overlap between the author-supplied citations of the paper and the top 20 search results as citation validated (CV). When the overlapping citations were written by same authors as the paper itself, we define it as CV-S and different authors is defined as CV-D. For a systematic sample of 883 papers on PubMed Central, at least one of the search terms for 86% of the papers is CV-D versus 65% for the top 20 PubMed “related citations.” We hypothesize these quantities computed for the 20 million papers on PubMed to differ within 5% of these percentages. Averaged across all 883 papers, 5 search terms are CV-D, and 10 search terms are CV-S, and 6 unique citations validate these searches. Potentially related literature uncovered by citation-validated searches (either CV-S or CV-D) are on the order of ten per paper – many more if the remaining searches that are not citation-validated are taken into account. The significance and relationship of each search result to the paper can only be vetted and explained by a researcher with knowledge of or interest in that paper.
format Online
Article
Text
id pubmed-3173492
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-31734922011-09-20 Using Noun Phrases for Navigating Biomedical Literature on Pubmed: How Many Updates Are We Losing Track of? Srikrishna, Devabhaktuni Coram, Marc A. PLoS One Research Article Author-supplied citations are a fraction of the related literature for a paper. The “related citations” on PubMed is typically dozens or hundreds of results long, and does not offer hints why these results are related. Using noun phrases derived from the sentences of the paper, we show it is possible to more transparently navigate to PubMed updates through search terms that can associate a paper with its citations. The algorithm to generate these search terms involved automatically extracting noun phrases from the paper using natural language processing tools, and ranking them by the number of occurrences in the paper compared to the number of occurrences on the web. We define search queries having at least one instance of overlap between the author-supplied citations of the paper and the top 20 search results as citation validated (CV). When the overlapping citations were written by same authors as the paper itself, we define it as CV-S and different authors is defined as CV-D. For a systematic sample of 883 papers on PubMed Central, at least one of the search terms for 86% of the papers is CV-D versus 65% for the top 20 PubMed “related citations.” We hypothesize these quantities computed for the 20 million papers on PubMed to differ within 5% of these percentages. Averaged across all 883 papers, 5 search terms are CV-D, and 10 search terms are CV-S, and 6 unique citations validate these searches. Potentially related literature uncovered by citation-validated searches (either CV-S or CV-D) are on the order of ten per paper – many more if the remaining searches that are not citation-validated are taken into account. The significance and relationship of each search result to the paper can only be vetted and explained by a researcher with knowledge of or interest in that paper. Public Library of Science 2011-09-14 /pmc/articles/PMC3173492/ /pubmed/21935487 http://dx.doi.org/10.1371/journal.pone.0024920 Text en Srikrishna, Coram. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Srikrishna, Devabhaktuni
Coram, Marc A.
Using Noun Phrases for Navigating Biomedical Literature on Pubmed: How Many Updates Are We Losing Track of?
title Using Noun Phrases for Navigating Biomedical Literature on Pubmed: How Many Updates Are We Losing Track of?
title_full Using Noun Phrases for Navigating Biomedical Literature on Pubmed: How Many Updates Are We Losing Track of?
title_fullStr Using Noun Phrases for Navigating Biomedical Literature on Pubmed: How Many Updates Are We Losing Track of?
title_full_unstemmed Using Noun Phrases for Navigating Biomedical Literature on Pubmed: How Many Updates Are We Losing Track of?
title_short Using Noun Phrases for Navigating Biomedical Literature on Pubmed: How Many Updates Are We Losing Track of?
title_sort using noun phrases for navigating biomedical literature on pubmed: how many updates are we losing track of?
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3173492/
https://www.ncbi.nlm.nih.gov/pubmed/21935487
http://dx.doi.org/10.1371/journal.pone.0024920
work_keys_str_mv AT srikrishnadevabhaktuni usingnounphrasesfornavigatingbiomedicalliteratureonpubmedhowmanyupdatesarewelosingtrackof
AT corammarca usingnounphrasesfornavigatingbiomedicalliteratureonpubmedhowmanyupdatesarewelosingtrackof