Finding Related Publications: Extending the Set of Terms Used to Assess Article Similarity
Recommendation of related articles is an important feature of the PubMed. The PubMed Related Citations (PRC) algorithm is the engine that enables this feature, and it leverages information on 22 million citations. We analyzed the performance of the PRC algorithm on 4584 annotated articles from the 2...
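Main Authors: | Wei, Wei; Marmor, Rebecca; Singh, Siddharth; Wang, Shuang; Demner-Fushman, Dina; Kuo, Tsung-Ting; Hsu, Chun-Nan; Ohno-Machado, Lucila |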
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Medical Informatics Association, 2016 |
Subjects: | Articles |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001748/ https://www.ncbi.nlm.nih.gov/pubmed/27570676 |
_version_ | 1782450475979767808 |
---|---|
author | Wei, Wei; Marmor, Rebecca; Singh, Siddharth; Wang, Shuang; Demner-Fushman, Dina; Kuo, Tsung-Ting; Hsu, Chun-Nan; Ohno-Machado, Lucila |
author_sort | Wei, Wei |
collection | PubMed |
description | Recommendation of related articles is an important feature of the PubMed. The PubMed Related Citations (PRC) algorithm is the engine that enables this feature, and it leverages information on 22 million citations. We analyzed the performance of the PRC algorithm on 4584 annotated articles from the 2005 Text REtrieval Conference (TREC) Genomics Track data. Our analysis indicated that the PRC highest weighted term was not always consistent with the critical term that was most directly related to the topic of the article. We implemented term expansion and found that it was a promising and easy-to-implement approach to improve the performance of the PRC algorithm for the TREC 2005 Genomics data and for the TREC 2014 Clinical Decision Support Track data. For term expansion, we trained a Skip-gram model using the Word2Vec package. This extended PRC algorithm resulted in higher average precision for a large subset of articles. A combination of both algorithms may lead to improved performance in related article recommendations. |
format | Online Article Text |
id | pubmed-5001748 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | American Medical Informatics Association |
record_format | MEDLINE/PubMed |
spelling | pubmed-5001748; 2016-08-26; Finding Related Publications: Extending the Set of Terms Used to Assess Article Similarity; Wei, Wei; Marmor, Rebecca; Singh, Siddharth; Wang, Shuang; Demner-Fushman, Dina; Kuo, Tsung-Ting; Hsu, Chun-Nan; Ohno-Machado, Lucila; AMIA Jt Summits Transl Sci Proc; Articles; Recommendation of related articles is an important feature of the PubMed. The PubMed Related Citations (PRC) algorithm is the engine that enables this feature, and it leverages information on 22 million citations. We analyzed the performance of the PRC algorithm on 4584 annotated articles from the 2005 Text REtrieval Conference (TREC) Genomics Track data. Our analysis indicated that the PRC highest weighted term was not always consistent with the critical term that was most directly related to the topic of the article. We implemented term expansion and found that it was a promising and easy-to-implement approach to improve the performance of the PRC algorithm for the TREC 2005 Genomics data and for the TREC 2014 Clinical Decision Support Track data. For term expansion, we trained a Skip-gram model using the Word2Vec package. This extended PRC algorithm resulted in higher average precision for a large subset of articles. A combination of both algorithms may lead to improved performance in related article recommendations.; American Medical Informatics Association; 2016-07-20; /pmc/articles/PMC5001748/; /pubmed/27570676; Text; en; ©2016 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose |
title | Finding Related Publications: Extending the Set of Terms Used to Assess Article Similarity |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5001748/ https://www.ncbi.nlm.nih.gov/pubmed/27570676 |