Recommending MeSH terms for annotating biomedical articles
BACKGROUND: Due to the high cost of manual curation of key aspects from the scientific literature, automated methods for assisting this process are greatly desired. Here, we report a novel approach to facilitate MeSH indexing, a challenging task of assigning MeSH terms to MEDLINE citations for their...
Main authors: | Huang, Minlie; Névéol, Aurélie; Lu, Zhiyong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Group 2011 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168302/ https://www.ncbi.nlm.nih.gov/pubmed/21613640 http://dx.doi.org/10.1136/amiajnl-2010-000055 |
_version_ | 1782211369397911552 |
---|---|
author | Huang, Minlie Névéol, Aurélie Lu, Zhiyong |
author_facet | Huang, Minlie Névéol, Aurélie Lu, Zhiyong |
author_sort | Huang, Minlie |
collection | PubMed |
description | BACKGROUND: Due to the high cost of manual curation of key aspects from the scientific literature, automated methods for assisting this process are greatly desired. Here, we report a novel approach to facilitate MeSH indexing, a challenging task of assigning MeSH terms to MEDLINE citations for their archiving and retrieval. METHODS: Unlike previous methods for automatic MeSH term assignment, we reformulate the indexing task as a ranking problem such that relevant MeSH headings are ranked higher than irrelevant ones. Specifically, for each document we retrieve 20 neighbor documents, obtain a list of MeSH main headings from neighbors, and rank the MeSH main headings using ListNet, a learning-to-rank algorithm. We trained our algorithm on 200 documents and tested on a previously used benchmark set of 200 documents and a larger dataset of 1000 documents. RESULTS: Tested on the benchmark dataset, our method achieved a precision of 0.390, recall of 0.712, and mean average precision (MAP) of 0.626. In comparison to the state of the art, we observe statistically significant improvements as large as 39% in MAP (p-value <0.001). Similar significant improvements were also obtained on the larger document set. CONCLUSION: Experimental results show that our approach makes the most accurate MeSH predictions to date, which suggests its great potential in making a practical impact on MeSH indexing. Furthermore, as discussed, the proposed learning framework is robust and can be adapted to many other similar tasks beyond MeSH indexing in the biomedical domain. All data sets are available at: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/indexing. |
format | Online Article Text |
id | pubmed-3168302 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BMJ Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-3168302 2011-09-09 Recommending MeSH terms for annotating biomedical articles Huang, Minlie Névéol, Aurélie Lu, Zhiyong J Am Med Inform Assoc Research and Applications BACKGROUND: Due to the high cost of manual curation of key aspects from the scientific literature, automated methods for assisting this process are greatly desired. Here, we report a novel approach to facilitate MeSH indexing, a challenging task of assigning MeSH terms to MEDLINE citations for their archiving and retrieval. METHODS: Unlike previous methods for automatic MeSH term assignment, we reformulate the indexing task as a ranking problem such that relevant MeSH headings are ranked higher than irrelevant ones. Specifically, for each document we retrieve 20 neighbor documents, obtain a list of MeSH main headings from neighbors, and rank the MeSH main headings using ListNet, a learning-to-rank algorithm. We trained our algorithm on 200 documents and tested on a previously used benchmark set of 200 documents and a larger dataset of 1000 documents. RESULTS: Tested on the benchmark dataset, our method achieved a precision of 0.390, recall of 0.712, and mean average precision (MAP) of 0.626. In comparison to the state of the art, we observe statistically significant improvements as large as 39% in MAP (p-value <0.001). Similar significant improvements were also obtained on the larger document set. CONCLUSION: Experimental results show that our approach makes the most accurate MeSH predictions to date, which suggests its great potential in making a practical impact on MeSH indexing. Furthermore, as discussed, the proposed learning framework is robust and can be adapted to many other similar tasks beyond MeSH indexing in the biomedical domain. All data sets are available at: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/indexing. BMJ Group 2011-05-25 2011 /pmc/articles/PMC3168302/ /pubmed/21613640 http://dx.doi.org/10.1136/amiajnl-2010-000055 Text en © 2011, Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions. This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode. |
spellingShingle | Research and Applications Huang, Minlie Névéol, Aurélie Lu, Zhiyong Recommending MeSH terms for annotating biomedical articles |
title | Recommending MeSH terms for annotating biomedical articles |
title_full | Recommending MeSH terms for annotating biomedical articles |
title_fullStr | Recommending MeSH terms for annotating biomedical articles |
title_full_unstemmed | Recommending MeSH terms for annotating biomedical articles |
title_short | Recommending MeSH terms for annotating biomedical articles |
title_sort | recommending mesh terms for annotating biomedical articles |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168302/ https://www.ncbi.nlm.nih.gov/pubmed/21613640 http://dx.doi.org/10.1136/amiajnl-2010-000055 |
work_keys_str_mv | AT huangminlie recommendingmeshtermsforannotatingbiomedicalarticles AT neveolaurelie recommendingmeshtermsforannotatingbiomedicalarticles AT luzhiyong recommendingmeshtermsforannotatingbiomedicalarticles |
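
The abstract in this record describes a pipeline that gathers MeSH main headings from a document's neighbor documents, ranks them, and evaluates the ranking with precision, recall, and mean average precision (MAP). The Python sketch below illustrates that shape only and is not the authors' implementation. It replaces ListNet with a plain similarity-weighted vote, and every function name, similarity score, and heading list in it is a hypothetical example made up for illustration.

```python
from collections import defaultdict


def collect_candidates(neighbors):
    """Sum neighbor similarity for each MeSH heading found among the neighbors.

    `neighbors` is a list of (similarity, headings) pairs. This layout is an
    assumption for the sketch, not a format taken from the article.
    """
    scores = defaultdict(float)
    for similarity, headings in neighbors:
        for heading in set(headings):  # count each heading once per neighbor
            scores[heading] += similarity
    return dict(scores)


def rank_headings(scores):
    """Order headings by descending score, breaking ties alphabetically.

    A stand-in for the learned ListNet ranker named in the abstract.
    """
    return sorted(scores, key=lambda h: (-scores[h], h))


def precision_recall(ranked, relevant, k):
    """Precision and recall of the top-k ranked headings against a gold set."""
    top = ranked[:k]
    hits = sum(1 for heading in top if heading in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Average of precision values taken at each rank holding a relevant heading."""
    hits = 0
    total = 0.0
    for rank, heading in enumerate(ranked, start=1):
        if heading in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Toy input: three neighbors instead of the 20 used in the article.
neighbors = [
    (0.9, ["Humans", "MEDLINE", "Abstracting and Indexing"]),
    (0.7, ["Humans", "Medical Subject Headings"]),
    (0.4, ["Algorithms", "MEDLINE"]),
]
gold = {"Humans", "Medical Subject Headings", "Algorithms"}

ranked = rank_headings(collect_candidates(neighbors))
precision, recall = precision_recall(ranked, gold, k=3)
ap = average_precision(ranked, gold)
print(ranked)
print(precision, recall, ap)
```

MAP is the mean of `average_precision` over all test documents. A learned ranker such as ListNet would replace `rank_headings` with a model that scores each candidate from several features rather than from a single vote total.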