Recommending MeSH terms for annotating biomedical articles

BACKGROUND: Due to the high cost of manual curation of key aspects from the scientific literature, automated methods for assisting this process are greatly desired. Here, we report a novel approach to facilitate MeSH indexing, a challenging task of assigning MeSH terms to MEDLINE citations for their...

Bibliographic Details
Main Authors: Huang, Minlie, Névéol, Aurélie, Lu, Zhiyong
Format: Online Article Text
Language: English
Published: BMJ Group 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168302/
https://www.ncbi.nlm.nih.gov/pubmed/21613640
http://dx.doi.org/10.1136/amiajnl-2010-000055
author Huang, Minlie
Névéol, Aurélie
Lu, Zhiyong
collection PubMed
description BACKGROUND: Due to the high cost of manual curation of key aspects from the scientific literature, automated methods for assisting this process are greatly desired. Here, we report a novel approach to facilitate MeSH indexing, a challenging task of assigning MeSH terms to MEDLINE citations for their archiving and retrieval. METHODS: Unlike previous methods for automatic MeSH term assignment, we reformulate the indexing task as a ranking problem such that relevant MeSH headings are ranked higher than irrelevant ones. Specifically, for each document we retrieve 20 neighbor documents, obtain a list of MeSH main headings from the neighbors, and rank the MeSH main headings using ListNet, a learning-to-rank algorithm. We trained our algorithm on 200 documents and tested it on a previously used benchmark set of 200 documents and a larger dataset of 1000 documents. RESULTS: Tested on the benchmark dataset, our method achieved a precision of 0.390, recall of 0.712, and mean average precision (MAP) of 0.626. In comparison to the state of the art, we observe statistically significant improvements as large as 39% in MAP (p-value <0.001). Similar significant improvements were also obtained on the larger document set. CONCLUSION: Experimental results show that our approach makes the most accurate MeSH predictions to date, which suggests its great potential in making a practical impact on MeSH indexing. Furthermore, as discussed, the proposed learning framework is robust and can be adapted to many other similar tasks beyond MeSH indexing in the biomedical domain. All data sets are available at: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/indexing.
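The pipeline outlined in the abstract (pool MeSH main headings from neighbor documents, rank them, then score the ranking) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy neighbor lists, the function names, and the use of a simple neighbor vote count as the ranking score are all assumptions made for illustration. The paper ranks candidates with a trained ListNet model; only the general form of ListNet's top-one cross-entropy loss is shown here, without any training loop.

```python
import math
from collections import Counter


def candidate_headings(neighbor_headings):
    """Pool headings from neighbor documents; count how many neighbors carry each."""
    counts = Counter()
    for headings in neighbor_headings:
        counts.update(set(headings))
    return counts


def rank_by_score(scores):
    """Sort headings by descending score, breaking ties alphabetically."""
    return sorted(scores, key=lambda h: (-scores[h], h))


def precision_recall(ranked, gold, k):
    """Precision and recall of the top-k ranked headings against the gold set."""
    top = ranked[:k]
    hits = sum(1 for h in top if h in gold)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


def average_precision(ranked, gold):
    """Average of precision values at each rank where a gold heading appears."""
    hits, total = 0, 0.0
    for rank, heading in enumerate(ranked, start=1):
        if heading in gold:
            hits += 1
            total += hits / rank
    return total / len(gold) if gold else 0.0


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    norm = sum(exps)
    return [e / norm for e in exps]


def listnet_top_one_loss(scores, labels):
    """Cross entropy between top-one probabilities of labels and of model scores."""
    p_true = softmax(labels)
    p_pred = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))


# Hypothetical toy data: three neighbors instead of the 20 used in the paper.
neighbors = [
    ["Humans", "Neoplasms"],
    ["Humans", "Mice"],
    ["Humans", "Neoplasms", "Algorithms"],
]
gold = {"Humans", "Algorithms"}

votes = candidate_headings(neighbors)   # stand-in score, NOT a ListNet model
ranked = rank_by_score(votes)
ap = average_precision(ranked, gold)
p_at_2, r_at_2 = precision_recall(ranked, gold, 2)
```

Averaging `average_precision` over all test documents gives the mean average precision (MAP) figure the abstract reports. In a learned setting, the vote count would be replaced by a model score fitted by minimizing `listnet_top_one_loss` over training documents.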
format Online
Article
Text
id pubmed-3168302
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BMJ Group
record_format MEDLINE/PubMed
spelling pubmed-3168302 2011-09-09 Recommending MeSH terms for annotating biomedical articles Huang, Minlie Névéol, Aurélie Lu, Zhiyong J Am Med Inform Assoc Research and Applications
BMJ Group 2011-05-25 2011 /pmc/articles/PMC3168302/ /pubmed/21613640 http://dx.doi.org/10.1136/amiajnl-2010-000055 Text en © 2011, Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions. This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.
title Recommending MeSH terms for annotating biomedical articles
topic Research and Applications