DeepMeSH: deep semantic representation for improving large-scale MeSH indexing

Motivation: Medical Subject Headings (MeSH) indexing, which is to assign a set of MeSH main headings to citations, is crucial for many important tasks in biomedical text mining and information retrieval. Large-scale MeSH indexing has two challenging aspects: the citation side and MeSH side. For the...

Bibliographic Details
Main authors: Peng, Shengwen, You, Ronghui, Wang, Hongning, Zhai, Chengxiang, Mamitsuka, Hiroshi, Zhu, Shanfeng
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4908368/
https://www.ncbi.nlm.nih.gov/pubmed/27307646
http://dx.doi.org/10.1093/bioinformatics/btw294
author Peng, Shengwen
You, Ronghui
Wang, Hongning
Zhai, Chengxiang
Mamitsuka, Hiroshi
Zhu, Shanfeng
collection PubMed
description Motivation: Medical Subject Headings (MeSH) indexing, which is to assign a set of MeSH main headings to citations, is crucial for many important tasks in biomedical text mining and information retrieval. Large-scale MeSH indexing has two challenging aspects: the citation side and MeSH side. For the citation side, all existing methods, including Medical Text Indexer (MTI) by National Library of Medicine and the state-of-the-art method, MeSHLabeler, deal with text by bag-of-words, which cannot capture semantic and context-dependent information well. Methods: We propose DeepMeSH that incorporates deep semantic information for large-scale MeSH indexing. It addresses the two challenges in both citation and MeSH sides. The citation side challenge is solved by a new deep semantic representation, D2V-TFIDF, which concatenates both sparse and dense semantic representations. The MeSH side challenge is solved by using the ‘learning to rank’ framework of MeSHLabeler, which integrates various types of evidence generated from the new semantic representation. Results: DeepMeSH achieved a Micro F-measure of 0.6323, 2% higher than 0.6218 of MeSHLabeler and 12% higher than 0.5637 of MTI, for BioASQ3 challenge data with 6000 citations. Availability and Implementation: The software is available upon request. Contact: zhusf@fudan.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
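The abstract describes D2V-TFIDF as a concatenation of a sparse and a dense representation, and reports results as a Micro F-measure. The following is a minimal sketch of both ideas, not the authors' implementation: the function names, the toy vectors, and the choice to L2-normalize each part before concatenating are assumptions made for illustration, since the abstract does not specify those details.

```python
import math
from typing import Dict, List, Sequence, Set


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def concat_representation(
    tfidf: Dict[int, float], dense: Sequence[float], vocab_size: int
) -> List[float]:
    """Concatenate a sparse TF-IDF vector (term index -> weight) with a dense embedding.

    Normalizing each part first is an assumption of this sketch, not a detail
    taken from the abstract.
    """
    sparse = [0.0] * vocab_size
    for idx, weight in tfidf.items():
        sparse[idx] = weight
    return l2_normalize(sparse) + l2_normalize(dense)


def micro_f_measure(gold: List[Set[str]], predicted: List[Set[str]]) -> float:
    """Micro-averaged F-measure: pool label counts over all citations first."""
    tp = sum(len(g & p) for g, p in zip(gold, predicted))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy citation: two non-zero TF-IDF terms plus a 2-dimensional dense vector.
    vec = concat_representation({0: 3.0, 2: 4.0}, [1.0, 0.0], vocab_size=4)
    print(len(vec))  # vocabulary size plus dense dimension

    # Toy gold and predicted MeSH heading sets for two citations.
    gold = [{"Humans", "Algorithms"}, {"Medical Subject Headings"}]
    pred = [{"Humans", "Semantics"}, {"Medical Subject Headings"}]
    print(round(micro_f_measure(gold, pred), 4))
```

The percentages in the Results sentence read as relative improvements: (0.6323 - 0.6218) / 0.6218 is about 1.7%, and (0.6323 - 0.5637) / 0.5637 is about 12.2%.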
format Online
Article
Text
id pubmed-4908368
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4908368 2016-06-17 DeepMeSH: deep semantic representation for improving large-scale MeSH indexing Peng, Shengwen You, Ronghui Wang, Hongning Zhai, Chengxiang Mamitsuka, Hiroshi Zhu, Shanfeng Bioinformatics Ismb 2016 Proceedings July 8 to July 12, 2016, Orlando, Florida Oxford University Press 2016-06-15 2016-06-11 /pmc/articles/PMC4908368/ /pubmed/27307646 http://dx.doi.org/10.1093/bioinformatics/btw294 Text en © The Author 2016. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title DeepMeSH: deep semantic representation for improving large-scale MeSH indexing
topic Ismb 2016 Proceedings July 8 to July 12, 2016, Orlando, Florida
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4908368/
https://www.ncbi.nlm.nih.gov/pubmed/27307646
http://dx.doi.org/10.1093/bioinformatics/btw294