MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence

Motivation: Medical Subject Headings (MeSHs) are used by National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has de...

Bibliographic Details
Main Authors:	Liu, Ke, Peng, Shengwen, Wu, Junqiu, Zhai, Chengxiang, Mamitsuka, Hiroshi, Zhu, Shanfeng
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2015
Subjects:	Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765864/ https://www.ncbi.nlm.nih.gov/pubmed/26072501 http://dx.doi.org/10.1093/bioinformatics/btv237

_version_	1782417585275404288
author	Liu, Ke Peng, Shengwen Wu, Junqiu Zhai, Chengxiang Mamitsuka, Hiroshi Zhu, Shanfeng
author_facet	Liu, Ke Peng, Shengwen Wu, Junqiu Zhai, Chengxiang Mamitsuka, Hiroshi Zhu, Shanfeng
author_sort	Liu, Ke
collection	PubMed
description	Motivation: Medical Subject Headings (MeSHs) are used by National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple evidence for MeSH annotation. Methods: We propose a novel framework, MeSHLabeler, to integrate multiple evidence for accurate MeSH annotation by using ‘learning to rank’. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. Results: MeSHLabeler won the first place in Task 2A of 2014 BioASQ challenge, achieving the Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than 0.5724, obtained by MTI. Availability and implementation: The software is available upon request. Contact: zhusf@fudan.edu.cn
format	Online Article Text
id	pubmed-4765864
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-47658642016-03-04 MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence Liu, Ke Peng, Shengwen Wu, Junqiu Zhai, Chengxiang Mamitsuka, Hiroshi Zhu, Shanfeng Bioinformatics Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland Motivation: Medical Subject Headings (MeSHs) are used by National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple evidence for MeSH annotation. Methods: We propose a novel framework, MeSHLabeler, to integrate multiple evidence for accurate MeSH annotation by using ‘learning to rank’. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. Results: MeSHLabeler won the first place in Task 2A of 2014 BioASQ challenge, achieving the Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than 0.5724, obtained by MTI. Availability and implementation: The software is available upon request. 
Contact: zhusf@fudan.edu.cn Oxford University Press 2015-06-15 2015-06-10 /pmc/articles/PMC4765864/ /pubmed/26072501 http://dx.doi.org/10.1093/bioinformatics/btv237 Text en © The Author 2015. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle	Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland Liu, Ke Peng, Shengwen Wu, Junqiu Zhai, Chengxiang Mamitsuka, Hiroshi Zhu, Shanfeng MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence
title	MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence
title_full	MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence
title_fullStr	MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence
title_full_unstemmed	MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence
title_short	MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence
title_sort	meshlabeler: improving the accuracy of large-scale mesh indexing by integrating diverse evidence
topic	Ismb/Eccb 2015 Proceedings Papers Committee July 10 to July 14, 2015, Dublin, Ireland
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765864/ https://www.ncbi.nlm.nih.gov/pubmed/26072501 http://dx.doi.org/10.1093/bioinformatics/btv237
work_keys_str_mv	AT liuke meshlabelerimprovingtheaccuracyoflargescalemeshindexingbyintegratingdiverseevidence AT pengshengwen meshlabelerimprovingtheaccuracyoflargescalemeshindexingbyintegratingdiverseevidence AT wujunqiu meshlabelerimprovingtheaccuracyoflargescalemeshindexingbyintegratingdiverseevidence AT zhaichengxiang meshlabelerimprovingtheaccuracyoflargescalemeshindexingbyintegratingdiverseevidence AT mamitsukahiroshi meshlabelerimprovingtheaccuracyoflargescalemeshindexingbyintegratingdiverseevidence AT zhushanfeng meshlabelerimprovingtheaccuracyoflargescalemeshindexingbyintegratingdiverseevidence

Similar Items