
Recognition of medication information from discharge summaries using ensembles of classifiers

BACKGROUND: Extraction of clinical information such as medications or problems from clinical text is an important task of clinical natural language processing (NLP). Rule-based methods are often used in clinical NLP systems because they are easy to adapt and customize. Recently, supervised machine l...


Bibliographic Details
Main authors:	Doan, Son, Collier, Nigel, Xu, Hua, Duy, Pham Hoang, Phuong, Tu Minh
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3502425/ https://www.ncbi.nlm.nih.gov/pubmed/22564405 http://dx.doi.org/10.1186/1472-6947-12-36

_version_	1782250335733022720
author	Doan, Son Collier, Nigel Xu, Hua Duy, Pham Hoang Phuong, Tu Minh
author_facet	Doan, Son Collier, Nigel Xu, Hua Duy, Pham Hoang Phuong, Tu Minh
author_sort	Doan, Son
collection	PubMed
description	BACKGROUND: Extraction of clinical information such as medications or problems from clinical text is an important task of clinical natural language processing (NLP). Rule-based methods are often used in clinical NLP systems because they are easy to adapt and customize. Recently, supervised machine learning methods have proven to be effective in clinical NLP as well. However, combining different classifiers to further improve the performance of clinical entity recognition systems has not been investigated extensively. Combining classifiers into an ensemble classifier presents both challenges and opportunities to improve performance in such NLP tasks. METHODS: We investigated ensemble classifiers that used different voting strategies to combine outputs from three individual classifiers: a rule-based system, a support vector machine (SVM) based system, and a conditional random field (CRF) based system. Three voting methods were proposed and evaluated using the annotated data sets from the 2009 i2b2 NLP challenge: simple majority, local SVM-based voting, and local CRF-based voting. RESULTS: Evaluation on 268 manually annotated discharge summaries from the i2b2 challenge showed that the local CRF-based voting method achieved the best F-score of 90.84% (94.11% Precision, 87.81% Recall) for 10-fold cross-validation. We then compared our systems with the first-ranked system in the challenge by using the same training and test sets. Our system based on majority voting achieved a better F-score of 89.65% (93.91% Precision, 85.76% Recall) than the previously reported F-score of 89.19% (93.78% Precision, 85.03% Recall) by the first-ranked system in the challenge. CONCLUSIONS: Our experimental results using the 2009 i2b2 challenge datasets showed that ensemble classifiers that combine individual classifiers into a voting system could achieve better performance than a single classifier in recognizing medication information from clinical text. 
It suggests that simple strategies that can be easily implemented such as majority voting could have the potential to significantly improve clinical entity recognition.
format	Online Article Text
id	pubmed-3502425
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-35024252012-11-27 Recognition of medication information from discharge summaries using ensembles of classifiers Doan, Son Collier, Nigel Xu, Hua Duy, Pham Hoang Phuong, Tu Minh BMC Med Inform Decis Mak Research Article BACKGROUND: Extraction of clinical information such as medications or problems from clinical text is an important task of clinical natural language processing (NLP). Rule-based methods are often used in clinical NLP systems because they are easy to adapt and customize. Recently, supervised machine learning methods have proven to be effective in clinical NLP as well. However, combining different classifiers to further improve the performance of clinical entity recognition systems has not been investigated extensively. Combining classifiers into an ensemble classifier presents both challenges and opportunities to improve performance in such NLP tasks. METHODS: We investigated ensemble classifiers that used different voting strategies to combine outputs from three individual classifiers: a rule-based system, a support vector machine (SVM) based system, and a conditional random field (CRF) based system. Three voting methods were proposed and evaluated using the annotated data sets from the 2009 i2b2 NLP challenge: simple majority, local SVM-based voting, and local CRF-based voting. RESULTS: Evaluation on 268 manually annotated discharge summaries from the i2b2 challenge showed that the local CRF-based voting method achieved the best F-score of 90.84% (94.11% Precision, 87.81% Recall) for 10-fold cross-validation. We then compared our systems with the first-ranked system in the challenge by using the same training and test sets. Our system based on majority voting achieved a better F-score of 89.65% (93.91% Precision, 85.76% Recall) than the previously reported F-score of 89.19% (93.78% Precision, 85.03% Recall) by the first-ranked system in the challenge. 
CONCLUSIONS: Our experimental results using the 2009 i2b2 challenge datasets showed that ensemble classifiers that combine individual classifiers into a voting system could achieve better performance than a single classifier in recognizing medication information from clinical text. It suggests that simple strategies that can be easily implemented such as majority voting could have the potential to significantly improve clinical entity recognition. BioMed Central 2012-05-07 /pmc/articles/PMC3502425/ /pubmed/22564405 http://dx.doi.org/10.1186/1472-6947-12-36 Text en Copyright ©2012 Doan et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Doan, Son Collier, Nigel Xu, Hua Duy, Pham Hoang Phuong, Tu Minh Recognition of medication information from discharge summaries using ensembles of classifiers
title	Recognition of medication information from discharge summaries using ensembles of classifiers
title_full	Recognition of medication information from discharge summaries using ensembles of classifiers
title_fullStr	Recognition of medication information from discharge summaries using ensembles of classifiers
title_full_unstemmed	Recognition of medication information from discharge summaries using ensembles of classifiers
title_short	Recognition of medication information from discharge summaries using ensembles of classifiers
title_sort	recognition of medication information from discharge summaries using ensembles of classifiers
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3502425/ https://www.ncbi.nlm.nih.gov/pubmed/22564405 http://dx.doi.org/10.1186/1472-6947-12-36
work_keys_str_mv	AT doanson recognitionofmedicationinformationfromdischargesummariesusingensemblesofclassifiers AT colliernigel recognitionofmedicationinformationfromdischargesummariesusingensemblesofclassifiers AT xuhua recognitionofmedicationinformationfromdischargesummariesusingensemblesofclassifiers AT duyphamhoang recognitionofmedicationinformationfromdischargesummariesusingensemblesofclassifiers AT phuongtuminh recognitionofmedicationinformationfromdischargesummariesusingensemblesofclassifiers
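
The simple majority voting strategy named in the abstract, and the relation between the reported precision, recall, and F-score figures, can be illustrated with a minimal sketch. This is not the authors' implementation: the label names (`MED`, `DOSE`, `FREQ`, `O`), the token-level alignment of the three classifiers, the function names, and the tie-break rule (fall back to one designated classifier) are all assumptions made for illustration. The record does not say how the paper resolves ties or at what granularity it votes.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], fallback: int = 0) -> List[str]:
    """Combine per-token labels from several classifiers by simple majority.

    predictions[k][i] is the label classifier k assigns to token i.
    When no label wins a strict majority, the label from classifier
    `fallback` is used (an assumed tie-break, not taken from the paper).
    """
    if not predictions:
        return []
    n_tokens = len(predictions[0])
    if any(len(p) != n_tokens for p in predictions):
        raise ValueError("all classifiers must label the same tokens")
    combined = []
    for i in range(n_tokens):
        votes = Counter(p[i] for p in predictions)
        label, count = votes.most_common(1)[0]
        if count * 2 > len(predictions):
            combined.append(label)  # strict majority found
        else:
            combined.append(predictions[fallback][i])  # assumed tie-break
    return combined


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for the four tokens "aspirin 81 mg daily".
rule_based = ["MED", "DOSE", "DOSE", "O"]
svm_based = ["MED", "DOSE", "O", "FREQ"]
crf_based = ["MED", "O", "DOSE", "FREQ"]

combined = majority_vote([rule_based, svm_based, crf_based])
print(combined)

# Precision and recall reported in the abstract for majority voting.
print(round(f_score(93.91, 85.76), 2))
```

With three voters, a strict majority means at least two classifiers agree, so the fallback only applies when all three disagree. Feeding the abstract's precision and recall for the majority-voting system (93.91%, 85.76%) into `f_score` gives 89.65, matching the reported F-score.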
