
Parsing clinical text: how good are the state-of-the-art parsers?

BACKGROUND: Parsing, which generates the syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain, including medicine. Although parsers developed for general English, such as the Stanford parser, have been applied to...


Bibliographic Details
Main Authors: Jiang, Min, Huang, Yang, Fan, Jung-wei, Tang, Buzhou, Denny, Josh, Xu, Hua
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4460747/
https://www.ncbi.nlm.nih.gov/pubmed/26045009
http://dx.doi.org/10.1186/1472-6947-15-S1-S2
_version_ 1782375427807903744
author Jiang, Min
Huang, Yang
Fan, Jung-wei
Tang, Buzhou
Denny, Josh
Xu, Hua
author_facet Jiang, Min
Huang, Yang
Fan, Jung-wei
Tang, Buzhou
Denny, Josh
Xu, Hua
author_sort Jiang, Min
collection PubMed
description BACKGROUND: Parsing, which generates the syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain, including medicine. Although parsers developed for general English, such as the Stanford parser, have been applied to clinical text, there have been no formal evaluations or comparisons of their performance in the medical domain. METHODS: In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank of 1,100 sentences randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank-based guideline; and (2) the MiPACQ Treebank, which contains 13,091 sentences and was developed from pathology notes and clinical notes. We conducted three experiments on both datasets. First, we measured the performance of the three parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers on the clinical Treebanks and evaluated their performance using 10-fold cross-validation. Finally, we re-trained the parsers on the clinical Treebanks combined with the Penn Treebank. RESULTS: The original parsers achieved lower performance on clinical text (Bracketing F-measure between 66.6% and 70.3%) than on general English text. After re-training on the clinical Treebanks, all parsers improved; the Stanford parser performed best, reaching a Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus under 10-fold cross-validation. When the clinical Treebanks were combined with the Penn Treebank, the Charniak parser achieved the highest Bracketing F-measure on progress notes (73.53%) and the Stanford parser the highest on the MiPACQ corpus (84.15%). CONCLUSIONS: Our study demonstrates that re-training on clinical Treebanks is critical for improving general English parsers' performance on clinical text, and that combining clinical and open-domain corpora may yield optimal performance for parsing clinical text.
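To make the parsing task concrete, here is a minimal sketch of constituency-parsing a clinical-style sentence with the Stanford parser via NLTK's CoreNLP wrapper. The server URL, port, and example sentence are illustrative assumptions, not details from the paper (the authors ran the parsers directly on their Treebanks), and a Stanford CoreNLP server must already be running for this to work.

    # Minimal sketch: constituency-parse one sentence with the Stanford parser.
    # Assumes a CoreNLP server is running at http://localhost:9000 (an
    # assumption, not a detail from the paper).
    from nltk.parse.corenlp import CoreNLPParser

    parser = CoreNLPParser(url='http://localhost:9000')

    # A made-up sentence in the register of a progress note.
    sentence = 'The patient denies chest pain and shortness of breath.'

    # raw_parse returns an iterator over nltk.tree.Tree parses; take the best.
    tree = next(parser.raw_parse(sentence))
    tree.pretty_print()  # prints the bracketed structure (S, NP, VP, ...)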
format Online
Article
Text
id pubmed-4460747
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4460747 2015-06-29 Parsing clinical text: how good are the state-of-the-art parsers? Jiang, Min Huang, Yang Fan, Jung-wei Tang, Buzhou Denny, Josh Xu, Hua BMC Med Inform Decis Mak Research Article BACKGROUND: Parsing, which generates the syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain, including medicine. Although parsers developed for general English, such as the Stanford parser, have been applied to clinical text, there have been no formal evaluations or comparisons of their performance in the medical domain. METHODS: In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank of 1,100 sentences randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank-based guideline; and (2) the MiPACQ Treebank, which contains 13,091 sentences and was developed from pathology notes and clinical notes. We conducted three experiments on both datasets. First, we measured the performance of the three parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers on the clinical Treebanks and evaluated their performance using 10-fold cross-validation. Finally, we re-trained the parsers on the clinical Treebanks combined with the Penn Treebank. RESULTS: The original parsers achieved lower performance on clinical text (Bracketing F-measure between 66.6% and 70.3%) than on general English text. After re-training on the clinical Treebanks, all parsers improved; the Stanford parser performed best, reaching a Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus under 10-fold cross-validation. When the clinical Treebanks were combined with the Penn Treebank, the Charniak parser achieved the highest Bracketing F-measure on progress notes (73.53%) and the Stanford parser the highest on the MiPACQ corpus (84.15%). CONCLUSIONS: Our study demonstrates that re-training on clinical Treebanks is critical for improving general English parsers' performance on clinical text, and that combining clinical and open-domain corpora may yield optimal performance for parsing clinical text. BioMed Central 2015-05-20 /pmc/articles/PMC4460747/ /pubmed/26045009 http://dx.doi.org/10.1186/1472-6947-15-S1-S2 Text en Copyright © 2015 Jiang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
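The Bracketing F-measure cited throughout this record is the standard PARSEVAL score: the harmonic mean of precision and recall over the constituents (phrase label plus span) shared between a predicted tree and the gold-standard tree. A minimal illustrative sketch follows; the authors would have used the standard evalb tool, so the bracket representation and function name here are assumptions.

    # Minimal sketch of the PARSEVAL Bracketing F-measure. Constituents are
    # (label, start, end) tuples; an illustration, not the evalb tool itself.
    from collections import Counter
    from typing import Iterable, Tuple

    Bracket = Tuple[str, int, int]  # (phrase label, start index, end index)

    def bracketing_f1(gold: Iterable[Bracket], pred: Iterable[Bracket]) -> float:
        gold_counts, pred_counts = Counter(gold), Counter(pred)
        # Matched brackets: multiset intersection of gold and predicted sets.
        matched = sum((gold_counts & pred_counts).values())
        precision = matched / max(sum(pred_counts.values()), 1)
        recall = matched / max(sum(gold_counts.values()), 1)
        if precision + recall == 0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    # One of two predicted brackets matches the gold tree: P = R = 0.5, F1 = 0.5.
    print(bracketing_f1(gold=[('NP', 0, 2), ('VP', 2, 5)],
                        pred=[('NP', 0, 2), ('VP', 3, 5)]))  # 0.5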
spellingShingle Research Article
Jiang, Min
Huang, Yang
Fan, Jung-wei
Tang, Buzhou
Denny, Josh
Xu, Hua
Parsing clinical text: how good are the state-of-the-art parsers?
title Parsing clinical text: how good are the state-of-the-art parsers?
title_full Parsing clinical text: how good are the state-of-the-art parsers?
title_fullStr Parsing clinical text: how good are the state-of-the-art parsers?
title_full_unstemmed Parsing clinical text: how good are the state-of-the-art parsers?
title_short Parsing clinical text: how good are the state-of-the-art parsers?
title_sort parsing clinical text: how good are the state-of-the-art parsers?
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4460747/
https://www.ncbi.nlm.nih.gov/pubmed/26045009
http://dx.doi.org/10.1186/1472-6947-15-S1-S2
work_keys_str_mv AT jiangmin parsingclinicaltexthowgoodarethestateoftheartparsers
AT huangyang parsingclinicaltexthowgoodarethestateoftheartparsers
AT fanjungwei parsingclinicaltexthowgoodarethestateoftheartparsers
AT tangbuzhou parsingclinicaltexthowgoodarethestateoftheartparsers
AT dennyjosh parsingclinicaltexthowgoodarethestateoftheartparsers
AT xuhua parsingclinicaltexthowgoodarethestateoftheartparsers