
Recognition and Evaluation of Clinical Section Headings in Clinical Documents Using Token-Based Formulation with Conditional Random Fields

Electronic health record (EHR) is a digital data format that collects electronic health information about an individual patient or population. To enhance the meaningful use of EHRs, information extraction techniques have been developed to recognize clinical concepts mentioned in EHRs. Nevertheless,...


Bibliographic Details
Main Authors: Dai, Hong-Jie; Syed-Abdul, Shabbir; Chen, Chih-Wei; Wu, Chieh-Chen
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4563061/
https://www.ncbi.nlm.nih.gov/pubmed/26380302
http://dx.doi.org/10.1155/2015/873012
author Dai, Hong-Jie
Syed-Abdul, Shabbir
Chen, Chih-Wei
Wu, Chieh-Chen
collection PubMed
description Electronic health record (EHR) is a digital data format that collects electronic health information about an individual patient or population. To enhance the meaningful use of EHRs, information extraction techniques have been developed to recognize clinical concepts mentioned in EHRs. Nevertheless, the clinical judgment of an EHR cannot be known solely based on the recognized concepts without considering its contextual information. In order to improve the readability and accessibility of EHRs, this work developed a section heading recognition system for clinical documents. In contrast to formulating the section heading recognition task as a sentence classification problem, this work proposed a token-based formulation with the conditional random field (CRF) model. A standard section heading recognition corpus was compiled by annotators with clinical experience to evaluate the performance and compare it with sentence classification and dictionary-based approaches. The results of the experiments showed that the proposed method achieved a satisfactory F-score of 0.942, which outperformed the sentence-based approach and the best dictionary-based system by 0.087 and 0.096, respectively. One important advantage of our formulation over the sentence-based approach is that it presented an integrated solution without the need to develop additional heuristics rules for isolating the headings from the surrounding section contents.
format Online
Article
Text
id pubmed-4563061
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4563061 2015-09-16 Biomed Res Int Research Article Hindawi Publishing Corporation 2015 2015-08-26 /pmc/articles/PMC4563061/ /pubmed/26380302 http://dx.doi.org/10.1155/2015/873012 Text en Copyright © 2015 Hong-Jie Dai et al.
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Recognition and Evaluation of Clinical Section Headings in Clinical Documents Using Token-Based Formulation with Conditional Random Fields
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4563061/
https://www.ncbi.nlm.nih.gov/pubmed/26380302
http://dx.doi.org/10.1155/2015/873012