Family member information extraction via neural sequence labeling models with different tag schemes
BACKGROUND: Family history information (FHI) described in unstructured electronic health records (EHRs) is a valuable information source for patient care and scientific research. Since FHI is usually recorded as free text, the entire process of FHI extraction consists of various ste...
Main author: | Dai, Hong-Jie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933890/ https://www.ncbi.nlm.nih.gov/pubmed/31881965 http://dx.doi.org/10.1186/s12911-019-0996-4 |
_version_ | 1783483297239662592 |
---|---|
author | Dai, Hong-Jie |
collection | PubMed |
description | BACKGROUND: Family history information (FHI) described in unstructured electronic health records (EHRs) is a valuable information source for patient care and scientific research. Since FHI is usually recorded as free text, the entire process of FHI extraction consists of various steps including section segmentation, family member and clinical observation extraction, and relation discovery between the extracted members and their observations. The extraction step involves the recognition of FHI concepts along with their properties, such as the family side attribute of the family member concept. METHODS: This study focuses on the extraction step and formulates it as a sequence labeling problem. We employed a neural sequence labeling model along with different tag schemes to distinguish family members and their observations. Depending on the tag scheme, the identified entities were aggregated and processed by different algorithms to determine the required properties. RESULTS: We studied the effectiveness of encoding required properties in the tag schemes by evaluating their performance on the dataset released by the BioCreative/OHNLP challenge 2018. The proposed side scheme, together with the developed features and neural network architecture, achieved an overall F1-score of 0.849 on the test set, ranking second in the FHI entity recognition subtask. CONCLUSIONS: The developed neural network-based models performed significantly better than conditional random fields models. However, our error analysis revealed two challenging issues with the current approach. First, some properties require cross-sentence inference. Second, the current model cannot distinguish narratives describing the family members of the patient from those describing the relatives of the patient’s family members. |
format | Online Article Text |
id | pubmed-6933890 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6933890 2019-12-30 Family member information extraction via neural sequence labeling models with different tag schemes Dai, Hong-Jie BMC Med Inform Decis Mak Research BACKGROUND: Family history information (FHI) described in unstructured electronic health records (EHRs) is a valuable information source for patient care and scientific research. Since FHI is usually recorded as free text, the entire process of FHI extraction consists of various steps including section segmentation, family member and clinical observation extraction, and relation discovery between the extracted members and their observations. The extraction step involves the recognition of FHI concepts along with their properties, such as the family side attribute of the family member concept. METHODS: This study focuses on the extraction step and formulates it as a sequence labeling problem. We employed a neural sequence labeling model along with different tag schemes to distinguish family members and their observations. Depending on the tag scheme, the identified entities were aggregated and processed by different algorithms to determine the required properties. RESULTS: We studied the effectiveness of encoding required properties in the tag schemes by evaluating their performance on the dataset released by the BioCreative/OHNLP challenge 2018. The proposed side scheme, together with the developed features and neural network architecture, achieved an overall F1-score of 0.849 on the test set, ranking second in the FHI entity recognition subtask. CONCLUSIONS: The developed neural network-based models performed significantly better than conditional random fields models. However, our error analysis revealed two challenging issues with the current approach. First, some properties require cross-sentence inference. Second, the current model cannot distinguish narratives describing the family members of the patient from those describing the relatives of the patient’s family members. BioMed Central 2019-12-27 /pmc/articles/PMC6933890/ /pubmed/31881965 http://dx.doi.org/10.1186/s12911-019-0996-4 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Dai, Hong-Jie Family member information extraction via neural sequence labeling models with different tag schemes |
title | Family member information extraction via neural sequence labeling models with different tag schemes |
title_sort | family member information extraction via neural sequence labeling models with different tag schemes |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933890/ https://www.ncbi.nlm.nih.gov/pubmed/31881965 http://dx.doi.org/10.1186/s12911-019-0996-4 |
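
Illustrative note: the abstract describes encoding a property such as the family side directly in the sequence labeling tags and then aggregating the tagged tokens into entities. The following is a minimal sketch of that general idea only, not the article's implementation. The tag names (`FamilyMember.Maternal`, `Observation`), the `.` separator, and the helper names `Entity`, `_parse`, and `decode` are all hypothetical and chosen for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Entity(NamedTuple):
    text: str
    kind: str             # hypothetical labels: "FamilyMember" or "Observation"
    side: Optional[str]   # e.g. "Maternal" or "Paternal"; None when no side is encoded


def _parse(tag: str) -> Tuple[str, str, Optional[str]]:
    # "B-FamilyMember.Maternal" -> ("B", "FamilyMember", "Maternal")
    prefix, label = tag.split("-", 1)
    kind, _, side = label.partition(".")
    return prefix, kind, side or None


def decode(tokens: List[str], tags: List[str]) -> List[Entity]:
    """Aggregate BIO-tagged tokens into entities, reading the side from the tag."""
    entities: List[Entity] = []
    span: List[str] = []
    label: Optional[Tuple[str, Optional[str]]] = None

    def flush() -> None:
        # Close the span being built, if any, and reset the state.
        nonlocal span, label
        if span and label is not None:
            kind, side = label
            entities.append(Entity(" ".join(span), kind, side))
        span, label = [], None

    for token, tag in zip(tokens, tags):
        if tag == "O":
            flush()
            continue
        prefix, kind, side = _parse(tag)
        # Start a new entity on "B" or when the label changes mid-span.
        if prefix == "B" or label != (kind, side):
            flush()
            label = (kind, side)
        span.append(token)
    flush()
    return entities


tokens = ["Her", "maternal", "aunt", "had", "breast", "cancer", "."]
tags = ["O", "O", "B-FamilyMember.Maternal", "O",
        "B-Observation", "I-Observation", "O"]
result = decode(tokens, tags)
```

In this sketch the side travels with the tag itself, so no separate lookup of the word "maternal" is needed after decoding. A scheme without the side in the tag would instead need a post-processing step to recover it from context.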