Cargando…
Family History Information Extraction With Neural Attention and an Enhanced Relation-Side Scheme: Algorithm Development and Validation
BACKGROUND: Identifying and extracting family history information (FHI) from clinical reports are significant for recognizing disease susceptibility. However, FHI is usually described in a narrative manner within patients’ electronic health records, which requires the application of natural language...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
JMIR Publications
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7738250/ https://www.ncbi.nlm.nih.gov/pubmed/33258777 http://dx.doi.org/10.2196/21750 |
_version_ | 1783623091628277760 |
---|---|
author | Dai, Hong-Jie Lee, You-Qian Nekkantti, Chandini Jonnagaddala, Jitendra |
author_facet | Dai, Hong-Jie Lee, You-Qian Nekkantti, Chandini Jonnagaddala, Jitendra |
author_sort | Dai, Hong-Jie |
collection | PubMed |
description | BACKGROUND: Identifying and extracting family history information (FHI) from clinical reports are significant for recognizing disease susceptibility. However, FHI is usually described in a narrative manner within patients’ electronic health records, which requires the application of natural language processing technologies to automatically extract such information to provide more comprehensive patient-centered information to physicians. OBJECTIVE: This study aimed to overcome the 2 main challenges observed in previous research focusing on FHI extraction. One is the requirement to develop postprocessing rules to infer the member and side information of family mentions. The other is to efficiently utilize intrasentence and intersentence information to assist FHI extraction. METHODS: We formulated the task as a sequential labeling problem and propose an enhanced relation-side scheme that encodes the required family member properties to not only eliminate the need for postprocessing rules but also relieve the insufficient training instance issues. Moreover, an attention-based neural network structure was proposed to exploit cross-sentence information to identify FHI and its attributes requiring cross-sentence inference. RESULTS: The dataset released by the 2019 n2c2/OHNLP family history extraction task was used to evaluate the performance of the proposed methods. We started by comparing the performance of the traditional neural sequence models with the ordinary scheme and enhanced scheme. Next, we studied the effectiveness of the proposed attention-enhanced neural networks by comparing their performance with that of the traditional networks. It was observed that, with the enhanced scheme, the recall of the neural network can be improved, leading to an increase in the F score of 0.024. The proposed neural attention mechanism enhanced both the recall and precision and resulted in an improved F score of 0.807, which was ranked fourth in the shared task. CONCLUSIONS: We presented an attention-based neural network along with an enhanced tag scheme that enables the neural network model to learn and interpret the implicit relationship and side information of the recognized family members across sentences without relying on heuristic rules. |
format | Online Article Text |
id | pubmed-7738250 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-77382502020-12-18 Family History Information Extraction With Neural Attention and an Enhanced Relation-Side Scheme: Algorithm Development and Validation Dai, Hong-Jie Lee, You-Qian Nekkantti, Chandini Jonnagaddala, Jitendra JMIR Med Inform Original Paper BACKGROUND: Identifying and extracting family history information (FHI) from clinical reports are significant for recognizing disease susceptibility. However, FHI is usually described in a narrative manner within patients’ electronic health records, which requires the application of natural language processing technologies to automatically extract such information to provide more comprehensive patient-centered information to physicians. OBJECTIVE: This study aimed to overcome the 2 main challenges observed in previous research focusing on FHI extraction. One is the requirement to develop postprocessing rules to infer the member and side information of family mentions. The other is to efficiently utilize intrasentence and intersentence information to assist FHI extraction. METHODS: We formulated the task as a sequential labeling problem and propose an enhanced relation-side scheme that encodes the required family member properties to not only eliminate the need for postprocessing rules but also relieve the insufficient training instance issues. Moreover, an attention-based neural network structure was proposed to exploit cross-sentence information to identify FHI and its attributes requiring cross-sentence inference. RESULTS: The dataset released by the 2019 n2c2/OHNLP family history extraction task was used to evaluate the performance of the proposed methods. We started by comparing the performance of the traditional neural sequence models with the ordinary scheme and enhanced scheme. Next, we studied the effectiveness of the proposed attention-enhanced neural networks by comparing their performance with that of the traditional networks. It was observed that, with the enhanced scheme, the recall of the neural network can be improved, leading to an increase in the F score of 0.024. The proposed neural attention mechanism enhanced both the recall and precision and resulted in an improved F score of 0.807, which was ranked fourth in the shared task. CONCLUSIONS: We presented an attention-based neural network along with an enhanced tag scheme that enables the neural network model to learn and interpret the implicit relationship and side information of the recognized family members across sentences without relying on heuristic rules. JMIR Publications 2020-12-01 /pmc/articles/PMC7738250/ /pubmed/33258777 http://dx.doi.org/10.2196/21750 Text en ©Hong-Jie Dai, You-Qian Lee, Chandini Nekkantti, Jitendra Jonnagaddala. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 01.12.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Dai, Hong-Jie Lee, You-Qian Nekkantti, Chandini Jonnagaddala, Jitendra Family History Information Extraction With Neural Attention and an Enhanced Relation-Side Scheme: Algorithm Development and Validation |
title | Family History Information Extraction With Neural Attention and an Enhanced Relation-Side Scheme: Algorithm Development and Validation |
title_full | Family History Information Extraction With Neural Attention and an Enhanced Relation-Side Scheme: Algorithm Development and Validation |
title_fullStr | Family History Information Extraction With Neural Attention and an Enhanced Relation-Side Scheme: Algorithm Development and Validation |
title_full_unstemmed | Family History Information Extraction With Neural Attention and an Enhanced Relation-Side Scheme: Algorithm Development and Validation |
title_short | Family History Information Extraction With Neural Attention and an Enhanced Relation-Side Scheme: Algorithm Development and Validation |
title_sort | family history information extraction with neural attention and an enhanced relation-side scheme: algorithm development and validation |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7738250/ https://www.ncbi.nlm.nih.gov/pubmed/33258777 http://dx.doi.org/10.2196/21750 |
work_keys_str_mv | AT daihongjie familyhistoryinformationextractionwithneuralattentionandanenhancedrelationsideschemealgorithmdevelopmentandvalidation AT leeyouqian familyhistoryinformationextractionwithneuralattentionandanenhancedrelationsideschemealgorithmdevelopmentandvalidation AT nekkanttichandini familyhistoryinformationextractionwithneuralattentionandanenhancedrelationsideschemealgorithmdevelopmentandvalidation AT jonnagaddalajitendra familyhistoryinformationextractionwithneuralattentionandanenhancedrelationsideschemealgorithmdevelopmentandvalidation |