A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging
BACKGROUND: Computed tomography (CT) reports record a large volume of valuable information about patients’ conditions and the interpretations of radiology images from radiologists, which can be used for clinical decision-making and further academic study. However, the free-text nature of clinical re...
Main authors: | Zhang, Huanyao; Hu, Danqing; Duan, Huilong; Li, Shaolei; Wu, Nan; Lu, Xudong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8323233/ https://www.ncbi.nlm.nih.gov/pubmed/34330277 http://dx.doi.org/10.1186/s12911-021-01575-x |
_version_ | 1783731201267204096 |
---|---|
author | Zhang, Huanyao Hu, Danqing Duan, Huilong Li, Shaolei Wu, Nan Lu, Xudong |
author_facet | Zhang, Huanyao Hu, Danqing Duan, Huilong Li, Shaolei Wu, Nan Lu, Xudong |
author_sort | Zhang, Huanyao |
collection | PubMed |
description | BACKGROUND: Computed tomography (CT) reports record a large volume of valuable information about patients’ conditions and the interpretations of radiology images from radiologists, which can be used for clinical decision-making and further academic study. However, the free-text nature of clinical reports is a critical barrier to using these data more effectively. In this study, we investigate a novel deep learning method to extract entities from Chinese CT reports for lung cancer screening and TNM staging. METHODS: The proposed approach presents a new named entity recognition algorithm, namely the BERT-based-BiLSTM-Transformer network (BERT-BTN) with pre-training, to extract clinical entities for lung cancer screening and staging. Specifically, instead of traditional word embedding methods, BERT is applied to learn the deep semantic representations of characters. Following the long short-term memory layer, a Transformer layer is added to capture the global dependencies between characters. In addition, a pre-training technique is employed to alleviate the problem of insufficient labeled data. RESULTS: We verify the effectiveness of the proposed approach on a clinical dataset containing 359 CT reports collected from the Department of Thoracic Surgery II of Peking University Cancer Hospital. The experimental results show that the proposed approach achieves an 85.96% macro-F1 score under the exact match scheme, which improves the performance by 1.38%, 1.84%, 3.81%, 4.29%, 5.12%, 5.29% and 8.84% compared to BERT-BTN, BERT-LSTM, BERT-fine-tune, BERT-Transformer, FastText-BTN, FastText-BiLSTM and FastText-Transformer, respectively. CONCLUSIONS: In this study, we developed a novel deep learning method, i.e., BERT-BTN with pre-training, to extract the clinical entities from Chinese CT reports. The experimental results indicate that the proposed approach can efficiently recognize various clinical entities about lung cancer screening and staging, which shows the potential for further clinical decision-making and academic research. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-021-01575-x. |
format | Online Article Text |
id | pubmed-8323233 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8323233 2021-07-30 A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging Zhang, Huanyao Hu, Danqing Duan, Huilong Li, Shaolei Wu, Nan Lu, Xudong BMC Med Inform Decis Mak Research BioMed Central 2021-07-30 /pmc/articles/PMC8323233/ /pubmed/34330277 http://dx.doi.org/10.1186/s12911-021-01575-x Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Zhang, Huanyao Hu, Danqing Duan, Huilong Li, Shaolei Wu, Nan Lu, Xudong A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging |
title | A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging |
title_full | A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging |
title_fullStr | A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging |
title_full_unstemmed | A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging |
title_short | A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging |
title_sort | novel deep learning approach to extract chinese clinical entities for lung cancer screening and staging |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8323233/ https://www.ncbi.nlm.nih.gov/pubmed/34330277 http://dx.doi.org/10.1186/s12911-021-01575-x |
work_keys_str_mv | AT zhanghuanyao anoveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT hudanqing anoveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT duanhuilong anoveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT lishaolei anoveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT wunan anoveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT luxudong anoveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT zhanghuanyao noveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT hudanqing noveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT duanhuilong noveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT lishaolei noveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT wunan noveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging AT luxudong noveldeeplearningapproachtoextractchineseclinicalentitiesforlungcancerscreeningandstaging |
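
The description field reports a macro-F1 score under an exact match scheme. As a rough illustration of what such a metric usually means for named entity recognition, the sketch below scores BIO-tagged sequences so that a predicted entity counts only if its type and both boundaries match the gold annotation, then averages F1 over entity types. This is an assumption about the general technique, not the authors' evaluation code: the BIO tagging scheme, the function names `extract_entities` and `macro_f1`, and the entity types `Size` and `Shape` are all hypothetical and do not come from the record.

```python
from collections import defaultdict


def extract_entities(tags):
    """Return a set of (type, start, end) spans from a BIO tag sequence."""
    entities = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def macro_f1(gold_seqs, pred_seqs):
    """Exact-match, entity-level F1 averaged over entity types (assumed definition)."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_entities(gold), extract_entities(pred)
        for etype, _, _ in g & p:   # same type and same boundaries
            tp[etype] += 1
        for etype, _, _ in p - g:   # predicted but not in gold
            fp[etype] += 1
        for etype, _, _ in g - p:   # in gold but missed
            fn[etype] += 1
    types = set(tp) | set(fp) | set(fn)
    if not types:
        return 0.0
    scores = [2 * tp[t] / (2 * tp[t] + fp[t] + fn[t]) for t in types]
    return sum(scores) / len(scores)


# Hypothetical example: the "Size" entity is predicted with a wrong boundary,
# so it earns no credit under exact match; "Shape" is predicted correctly.
gold = [["B-Size", "I-Size", "O", "B-Shape"]]
pred = [["B-Size", "O", "O", "B-Shape"]]
print(macro_f1(gold, pred))
```

Because the average is taken over entity types rather than over individual entities, rare types weigh as much as frequent ones, which is the usual reason for reporting a macro score on a small dataset.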