A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging

BACKGROUND: Computed tomography (CT) reports record a large volume of valuable information about patients’ conditions and the interpretations of radiology images from radiologists, which can be used for clinical decision-making and further academic study. However, the free-text nature of clinical re...

Bibliographic Details
Main authors: Zhang, Huanyao; Hu, Danqing; Duan, Huilong; Li, Shaolei; Wu, Nan; Lu, Xudong
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8323233/
https://www.ncbi.nlm.nih.gov/pubmed/34330277
http://dx.doi.org/10.1186/s12911-021-01575-x
_version_ 1783731201267204096
author Zhang, Huanyao
Hu, Danqing
Duan, Huilong
Li, Shaolei
Wu, Nan
Lu, Xudong
author_sort Zhang, Huanyao
collection PubMed
description BACKGROUND: Computed tomography (CT) reports record a large volume of valuable information about patients’ conditions and the interpretations of radiology images from radiologists, which can be used for clinical decision-making and further academic study. However, the free-text nature of clinical reports is a critical barrier to using these data more effectively. In this study, we investigate a novel deep learning method to extract entities from Chinese CT reports for lung cancer screening and TNM staging. METHODS: The proposed approach presents a new named entity recognition algorithm, namely the BERT-based-BiLSTM-Transformer network (BERT-BTN) with pre-training, to extract clinical entities for lung cancer screening and staging. Specifically, instead of traditional word embedding methods, BERT is applied to learn the deep semantic representations of characters. Following the long short-term memory layer, a Transformer layer is added to capture the global dependencies between characters. In addition, a pre-training technique is employed to alleviate the problem of insufficient labeled data. RESULTS: We verify the effectiveness of the proposed approach on a clinical dataset containing 359 CT reports collected from the Department of Thoracic Surgery II of Peking University Cancer Hospital. The experimental results show that the proposed approach achieves an 85.96% macro-F1 score under the exact match scheme, which improves the performance by 1.38%, 1.84%, 3.81%, 4.29%, 5.12%, 5.29% and 8.84% compared to BERT-BTN, BERT-LSTM, BERT-fine-tune, BERT-Transformer, FastText-BTN, FastText-BiLSTM and FastText-Transformer, respectively. CONCLUSIONS: In this study, we developed a novel deep learning method, i.e., BERT-BTN with pre-training, to extract the clinical entities from Chinese CT reports. 
The experimental results indicate that the proposed approach can efficiently recognize various clinical entities about lung cancer screening and staging, which shows the potential for further clinical decision-making and academic research. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-021-01575-x.
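The abstract reports a macro-F1 score "under exact match scheme" without spelling out the computation. The following is a minimal sketch of one common reading of that metric, not the authors' evaluation code: a predicted entity counts as correct only if its start offset, end offset, and type all match a gold entity, and the per-type F1 scores are averaged with equal weight. The function name `exact_match_macro_f1` and the entity types "Size" and "Location" are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def exact_match_macro_f1(gold, pred):
    """Macro-averaged F1 over entity types with exact span matching.

    gold, pred: lists with one element per report; each element is a set of
    (start, end, label) tuples. A prediction is a true positive only when the
    identical tuple appears in the gold set of the same report.
    """
    tp = defaultdict(int)
    fp = defaultdict(int)
    fn = defaultdict(int)
    for gold_ents, pred_ents in zip(gold, pred):
        for ent in pred_ents:
            if ent in gold_ents:
                tp[ent[2]] += 1
            else:
                fp[ent[2]] += 1
        for ent in gold_ents:
            if ent not in pred_ents:
                fn[ent[2]] += 1

    labels = set(tp) | set(fp) | set(fn)
    f1_scores = []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    # Every entity type contributes equally, regardless of how often it occurs.
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0


# Hypothetical example: one report, two gold entities.
gold = [{(0, 2, "Size"), (5, 8, "Location")}]
# "Size" is matched exactly; "Location" has a wrong end offset, so it counts
# as one false positive and one false negative.
pred = [{(0, 2, "Size"), (5, 7, "Location")}]
score = exact_match_macro_f1(gold, pred)
```

In this example the "Size" type scores an F1 of 1.0 and the "Location" type scores 0.0, so the macro average is 0.5. A boundary error of a single character is penalized as heavily as a completely missed entity, which is why exact-match scores tend to be lower than relaxed-match scores on the same predictions.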
format Online
Article
Text
id pubmed-8323233
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8323233 2021-07-30 A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging Zhang, Huanyao Hu, Danqing Duan, Huilong Li, Shaolei Wu, Nan Lu, Xudong BMC Med Inform Decis Mak Research BioMed Central 2021-07-30 /pmc/articles/PMC8323233/ /pubmed/34330277 http://dx.doi.org/10.1186/s12911-021-01575-x Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging
title_sort novel deep learning approach to extract chinese clinical entities for lung cancer screening and staging
topic Research