
Clinical Concept-Based Radiology Reports Classification Pipeline for Lung Carcinoma



Bibliographic Details
Main Authors: Mithun, Sneha, Jha, Ashish Kumar, Sherkhane, Umesh B., Jaiswar, Vinay, Purandare, Nilendu C., Dekker, Andre, Puts, Sander, Bermejo, Inigo, Rangarajan, V., Zegers, Catharina M. L., Wee, Leonard
Format: Online Article Text
Language: English
Published: Springer International Publishing 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10287609/
https://www.ncbi.nlm.nih.gov/pubmed/36788196
http://dx.doi.org/10.1007/s10278-023-00787-z
author Mithun, Sneha
Jha, Ashish Kumar
Sherkhane, Umesh B.
Jaiswar, Vinay
Purandare, Nilendu C.
Dekker, Andre
Puts, Sander
Bermejo, Inigo
Rangarajan, V.
Zegers, Catharina M. L.
Wee, Leonard
author_sort Mithun, Sneha
collection PubMed
description Rising incidence and mortality of cancer have led to a growing amount of research in the field. To learn from preexisting data, it has become important to capture as much information as possible related to disease type, stage, treatment, and outcomes. Medical imaging reports are rich in this kind of information but are only present as free text. The extraction of information from such unstructured text reports is labor-intensive. The use of Natural Language Processing (NLP) tools to extract information from radiology reports can make it less time-consuming as well as more effective. In this study, we developed and compared different models for the classification of lung carcinoma reports using clinical concepts. This study was approved by the institutional ethics committee as a retrospective study with a waiver of informed consent. A clinical concept-based classification pipeline for lung carcinoma radiology reports was developed using rule-based as well as machine learning models, and the models were compared. The machine learning models used were XGBoost and two deep learning model architectures based on bidirectional long short-term memory (Bi-LSTM) neural networks. A corpus of 1700 radiology reports, including computed tomography (CT) and positron emission tomography/computed tomography (PET/CT) reports, was used for development and testing. Five hundred one radiology reports from the MIMIC-III Clinical Database version 1.4 were used for external validation. The pipeline achieved an overall F1 score of 0.94 on the internal set and 0.74 on external validation, with the rule-based algorithm using expert input giving the best performance. Among the machine learning models, the Bi-LSTM_dropout model performed better than the XGBoost model and the Bi-LSTM_simple model on the internal set, whereas on external validation, the Bi-LSTM_simple model performed relatively better than the other two. This pipeline can be used for clinical concept-based classification of radiology reports related to lung carcinoma from a large corpus and also for automated annotation of these reports. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10278-023-00787-z.
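As a rough illustration of the kind of rule-based, concept-driven labelling and F1 scoring the description refers to, the following minimal Python sketch may help. It is not the authors' pipeline: the concept names, the keyword patterns, the example report sentences, and the choice of micro-averaged F1 are all assumptions made for illustration, and the record does not say how the reported overall F1 score was averaged.

```python
import re

# Hypothetical concept lexicon. These labels and keywords are illustrative
# assumptions, not the rules or expert input used in the published pipeline.
CONCEPT_PATTERNS = {
    "primary_tumor": re.compile(r"\b(mass|nodule|lesion)\b", re.IGNORECASE),
    "nodal_involvement": re.compile(
        r"\b(lymph node|lymphadenopathy|nodal)\b", re.IGNORECASE
    ),
    "metastasis": re.compile(r"\bmetasta\w*", re.IGNORECASE),
}


def classify_report(text):
    """Return the set of concept labels whose pattern occurs in the text."""
    return {
        label
        for label, pattern in CONCEPT_PATTERNS.items()
        if pattern.search(text)
    }


def f1_score(gold, predicted):
    """Micro-averaged F1 over per-report label sets (an assumed averaging)."""
    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        tp += len(gold_labels & predicted_labels)   # correctly assigned labels
        fp += len(predicted_labels - gold_labels)   # labels assigned in error
        fn += len(gold_labels - predicted_labels)   # labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up example sentences, not taken from any real report.
reports = [
    "Spiculated mass in the right upper lobe with mediastinal lymphadenopathy.",
    "No evidence of metastasis.",
]
gold = [{"primary_tumor", "nodal_involvement"}, set()]
predicted = [classify_report(report) for report in reports]
print(f1_score(gold, predicted))
```

In this toy example the second sentence is wrongly labelled with the metastasis concept because the sketch has no negation handling, so precision is 2/3, recall is 1, and the F1 score comes out at about 0.8. A practical rule set would need negation and context rules at a minimum.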
format Online
Article
Text
id pubmed-10287609
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-10287609 2023-06-24 Clinical Concept-Based Radiology Reports Classification Pipeline for Lung Carcinoma Mithun, Sneha Jha, Ashish Kumar Sherkhane, Umesh B. Jaiswar, Vinay Purandare, Nilendu C. Dekker, Andre Puts, Sander Bermejo, Inigo Rangarajan, V. Zegers, Catharina M. L. Wee, Leonard J Digit Imaging Article Springer International Publishing 2023-02-14 2023-06 /pmc/articles/PMC10287609/ /pubmed/36788196 http://dx.doi.org/10.1007/s10278-023-00787-z Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/).
title Clinical Concept-Based Radiology Reports Classification Pipeline for Lung Carcinoma
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10287609/
https://www.ncbi.nlm.nih.gov/pubmed/36788196
http://dx.doi.org/10.1007/s10278-023-00787-z