Automated extraction of information of lung cancer staging from unstructured reports of PET-CT interpretation: natural language processing with deep-learning

BACKGROUND: Extracting metastatic information from previous radiologic-text reports is important; however, laborious annotations have limited the usability of these texts. We developed a deep-learning model for extracting primary lung cancer sites and metastatic lymph nodes and distant metastasis in...

Bibliographic Details
Main Authors:	Park, Hyung Jun; Park, Namu; Lee, Jang Ho; Choi, Myeong Geun; Ryu, Jin-Sook; Song, Min; Choi, Chang-Min
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9438247/ https://www.ncbi.nlm.nih.gov/pubmed/36050674 http://dx.doi.org/10.1186/s12911-022-01975-7

_version_	1784781786442629120
author	Park, Hyung Jun; Park, Namu; Lee, Jang Ho; Choi, Myeong Geun; Ryu, Jin-Sook; Song, Min; Choi, Chang-Min
author_sort	Park, Hyung Jun
collection	PubMed
description	BACKGROUND: Extracting metastatic information from previous radiologic-text reports is important; however, laborious annotations have limited the usability of these texts. We developed a deep-learning model for extracting primary lung cancer sites, metastatic lymph nodes, and distant metastasis information from PET-CT reports for determining lung cancer stages. METHODS: PET-CT reports, fully written in English, were acquired from two cohorts of patients with lung cancer who were diagnosed at a tertiary hospital between January 2004 and March 2020. One cohort of 20,466 PET-CT reports was used for the training and validation sets, and the other cohort of 4,190 PET-CT reports was used as an additional-test set. A pre-processing model (Lung Cancer Spell Checker) was applied to correct typographical errors, and pseudo-labelling was used for training the model. The deep-learning model was constructed using a convolutional-recurrent neural network. The performance metrics for the prediction model were accuracy, precision, sensitivity, micro-AUROC, and AUPRC. RESULTS: For the extraction of primary lung cancer location, the model showed a micro-AUROC of 0.913 and 0.946 in the validation set and the additional-test set, respectively. For metastatic lymph nodes, the model showed a sensitivity of 0.827 and a specificity of 0.960. In predicting distant metastasis, the model showed a micro-AUROC of 0.944 and 0.950 in the validation set and the additional-test set, respectively. CONCLUSION: Our deep-learning method could be used for extracting lung cancer stage information from PET-CT reports and may facilitate lung cancer studies by alleviating laborious annotation by clinicians. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-022-01975-7.
format	Online Article Text
id	pubmed-9438247
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-9438247 2022-09-03 BMC Med Inform Decis Mak Research BioMed Central 2022-09-01 /pmc/articles/PMC9438247/ /pubmed/36050674 http://dx.doi.org/10.1186/s12911-022-01975-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Automated extraction of information of lung cancer staging from unstructured reports of PET-CT interpretation: natural language processing with deep-learning
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9438247/ https://www.ncbi.nlm.nih.gov/pubmed/36050674 http://dx.doi.org/10.1186/s12911-022-01975-7
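The description field above reports sensitivity, specificity, and micro-AUROC. As a reading aid only, the following is a minimal sketch of how such metrics are commonly computed for multi-label predictions. It is not the authors' code; the function names (`sensitivity_specificity`, `auroc`, `micro_auroc`) and the toy inputs are hypothetical, and the record does not state how the study computed its figures.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as one half (pairwise O(n^2) form)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_auroc(Y_true: List[List[int]], Y_score: List[List[float]]) -> float:
    """Micro-averaging: pool every (sample, label) cell of the indicator
    matrix into one binary problem, then compute a single AUROC."""
    flat_true = [t for row in Y_true for t in row]
    flat_score = [s for row in Y_score for s in row]
    return auroc(flat_true, flat_score)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    print(sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]))
    print(micro_auroc([[1, 0], [0, 1]], [[0.9, 0.5], [0.3, 0.4]]))
```

Micro-averaging weights every sample-label pair equally, so frequent labels dominate the score; a macro average would instead compute one AUROC per label and take their mean.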


Similar Items