How Many Private Data Are Needed for Deep Learning in Lung Nodule Detection on CT Scans? A Retrospective Multicenter Study

Bibliographic Details
Main authors: Son, Jeong Woo; Hong, Ji Young; Kim, Yoon; Kim, Woo Jin; Shin, Dae-Yong; Choi, Hyun-Soo; Bak, So Hyeon; Moon, Kyoung Min
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9265117/
https://www.ncbi.nlm.nih.gov/pubmed/35804946
http://dx.doi.org/10.3390/cancers14133174
author Son, Jeong Woo
Hong, Ji Young
Kim, Yoon
Kim, Woo Jin
Shin, Dae-Yong
Choi, Hyun-Soo
Bak, So Hyeon
Moon, Kyoung Min
author_sort Son, Jeong Woo
collection PubMed
description SIMPLE SUMMARY: The early detection of lung nodules is important for patient treatment and follow-up. Many researchers are investigating deep-learning-based lung nodule detection to ease the burden on radiologists. The purpose of this paper is to provide guidelines for collecting lung nodule data to facilitate research. We collected chest computed tomography scans reviewed by radiologists at three hospitals. In addition, several experiments were conducted using the large-scale open dataset LUNA16. The experiments demonstrated the value of using the collected data compared with using LUNA16. We also demonstrated the effectiveness of transfer learning from weights pre-trained on LUNA16. Finally, our study provides information on the amount of lung nodule data that must be collected to stabilize lung nodule detection performance.
ABSTRACT: Early detection of lung nodules is essential for preventing lung cancer. However, the number of radiologists who can diagnose lung nodules is limited, and considerable effort and time are required. To address this problem, researchers are investigating the automation of deep-learning-based lung nodule detection. However, deep learning requires large amounts of data, which can be difficult to collect. Therefore, data collection should be optimized to facilitate experiments at the beginning of lung nodule detection studies. From three hospitals, we collected chest computed tomography scans from 515 patients with lung nodules, together with high-quality lung nodule annotations reviewed by radiologists. We conducted several experiments using the collected datasets and publicly available data from LUNA16. The object detection model YOLOX was used in the lung nodule detection experiments. Training the model with the collected data gave similar or better performance than training with the larger LUNA16 dataset. We also show that transfer learning from weights pre-trained on open data is very useful when it is difficult to collect large amounts of data. Otherwise, good performance can be expected once data from more than 100 patients are available. This study offers valuable insights for guiding data collection in future lung nodule studies.
format Online
Article
Text
id pubmed-9265117
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9265117 2022-07-09 How Many Private Data Are Needed for Deep Learning in Lung Nodule Detection on CT Scans? A Retrospective Multicenter Study Son, Jeong Woo Hong, Ji Young Kim, Yoon Kim, Woo Jin Shin, Dae-Yong Choi, Hyun-Soo Bak, So Hyeon Moon, Kyoung Min Cancers (Basel) Article MDPI 2022-06-28 /pmc/articles/PMC9265117/ /pubmed/35804946 http://dx.doi.org/10.3390/cancers14133174 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title How Many Private Data Are Needed for Deep Learning in Lung Nodule Detection on CT Scans? A Retrospective Multicenter Study
topic Article