Benign‐malignant classification of pulmonary nodules by low‐dose spiral computerized tomography and clinical data with machine learning in opportunistic screening

BACKGROUND: Many people are found to have pulmonary nodules during physical examinations. Discriminating benign from malignant nodules with data mining technology is of great practical significance. METHODS: The subjects' demographic data, baseline examination results, and annual follow‐up...

Bibliographic Details
Main authors: Zheng, Yansong, Dong, Jing, Yang, Xue, Shuai, Ping, Li, Yongli, Li, Hailin, Dong, Shengyong, Gong, Yan, Liu, Miao, Zeng, Qiang
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10278478/
https://www.ncbi.nlm.nih.gov/pubmed/37248730
http://dx.doi.org/10.1002/cam4.5886
author Zheng, Yansong
Dong, Jing
Yang, Xue
Shuai, Ping
Li, Yongli
Li, Hailin
Dong, Shengyong
Gong, Yan
Liu, Miao
Zeng, Qiang
author_sort Zheng, Yansong
collection PubMed
description BACKGROUND: Many people are found to have pulmonary nodules during physical examinations. Discriminating benign from malignant nodules with data mining technology is of great practical significance. METHODS: The subjects' demographic data, baseline examination results, and annual follow‐up low‐dose spiral computerized tomography (LDCT) results were recorded. The findings from annual physical examinations of positive nodules, including highly suspicious nodules and clinically tentative benign nodules, were analyzed. An extreme gradient boosting (XGBoost) model was constructed, and grid search with cross‐validation (Grid Search CV) was used to select the hyperparameters. Data from an external unit were used as an external validation set to evaluate the generalization performance of the model. RESULTS: A total of 135,503 physical examinees were enrolled. Baseline testing found that 27,636 (20.40%) participants had clinically tentative benign nodules and 611 (0.45%) participants had highly suspicious nodules. The proportion of highly suspicious nodules in participants with a negative baseline was about 0.12%–0.46%, which was lower than the baseline level except at follow‐up of >5 years. In the 27,636 participants with clinically tentative benign nodules, only in the first year of LDCT re‐examination was the proportion of highly suspicious nodules (1.40%) significantly greater than that of baseline screening (0.45%) (p < 0.001); the proportion of highly suspicious nodules did not differ between baseline screening and the other follow‐up years (p > 0.05). Furthermore, 322 cases with benign nodules and 196 patients with malignant nodules confirmed by surgery and pathology were compared. A model and the top 15 most important clinical variables were determined by the XGBoost algorithm. The area under the curve (AUC) of the model was 0.76 [95% CI: 0.67–0.84], and the accuracy was 0.75. The sensitivity and specificity of the model under this threshold were 0.78 and 0.73, respectively. In the validation of the model using external data, the AUC was 0.87 and the accuracy was 0.80. The sensitivity and specificity were 0.83 and 0.77, respectively. CONCLUSIONS: It is important that pulmonary nodules be identified more accurately at the first LDCT examination. A model with 15 variables that are routinely measured in the clinic could be helpful to distinguish benign and malignant nodules. It could help the radiological team issue a more accurate report, and it may guide the clinical team regarding LDCT follow‐up.
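The percentages and performance measures quoted in the description follow from simple ratios. The sketch below is illustrative only and is not the authors' code: it rechecks the two baseline proportions from the counts given in the abstract, and shows the standard definitions of sensitivity, specificity, and accuracy. The `ConfusionCounts` class, the helper names, and the `example` counts are hypothetical; the abstract does not report a confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary-classifier counts; 'positive' is taken to mean malignant."""
    tp: int  # malignant nodules predicted malignant
    fn: int  # malignant nodules predicted benign
    tn: int  # benign nodules predicted benign
    fp: int  # benign nodules predicted malignant


def sensitivity(c: ConfusionCounts) -> float:
    """Share of malignant nodules that the model flags as malignant."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of benign nodules that the model labels as benign."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all nodules that the model classifies correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def percent(part: int, whole: int) -> float:
    """Percentage rounded to two decimals, as printed in the abstract."""
    return round(100 * part / whole, 2)


# Counts taken from the abstract.
ENROLLED = 135_503
tentative_benign_pct = percent(27_636, ENROLLED)   # reported as 20.40%
highly_suspicious_pct = percent(611, ENROLLED)     # reported as 0.45%

# Hypothetical counts, chosen only to exercise the formulas.
example = ConfusionCounts(tp=8, fn=2, tn=15, fp=5)

if __name__ == "__main__":
    print(f"tentative benign at baseline: {tentative_benign_pct:.2f}%")
    print(f"highly suspicious at baseline: {highly_suspicious_pct:.2f}%")
    print(f"sensitivity={sensitivity(example):.2f}",
          f"specificity={specificity(example):.2f}",
          f"accuracy={accuracy(example):.2f}")
```

Sensitivity and specificity both depend on the decision threshold applied to the model's predicted probability, whereas the AUC summarizes performance across all thresholds.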
format Online
Article
Text
id pubmed-10278478
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-10278478 2023-06-20 Cancer Med RESEARCH ARTICLES John Wiley and Sons Inc. 2023-05-29 /pmc/articles/PMC10278478/ /pubmed/37248730 http://dx.doi.org/10.1002/cam4.5886 Text en © 2023 The Authors. Cancer Medicine published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Benign‐malignant classification of pulmonary nodules by low‐dose spiral computerized tomography and clinical data with machine learning in opportunistic screening
topic RESEARCH ARTICLES