
Prediction of pathologic stage in non-small cell lung cancer using machine learning algorithm based on CT image feature analysis

PURPOSE: To explore imaging biomarkers that can be used for diagnosis and prediction of pathologic stage in non-small cell lung cancer (NSCLC) using multiple machine learning algorithms based on CT image feature analysis. METHODS: Patients with stage IA to IV NSCLC were included, and the whole datas...


Bibliographic Details
Main authors: Yu, Lingming; Tao, Guangyu; Zhu, Lei; Wang, Gang; Li, Ziming; Ye, Jianding; Chen, Qunhui
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6525347/
https://www.ncbi.nlm.nih.gov/pubmed/31101024
http://dx.doi.org/10.1186/s12885-019-5646-9
_version_ 1783419708255502336
author Yu, Lingming
Tao, Guangyu
Zhu, Lei
Wang, Gang
Li, Ziming
Ye, Jianding
Chen, Qunhui
author_sort Yu, Lingming
collection PubMed
description PURPOSE: To explore imaging biomarkers that can be used for diagnosis and prediction of pathologic stage in non-small cell lung cancer (NSCLC) using multiple machine learning algorithms based on CT image feature analysis. METHODS: Patients with stage IA to IV NSCLC were included, and the whole dataset was divided into training and testing sets and an external validation set. To tackle the class imbalance in the NSCLC data, we generated a new dataset with a balanced class distribution using the SMOTE algorithm. The datasets were randomly split into training and testing sets. We calculated the importance of each CT image feature as the mean decrease in Gini impurity produced by a random forest algorithm and selected the optimal features according to feature importance (mean decrease in Gini impurity > 0.005). The performance of the prediction model in the training and testing sets was evaluated in terms of classification accuracy, average precision (AP) score and precision-recall curve. The predictive accuracy of the model was externally validated using lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) samples from the TCGA database. RESULTS: The prediction model that incorporated nine image features exhibited high classification accuracy, precision and recall scores in the training and testing sets. In the external validation, the predictive accuracy of the model was higher in LUAD than in LUSC. CONCLUSIONS: The pathologic stage of patients with NSCLC can be accurately predicted based on CT image features, especially for LUAD. Our findings extend the application of machine learning algorithms in CT image feature prediction for pathologic staging and identify potential imaging biomarkers that can be used for diagnosis of pathologic stage in NSCLC patients. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12885-019-5646-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6525347
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6525347 2019-05-24 Prediction of pathologic stage in non-small cell lung cancer using machine learning algorithm based on CT image feature analysis Yu, Lingming Tao, Guangyu Zhu, Lei Wang, Gang Li, Ziming Ye, Jianding Chen, Qunhui BMC Cancer Research Article BioMed Central 2019-05-17 /pmc/articles/PMC6525347/ /pubmed/31101024 http://dx.doi.org/10.1186/s12885-019-5646-9 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Prediction of pathologic stage in non-small cell lung cancer using machine learning algorithm based on CT image feature analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6525347/
https://www.ncbi.nlm.nih.gov/pubmed/31101024
http://dx.doi.org/10.1186/s12885-019-5646-9
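The abstract in this record names two techniques that are easy to illustrate in isolation: SMOTE oversampling to balance the classes, and the average precision (AP) score used for evaluation. The following is a minimal sketch in plain Python and is not the authors' implementation. The function names smote_oversample and average_precision, the parameters k and seed, and the toy data are all hypothetical, and the random forest feature selection step (mean decrease in Gini impurity > 0.005) is not reproduced here.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, SMOTE-style: each one is interpolated
    between a randomly chosen minority sample and one of its k nearest
    minority-class neighbours. (Hypothetical helper, for illustration only.)"""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def average_precision(y_true, scores):
    """Average precision: the mean of precision@rank taken at the rank of
    every true positive, with samples ranked by descending score.
    Tied scores are kept in input order, which is a simplification."""
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


if __name__ == "__main__":
    # Toy two-feature data (assumed values, not taken from the study).
    minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
    majority_size = 6
    new_points = smote_oversample(minority, majority_size - len(minority), k=2)
    print(len(minority) + len(new_points), "minority samples after oversampling")
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

In practice one would more likely rely on maintained libraries for these steps (for example imbalanced-learn for SMOTE and scikit-learn for random forests and AP). The sketch only shows the arithmetic behind the terms used in the abstract.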