Machine Learning and Feature Selection Methods for Disease Classification With Application to Lung Cancer Screening Image Data

As awareness of the habits and risks associated with lung cancer has increased, so has the interest in promoting and improving upon lung cancer screening procedures. Recent research demonstrates the benefits of lung cancer screening; the National Lung Screening Trial (NLST) found as its primary resu...

Bibliographic Details
Main authors: Delzell, Darcie A. P., Magnuson, Sara, Peter, Tabitha, Smith, Michelle, Smith, Brian J.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Oncology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6917601/
https://www.ncbi.nlm.nih.gov/pubmed/31921650
http://dx.doi.org/10.3389/fonc.2019.01393
author Delzell, Darcie A. P.
Magnuson, Sara
Peter, Tabitha
Smith, Michelle
Smith, Brian J.
collection PubMed
description As awareness of the habits and risks associated with lung cancer has increased, so has the interest in promoting and improving upon lung cancer screening procedures. Recent research demonstrates the benefits of lung cancer screening; the National Lung Screening Trial (NLST) found as its primary result that preventative screening significantly decreases the death rate for patients battling lung cancer. However, it was also noted that the false positive rate was very high (>94%). In this work, we investigated the ability of various machine learning classifiers to accurately predict lung cancer nodule status while also considering the associated false positive rate. We utilized 416 quantitative imaging biomarkers taken from CT scans of lung nodules from 200 patients, where the nodules had been verified as cancerous or benign. These imaging biomarkers were created from both nodule and parenchymal tissue. A variety of linear, nonlinear, and ensemble predictive classifying models, along with several feature selection methods, were used to classify the binary outcome of malignant or benign status. Elastic net and support vector machine, combined with either a linear combination or correlation feature selection method, were some of the best-performing classifiers (average cross-validation AUC near 0.72 for these models), while random forest and bagged trees were the worst-performing classifiers (AUC near 0.60). For the best-performing models, the false positive rate was near 30%, notably lower than that reported in the NLST. Radiomic biomarkers combined with machine learning methods are a promising diagnostic tool for tumor classification. They have the potential to provide good classification and simultaneously reduce the false positive rate.
format Online
Article
Text
id pubmed-6917601
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6917601 2020-01-09 Front Oncol Oncology Frontiers Media S.A. 2019-12-11 Text en Copyright © 2019 Delzell, Magnuson, Peter, Smith and Smith. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Machine Learning and Feature Selection Methods for Disease Classification With Application to Lung Cancer Screening Image Data
topic Oncology