Non-invasive classification of non-small cell lung cancer: a comparison between random forest models utilising radiomic and semantic features

OBJECTIVE: Non-invasive distinction between squamous cell carcinoma and adenocarcinoma subtypes of non-small-cell lung cancer (NSCLC) may be beneficial to patients unfit for invasive diagnostic procedures or when tissue is insufficient for diagnosis. The purpose of our study was to compare the perfo...

Bibliographic Details
Main authors: Bashir, Usman, Kawa, Bhavin, Siddique, Muhammad, Mak, Sze Mun, Nair, Arjun, Mclean, Emma, Bille, Andrea, Goh, Vicky, Cook, Gary
Format: Online Article Text
Language: English
Published: The British Institute of Radiology. 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6636267/
https://www.ncbi.nlm.nih.gov/pubmed/31166787
http://dx.doi.org/10.1259/bjr.20190159
author Bashir, Usman
Kawa, Bhavin
Siddique, Muhammad
Mak, Sze Mun
Nair, Arjun
Mclean, Emma
Bille, Andrea
Goh, Vicky
Cook, Gary
collection PubMed
description OBJECTIVE: Non-invasive distinction between the squamous cell carcinoma and adenocarcinoma subtypes of non-small-cell lung cancer (NSCLC) may benefit patients who are unfit for invasive diagnostic procedures, or cases in which tissue is insufficient for diagnosis. The purpose of our study was to compare the performance of random forest algorithms utilizing CT radiomics and/or semantic features in classifying NSCLC.
METHODS: Two thoracic radiologists scored 11 semantic features on CT scans of 106 patients with NSCLC. A set of 115 radiomics features was extracted from the CT scans. Random forest models were developed from semantic features (RF-sem), radiomics features (RF-rad), and all features combined (RF-all). External validation of the models was performed using an independent test data set (n = 100) of CT scans. Model performance was measured with out-of-bag error and area under the curve (AUC), and compared using receiver operating characteristic (ROC) curve analysis on the test data set.
RESULTS: The median (interquartile range) error rates of the models were: RF-sem 24.5 % (22.6 – 37.5 %), RF-rad 35.8 % (34.9 – 38.7 %), and RF-all 37.7 % (37.7 – 37.7 %). On training data, both RF-rad and RF-all gave perfect discrimination (AUC = 1), which was significantly higher than that achieved by RF-sem (AUC = 0.78; p < 0.0001). On test data, however, the RF-sem model (AUC = 0.82) outperformed RF-rad and RF-all (AUC = 0.5 and AUC = 0.56; p < 0.0001), neither of which was significantly different from random guessing (p = 0.9 and 0.6, respectively).
CONCLUSION: Non-invasive classification of NSCLC can be done accurately using random forest classification models based on well-known CT-derived descriptive features. However, radiomics-based classification models performed poorly in this scenario when tested on independent data and should be used with caution because they may not generalize to new data.
ADVANCES IN KNOWLEDGE: Our study describes novel CT-derived random forest models based on radiologist interpretation of CT scans (semantic features) that can assist NSCLC classification when histopathology is equivocal or when histopathological sampling is not possible. It also shows that random forest models based on semantic features may be more useful than those built from computational radiomic features.
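The abstract reports model performance as AUC, where 1 means perfect discrimination and 0.5 is no better than random guessing. As a rough illustration of what those two values mean, the sketch below computes AUC as the fraction of positive/negative case pairs that a model ranks correctly. It is a minimal sketch, not the authors' analysis: the function name `auc`, the labels, and the scores are hypothetical and are not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up scores (not data from the study).
labels = [1, 1, 1, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]  # every positive outranks every negative
uninformative = [0.5] * 6                 # identical scores carry no information

print(auc(labels, perfect))        # 1.0
print(auc(labels, uninformative))  # 0.5
```

Read this way, an AUC of 1 on training data that falls to about 0.5 on independent test data indicates that the ranking learned in training did not carry over to new cases.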
format Online
Article
Text
id pubmed-6636267
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher The British Institute of Radiology.
record_format MEDLINE/PubMed
spelling pubmed-6636267 2019-10-23 Br J Radiol Full Paper The British Institute of Radiology. 2019-07 2019-06-03 /pmc/articles/PMC6636267/ /pubmed/31166787 http://dx.doi.org/10.1259/bjr.20190159 Text en
© 2019 The Authors. Published by the British Institute of Radiology. This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 Unported License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.
title Non-invasive classification of non-small cell lung cancer: a comparison between random forest models utilising radiomic and semantic features
topic Full Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6636267/
https://www.ncbi.nlm.nih.gov/pubmed/31166787
http://dx.doi.org/10.1259/bjr.20190159