Cargando…

Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer

Background: In this study, publicly datasets with organs at risk (OAR) structures were used as reference data to compare the differences of several observers. Convolutional neural network (CNN)-based auto-contouring was also used in the analysis. We evaluated the variations among observers and the e...

Descripción completa

Detalles Bibliográficos
Autores principales: Zhu, Jinhan, Liu, Yimei, Zhang, Jun, Wang, Yixuan, Chen, Lixin
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Frontiers Media S.A. 2019
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6624788/
https://www.ncbi.nlm.nih.gov/pubmed/31334129
http://dx.doi.org/10.3389/fonc.2019.00627
_version_ 1783434290615287808
author Zhu, Jinhan
Liu, Yimei
Zhang, Jun
Wang, Yixuan
Chen, Lixin
author_facet Zhu, Jinhan
Liu, Yimei
Zhang, Jun
Wang, Yixuan
Chen, Lixin
author_sort Zhu, Jinhan
collection PubMed
description Background: In this study, publicly datasets with organs at risk (OAR) structures were used as reference data to compare the differences of several observers. Convolutional neural network (CNN)-based auto-contouring was also used in the analysis. We evaluated the variations among observers and the effect of CNN-based auto-contouring in clinical applications. Materials and methods: A total of 60 publicly available lung cancer CT with structures were used; 48 cases were used for training, and the other 12 cases were used for testing. The structures of the datasets were used as reference data. Three observers and a CNN-based program performed contouring for 12 testing cases, and the 3D dice similarity coefficient (DSC) and mean surface distance (MSD) were used to evaluate differences from the reference data. The three observers edited the CNN-based contours, and the results were compared to those of manual contouring. A value of P<0.05 was considered statistically significant. Results: Compared to the reference data, no statistically significant differences were observed for the DSCs and MSDs among the manual contouring performed by the three observers at the same institution for the heart, esophagus, spinal cord, and left and right lungs. The 95% confidence interval (CI) and P-values of the CNN-based auto-contouring results comparing to the manual results for the heart, esophagus, spinal cord, and left and right lungs were as follows: the DSCs were CNN vs. A: 0.914~0.939(P = 0.004), 0.746~0.808(P = 0.002), 0.866~0.887(P = 0.136), 0.952~0.966(P = 0.158) and 0.960~0.972 (P = 0.136); CNN vs. B: 0.913~0.936 (P = 0.002), 0.745~0.807 (P = 0.005), 0.864~0.894 (P = 0.239), 0.952~0.964 (P = 0.308), and 0.959~0.971 (P = 0.272); and CNN vs. C: 0.912~0.933 (P = 0.004), 0.748~0.804(P = 0.002), 0.867~0.890 (P = 0.530), 0.952~0.964 (P = 0.308), and 0.958~0.970 (P = 0.480), respectively. The P-values of MSDs are similar to DSCs. The P-values of heart and esophagus is smaller than 0.05. No significant differences were found between the edited CNN-based auto-contouring results and the manual results. Conclusion: For the spinal cord, both lungs, no statistically significant differences were found between CNN-based auto-contouring and manual contouring. Further modifications to contouring of the heart and esophagus are necessary. Overall, editing based on CNN-based auto-contouring can effectively shorten the contouring time without affecting the results. CNNs have considerable potential for automatic contouring applications.
format Online
Article
Text
id pubmed-6624788
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-66247882019-07-22 Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer Zhu, Jinhan Liu, Yimei Zhang, Jun Wang, Yixuan Chen, Lixin Front Oncol Oncology Background: In this study, publicly datasets with organs at risk (OAR) structures were used as reference data to compare the differences of several observers. Convolutional neural network (CNN)-based auto-contouring was also used in the analysis. We evaluated the variations among observers and the effect of CNN-based auto-contouring in clinical applications. Materials and methods: A total of 60 publicly available lung cancer CT with structures were used; 48 cases were used for training, and the other 12 cases were used for testing. The structures of the datasets were used as reference data. Three observers and a CNN-based program performed contouring for 12 testing cases, and the 3D dice similarity coefficient (DSC) and mean surface distance (MSD) were used to evaluate differences from the reference data. The three observers edited the CNN-based contours, and the results were compared to those of manual contouring. A value of P<0.05 was considered statistically significant. Results: Compared to the reference data, no statistically significant differences were observed for the DSCs and MSDs among the manual contouring performed by the three observers at the same institution for the heart, esophagus, spinal cord, and left and right lungs. The 95% confidence interval (CI) and P-values of the CNN-based auto-contouring results comparing to the manual results for the heart, esophagus, spinal cord, and left and right lungs were as follows: the DSCs were CNN vs. A: 0.914~0.939(P = 0.004), 0.746~0.808(P = 0.002), 0.866~0.887(P = 0.136), 0.952~0.966(P = 0.158) and 0.960~0.972 (P = 0.136); CNN vs. B: 0.913~0.936 (P = 0.002), 0.745~0.807 (P = 0.005), 0.864~0.894 (P = 0.239), 0.952~0.964 (P = 0.308), and 0.959~0.971 (P = 0.272); and CNN vs. C: 0.912~0.933 (P = 0.004), 0.748~0.804(P = 0.002), 0.867~0.890 (P = 0.530), 0.952~0.964 (P = 0.308), and 0.958~0.970 (P = 0.480), respectively. The P-values of MSDs are similar to DSCs. The P-values of heart and esophagus is smaller than 0.05. No significant differences were found between the edited CNN-based auto-contouring results and the manual results. Conclusion: For the spinal cord, both lungs, no statistically significant differences were found between CNN-based auto-contouring and manual contouring. Further modifications to contouring of the heart and esophagus are necessary. Overall, editing based on CNN-based auto-contouring can effectively shorten the contouring time without affecting the results. CNNs have considerable potential for automatic contouring applications. Frontiers Media S.A. 2019-07-05 /pmc/articles/PMC6624788/ /pubmed/31334129 http://dx.doi.org/10.3389/fonc.2019.00627 Text en Copyright © 2019 Zhu, Liu, Zhang, Wang and Chen. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Oncology
Zhu, Jinhan
Liu, Yimei
Zhang, Jun
Wang, Yixuan
Chen, Lixin
Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer
title Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer
title_full Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer
title_fullStr Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer
title_full_unstemmed Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer
title_short Preliminary Clinical Study of the Differences Between Interobserver Evaluation and Deep Convolutional Neural Network-Based Segmentation of Multiple Organs at Risk in CT Images of Lung Cancer
title_sort preliminary clinical study of the differences between interobserver evaluation and deep convolutional neural network-based segmentation of multiple organs at risk in ct images of lung cancer
topic Oncology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6624788/
https://www.ncbi.nlm.nih.gov/pubmed/31334129
http://dx.doi.org/10.3389/fonc.2019.00627
work_keys_str_mv AT zhujinhan preliminaryclinicalstudyofthedifferencesbetweeninterobserverevaluationanddeepconvolutionalneuralnetworkbasedsegmentationofmultipleorgansatriskinctimagesoflungcancer
AT liuyimei preliminaryclinicalstudyofthedifferencesbetweeninterobserverevaluationanddeepconvolutionalneuralnetworkbasedsegmentationofmultipleorgansatriskinctimagesoflungcancer
AT zhangjun preliminaryclinicalstudyofthedifferencesbetweeninterobserverevaluationanddeepconvolutionalneuralnetworkbasedsegmentationofmultipleorgansatriskinctimagesoflungcancer
AT wangyixuan preliminaryclinicalstudyofthedifferencesbetweeninterobserverevaluationanddeepconvolutionalneuralnetworkbasedsegmentationofmultipleorgansatriskinctimagesoflungcancer
AT chenlixin preliminaryclinicalstudyofthedifferencesbetweeninterobserverevaluationanddeepconvolutionalneuralnetworkbasedsegmentationofmultipleorgansatriskinctimagesoflungcancer