A121 QUANTIFYING INTER-OBSERVER VARIABILITY IN THE SEGMENTATION OF RECTAL TUMORS IN ENDOSCOPY IMAGES AND ITS EFFECTS ON DEEP LEARNING


Bibliographic Details
Main Authors: Weishaupt, L L, Vuong, T, Thibodeau-Antonacci, A, Garant, A, Singh, K S, Miller, C, Martin, A, Enger, S
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8859391/
http://dx.doi.org/10.1093/jcag/gwab049.120
Description
Summary: BACKGROUND: Tumor delineation in endoscopy images is a crucial part of clinical diagnosis and treatment planning for rectal cancer patients. However, it is challenging to detect tumors and accurately determine their size in these images, especially for inexperienced clinicians. This motivates the need for a standardized, automated segmentation method. While deep learning has proven to be a powerful tool for medical image segmentation, it requires a large quantity of high-quality annotated training data. Since the annotation of endoscopy images is prone to high inter-observer variability, creating a robust, unbiased deep learning model for this task is challenging.

AIMS: To quantify the inter-observer variability in the manual segmentation of tumors in endoscopy images of rectal cancer patients and to investigate an automated approach using deep learning.

METHODS: Three gastrointestinal physicians and radiation oncologists (G1, G2, and G3), each from a different institution, segmented 2833 endoscopy images into tumor and non-tumor regions. The whole-image classifications and the pixel-wise classifications into tumor and non-tumor were compared to quantify the inter-observer variability. Three deep learning architectures (FCN32, U-Net, and SegNet) were trained on the binary contours created by G2; this naive approach tests the effectiveness of ignoring the uncertainty associated with tumor delineation. Finally, G2's segmentations and the deep learning models' predictions were compared against ground-truth labels from G1 and G3, and accuracy, sensitivity, specificity, precision, and F1 scores were computed for images where both segmentations contained tumors.

RESULTS: Deep-learning segmentation took less than 1 second per image, while manual segmentation took approximately 10 seconds per image. There was significant inter-observer variability in the whole-image classifications made by the manual annotators (Figure 1A). The segmentation scores achieved by the deep learning models (SegNet F1: 0.80±0.08) were comparable to the inter-observer variability for the pixel-wise image classification (Figure 1B).

CONCLUSIONS: The large inter-observer variability observed in this study indicates a need for an automated segmentation tool for tumors in endoscopy images of rectal cancer patients. While deep learning models trained on a single observer's labels can segment tumors with an accuracy similar to the inter-observer variability, these models do not accurately reflect the intrinsic uncertainty associated with tumor delineation. In our ongoing studies, we investigate training a model with all observers' contours to reflect the uncertainty associated with the tumor segmentations.

FUNDING AGENCIES: CIHR, NSERC
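The naive training setup described in METHODS (standard segmentation networks fit to binary tumor masks from a single observer, G2) can be illustrated with a short sketch. This is not the authors' implementation: the study's FCN32, U-Net, and SegNet architectures are stood in for by torchvision's generic FCN, and the optimizer, learning rate, and tensor shapes are illustrative assumptions.

```python
import torch
from torch import nn
from torchvision.models.segmentation import fcn_resnet50

# Generic FCN stand-in (the study's FCN32/U-Net/SegNet are not
# reproduced here); one output channel for the binary tumor mask.
model = fcn_resnet50(weights=None, weights_backbone=None, num_classes=1)
loss_fn = nn.BCEWithLogitsLoss()   # pixel-wise binary tumor/non-tumor loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed hyperparameters

def train_step(images: torch.Tensor, masks: torch.Tensor) -> float:
    """One optimization step on a batch.

    `images`: (B, 3, H, W) endoscopy frames.
    `masks`:  (B, 1, H, W) binary contours from a single observer
              (G2 in the study), assumed pre-rasterized to tensors.
    """
    model.train()
    optimizer.zero_grad()
    logits = model(images)["out"]    # torchvision FCNs return a dict
    loss = loss_fn(logits, masks.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```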
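The evaluation step compares two binary masks pixel by pixel. Below is a minimal sketch of how the five reported metrics (accuracy, sensitivity, specificity, precision, F1) could be computed from such masks; it assumes boolean NumPy arrays, and the function name is hypothetical.

```python
import numpy as np

def pixelwise_scores(pred: np.ndarray, truth: np.ndarray) -> dict:
    """Compare two binary tumor masks of the same shape (True = tumor)."""
    tp = np.sum(pred & truth)    # tumor in both masks
    tn = np.sum(~pred & ~truth)  # non-tumor in both masks
    fp = np.sum(pred & ~truth)   # tumor only in the prediction
    fn = np.sum(~pred & truth)   # tumor only in the ground truth

    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # a.k.a. recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision   = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    return dict(accuracy=accuracy, sensitivity=sensitivity,
                specificity=specificity, precision=precision, f1=f1)
```

The same routine could score a model's prediction against any one observer's mask, or one observer against another, which is how a single-observer model's F1 can be compared directly to the inter-observer variability reported in the abstract.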