Label-set impact on deep learning-based prostate segmentation on MRI

Bibliographic Details
Main Authors: Meglič, Jakob, Sunoqrot, Mohammed R. S., Bathen, Tone Frost, Elschot, Mattijs
Format: Online Article Text
Language: English
Published: Springer Vienna 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10519913/
https://www.ncbi.nlm.nih.gov/pubmed/37749333
http://dx.doi.org/10.1186/s13244-023-01502-w
Description
Summary:
BACKGROUND: Prostate segmentation is an essential step in computer-aided detection and diagnosis systems for prostate cancer. Deep learning (DL)-based methods provide good performance for prostate gland and zone segmentation, but little is known about the impact of manual segmentation (that is, label) selection on their performance. In this work, we investigated these effects by obtaining two different expert label-sets for the PROSTATEx I challenge training dataset (n = 198) and using them, in addition to an in-house dataset (n = 233), to assess the effect on segmentation performance. The automatic segmentation method we used was nnU-Net.
RESULTS: The selection of training/testing label-set had a significant (p < 0.001) impact on model performance. Furthermore, model performance was significantly (p < 0.001) higher when the model was trained and tested with the same label-set. Moreover, agreement between automatic segmentations was significantly (p < 0.0001) higher than agreement between manual segmentations, and the models were able to outperform the human label-sets used to train them.
CONCLUSIONS: We investigated the impact of label-set selection on the performance of a DL-based prostate segmentation model. We found that the use of different sets of manual prostate gland and zone segmentations has a measurable impact on model performance. Nevertheless, DL-based segmentation appeared to have greater inter-reader agreement than manual segmentation. More thought should be given to the label-set, with a focus on multicenter manual segmentation and agreement on common procedures.
CRITICAL RELEVANCE STATEMENT: Label-set selection significantly impacts the performance of a deep learning-based prostate segmentation model. Models using different label-sets showed higher agreement than manual segmentations.
KEY POINTS:
• Label-set selection has a significant impact on the performance of automatic segmentation models.
• Deep learning-based models demonstrated true learning rather than simply mimicking the label-set.
• Automatic segmentation appears to have a greater inter-reader agreement than manual segmentation.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13244-023-01502-w.
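Illustrative note: the summary compares agreement between manual segmentations with agreement between automatic segmentations, but does not name the agreement metric. The sketch below assumes the Dice similarity coefficient, a common overlap measure for segmentation masks, purely for illustration. The function name `dice_coefficient` and the toy masks are hypothetical and are not taken from the study.

```python
from typing import Sequence


def dice_coefficient(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks.

    Assumption: Dice is used here only as an example overlap metric;
    the summary above does not state which metric the authors used.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        # Both masks empty: treated as perfect agreement by convention.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


# Toy flattened masks (hypothetical data, not from the study).
reader_1 = [0, 1, 1, 1, 0, 0]  # manual label-set A
reader_2 = [0, 0, 1, 1, 1, 0]  # manual label-set B
model_1 = [0, 1, 1, 1, 0, 0]   # output of a model trained on label-set A
model_2 = [0, 1, 1, 1, 1, 0]   # output of a model trained on label-set B

manual_agreement = dice_coefficient(reader_1, reader_2)
automatic_agreement = dice_coefficient(model_1, model_2)
print(f"manual vs manual: {manual_agreement:.3f}")
print(f"model vs model:   {automatic_agreement:.3f}")
```

In a real evaluation the masks would be 3D volumes loaded from image files and the comparison would be repeated per case before any statistical testing; the flattened lists here only show the shape of the calculation.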