E pluribus unum: prospective acceptability benchmarking from the Contouring Collaborative for Consensus in Radiation Oncology crowdsourced initiative for multiobserver segmentation
Main Authors:
Format: Online Article Text
Language: English
Published: Society of Photo-Optical Instrumentation Engineers, 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9907021/ | https://www.ncbi.nlm.nih.gov/pubmed/36761036 | http://dx.doi.org/10.1117/1.JMI.10.S1.S11903
Summary: PURPOSE: Contouring Collaborative for Consensus in Radiation Oncology (C3RO) is a crowdsourced challenge engaging radiation oncologists across various expertise levels in segmentation. An obstacle to artificial intelligence (AI) development is the paucity of multiexpert datasets; consequently, we sought to characterize whether aggregate segmentations generated from multiple nonexperts could meet or exceed recognized expert agreement.

APPROACH: Participants who contoured ≥1 region of interest (ROI) for the breast, sarcoma, head and neck (H&N), gynecologic (GYN), or gastrointestinal (GI) cases were identified as a nonexpert or recognized expert. Cohort-specific ROIs were combined into single simultaneous truth and performance level estimation (STAPLE) consensus segmentations. STAPLE_nonexpert ROIs were evaluated against STAPLE_expert contours using the Dice similarity coefficient (DSC). The expert interobserver DSC (IODSC_expert) was calculated as an acceptability threshold between STAPLE_expert and each individual expert contour. To determine the number of nonexperts required to match the IODSC_expert for each ROI, a single consensus contour was generated using variable numbers of nonexperts and then compared to the IODSC_expert.

RESULTS: For all cases, the DSC values for STAPLE_nonexpert versus STAPLE_expert were higher than the comparator expert IODSC_expert for most ROIs. The minimum number of nonexpert segmentations needed for a consensus ROI to meet the IODSC_expert acceptability criterion ranged between 2 and 4 for breast, 3 and 5 for sarcoma, 3 and 5 for H&N, 3 and 5 for GYN, and 3 for GI.

CONCLUSIONS: Multiple nonexpert-generated consensus ROIs met or exceeded expert-derived acceptability thresholds. Five nonexperts could potentially generate consensus segmentations for most ROIs with performance approximating that of experts, suggesting nonexpert segmentations are feasible, cost-effective AI inputs.
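The two quantitative building blocks of the study, the Dice similarity coefficient, DSC(A, B) = 2|A ∩ B| / (|A| + |B|), and the fusion of multiple rater contours into a single consensus mask, can be illustrated with a short sketch. The Python snippet below is a minimal illustration, not the authors' code: it computes DSC between two binary masks and uses simple majority voting as a stand-in for STAPLE, which is an expectation-maximization estimator that additionally weights raters by their estimated performance. All function names and the synthetic masks here are hypothetical.

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

def majority_vote(masks: list) -> np.ndarray:
    """Consensus stand-in for STAPLE: voxel is foreground if most raters label it."""
    stack = np.stack([m.astype(bool) for m in masks])
    return stack.mean(axis=0) >= 0.5

# Hypothetical example: three noisy "nonexpert" masks vs. a reference contour.
rng = np.random.default_rng(0)
reference = np.zeros((64, 64), dtype=bool)
reference[16:48, 16:48] = True  # synthetic ROI
nonexperts = [
    np.logical_xor(reference, rng.random(reference.shape) < 0.05)  # 5% label noise
    for _ in range(3)
]
consensus = majority_vote(nonexperts)
print(f"DSC(consensus, reference) = {dice(consensus, reference):.3f}")
```

Majority voting is the simplest fusion rule; true STAPLE iteratively estimates each rater's sensitivity and specificity and weights votes accordingly, so consensus quality from a small nonexpert pool, as measured here, would typically be at least as good as this sketch suggests.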