E pluribus unum: prospective acceptability benchmarking from the Contouring Collaborative for Consensus in Radiation Oncology crowdsourced initiative for multiobserver segmentation


Bibliographic Details
Main Authors: Lin, Diana, Wahid, Kareem A., Nelms, Benjamin E., He, Renjie, Naser, Mohammed A., Duke, Simon, Sherer, Michael V., Christodouleas, John P., Mohamed, Abdallah S. R., Cislo, Michael, Murphy, James D., Fuller, Clifton D., Gillespie, Erin F.
Format: Online Article Text
Language: English
Published: Society of Photo-Optical Instrumentation Engineers, 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9907021/
https://www.ncbi.nlm.nih.gov/pubmed/36761036
http://dx.doi.org/10.1117/1.JMI.10.S1.S11903
Description
Summary: PURPOSE: Contouring Collaborative for Consensus in Radiation Oncology (C3RO) is a crowdsourced challenge engaging radiation oncologists across various expertise levels in segmentation. An obstacle to artificial intelligence (AI) development is the paucity of multiexpert datasets; consequently, we sought to characterize whether aggregate segmentations generated from multiple nonexperts could meet or exceed recognized expert agreement.

APPROACH: Participants who contoured ≥1 region of interest (ROI) for the breast, sarcoma, head and neck (H&N), gynecologic (GYN), or gastrointestinal (GI) cases were identified as a nonexpert or recognized expert. Cohort-specific ROIs were combined into single simultaneous truth and performance level estimation (STAPLE) consensus segmentations. STAPLE_nonexpert ROIs were evaluated against STAPLE_expert contours using the Dice similarity coefficient (DSC). The expert interobserver DSC (IODSC_expert) was calculated as an acceptability threshold between STAPLE_expert and individual expert contours. To determine the number of nonexperts required to match the IODSC_expert for each ROI, a single consensus contour was generated using variable numbers of nonexperts and then compared to the STAPLE_expert.

RESULTS: For all cases, the DSC values for STAPLE_nonexpert versus STAPLE_expert were higher than the comparator expert IODSC_expert for most ROIs. The minimum number of nonexpert segmentations needed for a consensus ROI to achieve the IODSC_expert acceptability criteria ranged between 2 and 4 for breast, 3 and 5 for sarcoma, 3 and 5 for H&N, 3 and 5 for GYN, and 3 for GI.

CONCLUSIONS: Multiple nonexpert-generated consensus ROIs met or exceeded expert-derived acceptability thresholds. Five nonexperts could potentially generate consensus segmentations for most ROIs with performance approximating experts, suggesting nonexpert segmentations as feasible, cost-effective AI inputs.
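
To make the evaluation pipeline concrete, below is a minimal Python sketch, not the authors' code. The dice function follows the standard DSC definition (2|A∩B| / (|A|+|B|)); majority voting is a deliberate simplification of STAPLE, which instead weights each rater by an iteratively estimated performance level; and the names majority_vote, min_nonexperts, nonexpert_masks, staple_expert, and iodsc_expert are hypothetical.

    import numpy as np

    def dice(a, b):
        # DSC = 2*|A ∩ B| / (|A| + |B|) for binary masks.
        a, b = a.astype(bool), b.astype(bool)
        denom = a.sum() + b.sum()
        return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

    def majority_vote(masks):
        # Simplified consensus: a voxel is foreground if more than half
        # of the raters marked it (a stand-in for STAPLE, which weights
        # raters by estimated sensitivity/specificity).
        stack = np.stack([m.astype(bool) for m in masks])
        return stack.mean(axis=0) > 0.5

    def min_nonexperts(nonexpert_masks, staple_expert, iodsc_expert):
        # Smallest nonexpert cohort whose consensus contour reaches the
        # expert interobserver threshold; None if it is never reached.
        for k in range(2, len(nonexpert_masks) + 1):
            consensus = majority_vote(nonexpert_masks[:k])
            if dice(consensus, staple_expert) >= iodsc_expert:
                return k
        return None

    # Hypothetical usage, with nonexpert_masks a list of boolean arrays
    # and staple_expert the expert consensus mask for the same ROI:
    # k = min_nonexperts(nonexpert_masks, staple_expert, iodsc_expert=0.7)

Swapping majority_vote for a true STAPLE implementation (available, for example, in SimpleITK) would track the paper's method more closely; the threshold-search loop itself mirrors the study design of growing the nonexpert cohort until the consensus meets IODSC_expert.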