On Clinical Agreement on the Visibility and Extent of Anatomical Layers in Digital Gonio Photographs
Main authors: | , , , , , , , , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | The Association for Research in Vision and Ophthalmology, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8419881/ https://www.ncbi.nlm.nih.gov/pubmed/34468695 http://dx.doi.org/10.1167/tvst.10.11.1 |
Summary: | PURPOSE: To quantitatively evaluate the inter-annotator variability of clinicians tracing the contours of anatomical layers of the iridocorneal angle on digital gonio photographs, thus providing a baseline for the validation of automated analysis algorithms. METHODS: Using a software annotation tool on a common set of 20 images, five experienced ophthalmologists highlighted the contours of five anatomical layers of interest: iris root (IR), ciliary body band (CBB), scleral spur (SS), trabecular meshwork (TM), and cornea (C). Inter-annotator variability was assessed by (1) comparing the number of times ophthalmologists delineated each layer in the dataset; (2) quantifying how the consensus area for each layer (i.e., the intersection area of observers’ delineations) varied with the consensus threshold; and (3) calculating agreement among annotators using average per-layer precision, sensitivity, and Dice score. RESULTS: The SS showed the largest difference in annotation frequency (31%) and the lowest overall agreement in terms of consensus size (∼28% of the labeled pixels). The average per-layer statistics of the annotators showed consistent patterns, with lower agreement on the CBB and SS (average Dice score ranges of 0.61–0.7 and 0.73–0.78, respectively) and better agreement on the IR, TM, and C (average Dice score ranges of 0.97–0.98, 0.84–0.9, and 0.93–0.96, respectively). CONCLUSIONS: There was considerable inter-annotator variation in identifying the contours of some anatomical layers in digital gonio photographs. Our pilot indicates that agreement was best on the IR, TM, and C but poorer on the CBB and SS. TRANSLATIONAL RELEVANCE: This study provides a comprehensive description of inter-annotator agreement on the segmentation of digital gonio photographs as a baseline for validating deep learning models for automated gonioscopy. |
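The agreement measures named in the summary (precision, sensitivity, Dice score, and consensus area as a function of the consensus threshold) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each annotation is a set of `(row, col)` pixel coordinates, that one annotator is treated as the reference when computing precision and sensitivity, and that the consensus threshold `k` means "labeled by at least `k` annotators". All function names and the toy masks are hypothetical.

```python
from collections import Counter


def dice(a, b):
    """Dice score between two pixel sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # assumption: two empty masks count as full agreement
    return 2 * len(a & b) / (len(a) + len(b))


def precision(pred, ref):
    """Fraction of pixels in `pred` that are also in the reference `ref`."""
    return len(pred & ref) / len(pred) if pred else 0.0


def sensitivity(pred, ref):
    """Fraction of reference pixels that `pred` also contains."""
    return len(pred & ref) / len(ref) if ref else 0.0


def consensus_fraction(masks, k):
    """Share of labeled pixels that at least `k` annotators marked.

    `masks` is a list of pixel sets, one per annotator. "Labeled" means
    marked by at least one annotator (an assumption about the denominator).
    """
    counts = Counter(pixel for mask in masks for pixel in mask)
    if not counts:
        return 0.0
    agreed = sum(1 for n in counts.values() if n >= k)
    return agreed / len(counts)


# Toy masks for one layer from three hypothetical annotators.
a = {(0, 0), (0, 1), (1, 0), (1, 1)}
b = {(0, 1), (1, 1), (2, 1)}
c = {(1, 0), (1, 1)}

print(dice(a, b))                       # 2*2 / (4+3) = 4/7
print(precision(b, a))                  # 2/3
print(sensitivity(b, a))                # 2/4 = 0.5
print(consensus_fraction([a, b, c], 3)) # 1 of 5 labeled pixels = 0.2
```

Raising `k` from 1 to the number of annotators shrinks the consensus area from the union of all delineations to their intersection, which is the kind of curve item (2) of the methods describes.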