Cargando…

Computer-aided diagnosis of renal obstruction: utility of log-linear modeling versus standard ROC and kappa analysis

BACKGROUND: The accuracy of computer-aided diagnosis (CAD) software is best evaluated by comparison to a gold standard which represents the true status of disease. In many settings, however, knowledge of the true status of disease is not possible and accuracy is evaluated against the interpretations...

Descripción completa

Detalles Bibliográficos
Autores principales: Manatunga, Amita K, G Binongo, José Nilo, Taylor, Andrew T
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Springer 2011
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3175375/
https://www.ncbi.nlm.nih.gov/pubmed/21935501
http://dx.doi.org/10.1186/2191-219X-1-5
Descripción
Sumario:BACKGROUND: The accuracy of computer-aided diagnosis (CAD) software is best evaluated by comparison to a gold standard which represents the true status of disease. In many settings, however, knowledge of the true status of disease is not possible and accuracy is evaluated against the interpretations of an expert panel. Common statistical approaches to evaluate accuracy include receiver operating characteristic (ROC) and kappa analysis but both of these methods have significant limitations and cannot answer the question of equivalence: Is the CAD performance equivalent to that of an expert? The goal of this study is to show the strength of log-linear analysis over standard ROC and kappa statistics in evaluating the accuracy of computer-aided diagnosis of renal obstruction compared to the diagnosis provided by expert readers. METHODS: Log-linear modeling was utilized to analyze a previously published database that used ROC and kappa statistics to compare diuresis renography scan interpretations (non-obstructed, equivocal, or obstructed) generated by a renal expert system (RENEX) in 185 kidneys (95 patients) with the independent and consensus scan interpretations of three experts who were blinded to clinical information and prospectively and independently graded each kidney as obstructed, equivocal, or non-obstructed. RESULTS: Log-linear modeling showed that RENEX and the expert consensus had beyond-chance agreement in both non-obstructed and obstructed readings (both p < 0.0001). Moreover, pairwise agreement between experts and pairwise agreement between each expert and RENEX were not significantly different (p = 0.41, 0.95, 0.81 for the non-obstructed, equivocal, and obstructed categories, respectively). Similarly, the three-way agreement of the three experts and three-way agreement of two experts and RENEX was not significantly different for non-obstructed (p = 0.79) and obstructed (p = 0.49) categories. CONCLUSION: Log-linear modeling showed that RENEX was equivalent to any expert in rating kidneys, particularly in the obstructed and non-obstructed categories. This conclusion, which could not be derived from the original ROC and kappa analysis, emphasizes and illustrates the role and importance of log-linear modeling in the absence of a gold standard. The log-linear analysis also provides additional evidence that RENEX has the potential to assist in the interpretation of diuresis renography studies.