Exploration of Analysis Methods for Diagnostic Imaging Tests: Problems with ROC AUC and Confidence Scores in CT Colonography

BACKGROUND: Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography f...

Bibliographic Details
Main authors: Mallett, Susan; Halligan, Steve; Collins, Gary S.; Altman, Doug G.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4212964/
https://www.ncbi.nlm.nih.gov/pubmed/25353643
http://dx.doi.org/10.1371/journal.pone.0107633
author Mallett, Susan
Halligan, Steve
Collins, Gary S.
Altman, Doug G.
collection PubMed
description BACKGROUND: Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. METHODS: In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. RESULTS: Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. CONCLUSIONS: The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.
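The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the ten cases and their scores below are invented for illustration, the function names are hypothetical, and the empirical (Mann-Whitney) AUC is used as a simple stand-in for the ROC methods compared in the study. It assumes a 0-100 confidence score in which 0 means no polyp was reported.

```python
def sensitivity_specificity(truth, calls):
    """Sensitivity and specificity from binary present/absent calls."""
    tp = sum(1 for t, c in zip(truth, calls) if t and c)
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)
    return tp / (tp + fn), tn / (tn + fp)


def empirical_auc(truth, scores):
    """Empirical AUC: fraction of (positive, negative) case pairs in which
    the positive case has the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: 4 cases with polyps, 6 without (reference standard).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Scores of 0 mean "no polyp reported", so most cases are tied at zero.
# The two score sets differ only in the confidence given to the single
# false positive (last case): 95 versus 10.
scores_high_fp = [90, 80, 0, 0, 0, 0, 0, 0, 0, 95]
scores_low_fp = [90, 80, 0, 0, 0, 0, 0, 0, 0, 10]

# The binary report is identical in both cases: any non-zero score is a call.
calls = [s > 0 for s in scores_high_fp]
sens, spec = sensitivity_specificity(truth, calls)

auc_high = empirical_auc(truth, scores_high_fp)
auc_low = empirical_auc(truth, scores_low_fp)

print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
print(f"AUC (false positive scored 95)={auc_high:.3f}")
print(f"AUC (false positive scored 10)={auc_low:.3f}")
```

In this toy data set the sensitivity (0.500) and specificity (0.833) are the same for both score sets, while the AUC moves from 0.625 to 0.708 when only the confidence attached to one false positive changes. That is the kind of sensitivity to a few false positive scores and to zero-score ties that the abstract describes.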
format Online
Article
Text
id pubmed-4212964
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4212964 2014-11-05 Exploration of Analysis Methods for Diagnostic Imaging Tests: Problems with ROC AUC and Confidence Scores in CT Colonography Mallett, Susan Halligan, Steve Collins, Gary S. Altman, Doug G. PLoS One Research Article Public Library of Science 2014-10-29 /pmc/articles/PMC4212964/ /pubmed/25353643 http://dx.doi.org/10.1371/journal.pone.0107633 Text en © 2014 Mallett et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Exploration of Analysis Methods for Diagnostic Imaging Tests: Problems with ROC AUC and Confidence Scores in CT Colonography
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4212964/
https://www.ncbi.nlm.nih.gov/pubmed/25353643
http://dx.doi.org/10.1371/journal.pone.0107633