Exploration of Analysis Methods for Diagnostic Imaging Tests: Problems with ROC AUC and Confidence Scores in CT Colonography
BACKGROUND: Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography f...
Main authors: | Mallett, Susan; Halligan, Steve; Collins, Gary S.; Altman, Doug G. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4212964/ https://www.ncbi.nlm.nih.gov/pubmed/25353643 http://dx.doi.org/10.1371/journal.pone.0107633 |
_version_ | 1782341774778302464 |
---|---|
author | Mallett, Susan; Halligan, Steve; Collins, Gary S.; Altman, Doug G. |
author_sort | Mallett, Susan |
collection | PubMed |
description | BACKGROUND: Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. METHODS: In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. RESULTS: Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. CONCLUSIONS: The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests. |
format | Online Article Text |
id | pubmed-4212964 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4212964 2014-11-05 Exploration of Analysis Methods for Diagnostic Imaging Tests: Problems with ROC AUC and Confidence Scores in CT Colonography Mallett, Susan Halligan, Steve Collins, Gary S. Altman, Doug G. PLoS One Research Article Public Library of Science 2014-10-29 /pmc/articles/PMC4212964/ /pubmed/25353643 http://dx.doi.org/10.1371/journal.pone.0107633 Text en © 2014 Mallett et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Exploration of Analysis Methods for Diagnostic Imaging Tests: Problems with ROC AUC and Confidence Scores in CT Colonography |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4212964/ https://www.ncbi.nlm.nih.gov/pubmed/25353643 http://dx.doi.org/10.1371/journal.pone.0107633 |
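
The abstract contrasts two ways of summarising one reader's results: sensitivity and specificity from present/absent calls, and ROC AUC from confidence scores. The following is a minimal Python sketch of that contrast and not the authors' analysis code. The function names and the ten-case data set are hypothetical, chosen only for illustration, and the AUC shown is the simple empirical (rank-based) estimate, not the fitted ROC curves the abstract mentions.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(calls: Sequence[bool],
                            truth: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity and specificity from binary present/absent calls."""
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    fn = sum(1 for c, t in zip(calls, truth) if not c and t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    fp = sum(1 for c, t in zip(calls, truth) if c and not t)
    return tp / (tp + fn), tn / (tn + fp)


def empirical_auc(scores: Sequence[float], truth: Sequence[bool]) -> float:
    """Empirical ROC AUC: the proportion of (positive, negative) case pairs
    in which the positive case has the higher score; ties count one half."""
    pos = [s for s, t in zip(scores, truth) if t]
    neg = [s for s, t in zip(scores, truth) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 4 cases with polyps, 6 without.
# A score of 0 means the reader identified no polyp in that case.
truth = [True, True, True, True, False, False, False, False, False, False]
scores = [90, 80, 0, 0, 0, 0, 0, 0, 0, 70]
calls = [s > 0 for s in scores]  # assumption: any non-zero score is a positive call

sens, spec = sensitivity_specificity(calls, truth)
auc = empirical_auc(scores, truth)

# Same reader, but the single false positive is scored with higher confidence.
scores_confident_fp = scores[:-1] + [95]
auc_confident_fp = empirical_auc(scores_confident_fp, truth)

print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
print(f"AUC={auc:.3f} AUC with confident false positive={auc_confident_fp:.3f}")
```

In this toy data most cases share a score of zero, so many positive/negative pairs are tied, and re-scoring one false positive changes the AUC while sensitivity and specificity stay the same. This mirrors, on a very small scale, the zero-score and false-positive issues the abstract lists.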