
Multi-Reader Multi-Case Studies Using the Area under the Receiver Operator Characteristic Curve as a Measure of Diagnostic Accuracy: Systematic Review with a Focus on Quality of Data Reporting

INTRODUCTION: We examined the design, analysis and reporting in multi-reader multi-case (MRMC) research studies using the area under the receiver operating characteristic curve (ROC AUC) as a measure of diagnostic performance.

METHODS: We performed a systematic literature review from 2005 to 2013 inclusive to identify a minimum of 50 studies. Articles of diagnostic test accuracy in humans were identified via their citation of key methodological articles dealing with MRMC ROC AUC. Two researchers in consensus then extracted information from primary articles relating to study characteristics and design, methods for reporting study outcomes, model fitting, model assumptions, presentation of results, and interpretation of findings. Results were summarized and presented with a descriptive analysis.

RESULTS: Sixty-four full papers were retrieved from 475 identified citations, and ultimately 49 articles describing 51 studies were reviewed and extracted. Radiological imaging was the index test in all. Most studies focused on lesion detection rather than characterization and used fewer than 10 readers. Only 6 (12%) studies trained readers in advance to use the confidence scale used to build the ROC curve. Overall, description of confidence scores, the ROC curve and its analysis was often incomplete. For example, 21 (41%) studies presented no ROC curve and only 3 (6%) described the distribution of confidence scores. Of 30 studies presenting curves, only 4 (13%) presented the data points underlying the curve, thereby allowing assessment of extrapolation. The mean change in AUC was 0.05 (−0.05 to 0.28). Non-significant change in AUC was attributed to underpowering rather than to the diagnostic test failing to improve diagnostic accuracy.

CONCLUSIONS: Data reporting in MRMC studies using ROC AUC as an outcome measure is frequently incomplete, hampering understanding of methods and the reliability of results and study conclusions. Authors using this analysis should be encouraged to provide a full description of their methods and results.
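The review's outcome measure is the AUC of an ROC curve built by thresholding readers' confidence scores. As a purely illustrative sketch (not code or data from the reviewed studies; the scores, labels and function name below are invented), the following Python snippet shows how an empirical ROC curve and its trapezoidal AUC can be computed for a single reader rating cases on a 5-point confidence scale:

```python
# Illustrative only: deriving an empirical ROC curve and its AUC from one reader's
# confidence scores. All data here are made up for demonstration.
import numpy as np

def roc_points(scores, truth):
    """Return (fpr, tpr) arrays obtained by thresholding confidence scores."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    # Use each distinct score as a threshold, from most to least confident.
    fpr, tpr = [0.0], [0.0]
    for t in np.unique(scores)[::-1]:
        called_positive = scores >= t
        tpr.append(called_positive[truth].mean())    # sensitivity at this threshold
        fpr.append(called_positive[~truth].mean())   # 1 - specificity at this threshold
    return np.array(fpr), np.array(tpr)

# Hypothetical 5-point confidence ratings for 10 cases from one reader
scores = [5, 4, 4, 3, 2, 5, 3, 2, 1, 1]
truth  = [1, 1, 0, 1, 0, 1, 0, 0, 0, 1]   # 1 = disease truly present

fpr, tpr = roc_points(scores, truth)
# Trapezoidal rule over the (fpr, tpr) points gives the empirical AUC.
auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
print(f"Empirical ROC AUC for this reader: {auc:.3f}")
```

With only a handful of distinct confidence levels, the curve has few operating points, which is why the review checks whether authors plot the underlying data points and how far the fitted curve is extrapolated beyond them.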


Bibliographic Details
Main Authors: Dendumrongsup, Thaworn, Plumb, Andrew A., Halligan, Steve, Fanshawe, Thomas R., Altman, Douglas G., Mallett, Susan
Format: Online Article Text
Language: English
Published: Public Library of Science, 2014
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4277459/
https://www.ncbi.nlm.nih.gov/pubmed/25541977
http://dx.doi.org/10.1371/journal.pone.0116018
author Dendumrongsup, Thaworn
Plumb, Andrew A.
Halligan, Steve
Fanshawe, Thomas R.
Altman, Douglas G.
Mallett, Susan
collection PubMed
format Online
Article
Text
id pubmed-4277459
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4277459 2014-12-31
PLoS One (Research Article)
Public Library of Science, published online 2014-12-26
/pmc/articles/PMC4277459/ /pubmed/25541977 http://dx.doi.org/10.1371/journal.pone.0116018
Text en © 2014 Dendumrongsup et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Multi-Reader Multi-Case Studies Using the Area under the Receiver Operator Characteristic Curve as a Measure of Diagnostic Accuracy: Systematic Review with a Focus on Quality of Data Reporting
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4277459/
https://www.ncbi.nlm.nih.gov/pubmed/25541977
http://dx.doi.org/10.1371/journal.pone.0116018