
Are two readers more reliable than one? A study of upper neck ligament scoring on magnetic resonance images

BACKGROUND: Magnetic resonance imaging (MRI) studies typically employ either a single expert or multiple readers in collaboration to evaluate (read) the image results. However, no study has examined whether evaluations from multiple readers provide more reliable results than a single reader. We exam...


Bibliographic Details
Main authors: Espeland, Ansgar; Vetti, Nils; Kråkenes, Jostein
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3626747/
https://www.ncbi.nlm.nih.gov/pubmed/23327567
http://dx.doi.org/10.1186/1471-2342-13-4
_version_ 1782266241547763712
author Espeland, Ansgar
Vetti, Nils
Kråkenes, Jostein
author_facet Espeland, Ansgar
Vetti, Nils
Kråkenes, Jostein
author_sort Espeland, Ansgar
collection PubMed
description BACKGROUND: Magnetic resonance imaging (MRI) studies typically employ either a single expert or multiple readers in collaboration to evaluate (read) the image results. However, no study has examined whether evaluations from multiple readers provide more reliable results than a single reader. We examined whether consistency in image interpretation by a single expert might be equal to the consistency of combined readings, defined as independent interpretations by two readers, where cases of disagreement were reconciled by consensus.
METHODS: One expert neuroradiologist and one trained radiology resident independently evaluated 102 MRIs of the upper neck. The signal intensities of the alar and transverse ligaments were scored 0, 1, 2, or 3. Disagreements were resolved by consensus. They repeated the grading process after 3–8 months (second evaluation). We used kappa statistics and intraclass correlation coefficients (ICCs) to assess agreement between the initial and second evaluations for each radiologist and for combined determinations. Disagreements on score prevalence were evaluated with McNemar’s test.
RESULTS: Higher consistency between the initial and second evaluations was obtained with the combined readings than with individual readings for signal intensity scores of ligaments on both the right and left sides of the spine. The weighted kappa ranges were 0.65–0.71 vs. 0.48–0.62 for combined vs. individual scoring, respectively. The combined scores also showed better agreement between evaluations than individual scores for the presence of grade 2–3 signal intensities on any side in a given subject (unweighted kappa 0.69–0.74 vs. 0.52–0.63, respectively). Disagreement between the initial and second evaluations on the prevalence of grades 2–3 was less marked for combined scores than for individual scores (P ≥ 0.039 vs. P ≤ 0.004, respectively). ICCs indicated a more reliable sum score per patient for combined scores (0.74) and both readers’ average scores (0.78) than for individual scores (0.55–0.69).
CONCLUSIONS: This study was the first to provide empirical support for the principle that an additional reader can improve the reproducibility of MRI interpretations compared to one expert alone. Furthermore, even a moderately experienced second reader improved the reliability compared to a single expert reader. The implications of this for clinical work require further study.
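The agreement statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the abstract does not state which kappa weighting scheme or which form of McNemar's test was used, so the linear/quadratic weights and the exact binomial test below are assumptions, and the function names and the example scores are hypothetical.

```python
from math import comb


def cohen_kappa(first, second, categories=(0, 1, 2, 3), weights=None):
    """Cohen's kappa between two sets of scores for the same cases.

    weights: None (unweighted), "linear", or "quadratic" (assumed schemes).
    """
    if weights not in (None, "linear", "quadratic"):
        raise ValueError("weights must be None, 'linear', or 'quadratic'")
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("score lists must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Contingency table of proportions: first evaluation x second evaluation.
    table = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        table[index[a]][index[b]] += 1.0 / n
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Penalty for scoring category i first and category j second.
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(disagreement(i, j) * table[i][j] for i, j in cells)
    expected = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells)
    if expected == 0:
        return float("nan")  # kappa is undefined when only one category occurs
    return 1.0 - observed / expected


def discordant_pairs(first, second, threshold=2):
    """Count cases graded >= threshold in only the first or only the second evaluation."""
    b = sum(1 for x, y in zip(first, second) if x >= threshold and y < threshold)
    c = sum(1 for x, y in zip(first, second) if x < threshold and y >= threshold)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact (binomial) McNemar p-value from discordant pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical scores for ten ligaments at the initial and second evaluation.
    initial = [0, 1, 1, 2, 3, 0, 2, 1, 0, 3]
    repeat = [0, 1, 2, 2, 3, 1, 1, 1, 0, 2]
    print("weighted kappa:", cohen_kappa(initial, repeat, weights="linear"))
    print("McNemar p:", mcnemar_exact(*discordant_pairs(initial, repeat)))
```

Weighted kappa gives partial credit when two scores differ by one grade, which suits the ordinal 0–3 scale; the unweighted form fits the dichotomized "grade 2–3 present" comparison, and McNemar's test on the same dichotomy checks whether prevalence shifted between the two evaluations.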
format Online
Article
Text
id pubmed-3626747
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3626747 2013-04-16
Are two readers more reliable than one? A study of upper neck ligament scoring on magnetic resonance images
Espeland, Ansgar
Vetti, Nils
Kråkenes, Jostein
BMC Med Imaging
Research Article
BACKGROUND: Magnetic resonance imaging (MRI) studies typically employ either a single expert or multiple readers in collaboration to evaluate (read) the image results. However, no study has examined whether evaluations from multiple readers provide more reliable results than a single reader. We examined whether consistency in image interpretation by a single expert might be equal to the consistency of combined readings, defined as independent interpretations by two readers, where cases of disagreement were reconciled by consensus.
METHODS: One expert neuroradiologist and one trained radiology resident independently evaluated 102 MRIs of the upper neck. The signal intensities of the alar and transverse ligaments were scored 0, 1, 2, or 3. Disagreements were resolved by consensus. They repeated the grading process after 3–8 months (second evaluation). We used kappa statistics and intraclass correlation coefficients (ICCs) to assess agreement between the initial and second evaluations for each radiologist and for combined determinations. Disagreements on score prevalence were evaluated with McNemar’s test.
RESULTS: Higher consistency between the initial and second evaluations was obtained with the combined readings than with individual readings for signal intensity scores of ligaments on both the right and left sides of the spine. The weighted kappa ranges were 0.65–0.71 vs. 0.48–0.62 for combined vs. individual scoring, respectively. The combined scores also showed better agreement between evaluations than individual scores for the presence of grade 2–3 signal intensities on any side in a given subject (unweighted kappa 0.69–0.74 vs. 0.52–0.63, respectively). Disagreement between the initial and second evaluations on the prevalence of grades 2–3 was less marked for combined scores than for individual scores (P ≥ 0.039 vs. P ≤ 0.004, respectively). ICCs indicated a more reliable sum score per patient for combined scores (0.74) and both readers’ average scores (0.78) than for individual scores (0.55–0.69).
CONCLUSIONS: This study was the first to provide empirical support for the principle that an additional reader can improve the reproducibility of MRI interpretations compared to one expert alone. Furthermore, even a moderately experienced second reader improved the reliability compared to a single expert reader. The implications of this for clinical work require further study.
BioMed Central 2013-01-17
/pmc/articles/PMC3626747/
/pubmed/23327567
http://dx.doi.org/10.1186/1471-2342-13-4
Text en
Copyright © 2013 Espeland et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Espeland, Ansgar
Vetti, Nils
Kråkenes, Jostein
Are two readers more reliable than one? A study of upper neck ligament scoring on magnetic resonance images
title Are two readers more reliable than one? A study of upper neck ligament scoring on magnetic resonance images
title_full Are two readers more reliable than one? A study of upper neck ligament scoring on magnetic resonance images
title_fullStr Are two readers more reliable than one? A study of upper neck ligament scoring on magnetic resonance images
title_full_unstemmed Are two readers more reliable than one? A study of upper neck ligament scoring on magnetic resonance images
title_short Are two readers more reliable than one? A study of upper neck ligament scoring on magnetic resonance images
title_sort are two readers more reliable than one? a study of upper neck ligament scoring on magnetic resonance images
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3626747/
https://www.ncbi.nlm.nih.gov/pubmed/23327567
http://dx.doi.org/10.1186/1471-2342-13-4
work_keys_str_mv AT espelandansgar aretworeadersmorereliablethanoneastudyofupperneckligamentscoringonmagneticresonanceimages
AT vettinils aretworeadersmorereliablethanoneastudyofupperneckligamentscoringonmagneticresonanceimages
AT krakenesjostein aretworeadersmorereliablethanoneastudyofupperneckligamentscoringonmagneticresonanceimages