Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading

BACKGROUND: We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting...

Bibliographic Details
Main authors: Fanshawe, Thomas R., Lynch, Andrew G., Ellis, Ian O., Green, Andrew R., Hanka, Rudolf
Format: Text
Language: English
Published: Public Library of Science 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2488396/
https://www.ncbi.nlm.nih.gov/pubmed/18698346
http://dx.doi.org/10.1371/journal.pone.0002925
author Fanshawe, Thomas R.
Lynch, Andrew G.
Ellis, Ian O.
Green, Andrew R.
Hanka, Rudolf
author_sort Fanshawe, Thomas R.
collection PubMed
description BACKGROUND: We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24177 grades, on a discrete 1–3 scale, provided by 732 pathologists for 52 samples.
METHODOLOGY/PRINCIPAL FINDINGS: We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1–2 and 2–3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively ‘easy’ set of samples.
CONCLUSIONS/SIGNIFICANCE: Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the ‘true’ grade of many of the breast cancer tumours, a fact often ignored in clinical studies.
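The first method in the abstract, the non-chance-corrected agreement score, can be illustrated with a minimal sketch. This is not the authors' code: it assumes the "consensus" for a sample is the modal grade among the raters who graded it (including the rater being scored), that ties go to the lower grade, and that a rater's score is the proportion of the samples they actually graded on which they match that consensus. The function names and the toy data are hypothetical.

```python
from collections import Counter


def consensus_grades(ratings):
    """Return the modal grade per sample.

    ratings maps rater -> {sample: grade}; a missing rating is simply
    an absent key. Ties are broken towards the lower grade (an
    assumption made for this sketch only).
    """
    by_sample = {}
    for grades in ratings.values():
        for sample, grade in grades.items():
            by_sample.setdefault(sample, []).append(grade)

    consensus = {}
    for sample, grades in by_sample.items():
        counts = Counter(grades)
        top = max(counts.values())
        consensus[sample] = min(g for g, c in counts.items() if c == top)
    return consensus


def agreement_scores(ratings):
    """Proportion of each rater's graded samples that match the consensus.

    Samples a rater did not grade are skipped, so no allowance is made
    for which samples happen to be missing.
    """
    consensus = consensus_grades(ratings)
    scores = {}
    for rater, grades in ratings.items():
        if not grades:
            continue  # a rater with no grades has no defined score
        hits = sum(1 for s, g in grades.items() if g == consensus[s])
        scores[rater] = hits / len(grades)
    return scores


# Toy, made-up data: four raters, three samples, grades on a 1-3 scale.
ratings = {
    "A": {"s1": 1, "s2": 2, "s3": 3},
    "B": {"s1": 1, "s2": 3},
    "C": {"s1": 2, "s2": 2, "s3": 3},
    "D": {"s3": 2},
}
scores = agreement_scores(ratings)
```

Because each score is computed only over the samples a rater happened to grade, a rater who sees mostly clear-cut samples will tend to score higher than one who sees ambiguous ones, which is the upward bias the abstract says the Bayesian latent trait model addresses.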
format Text
id pubmed-2488396
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2488396 2008-08-13 Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading Fanshawe, Thomas R. Lynch, Andrew G. Ellis, Ian O. Green, Andrew R. Hanka, Rudolf PLoS One Research Article Public Library of Science 2008-08-13 /pmc/articles/PMC2488396/ /pubmed/18698346 http://dx.doi.org/10.1371/journal.pone.0002925 Text en Fanshawe et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading
topic Research Article