Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading
BACKGROUND: We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting...
Main authors: | Fanshawe, Thomas R., Lynch, Andrew G., Ellis, Ian O., Green, Andrew R., Hanka, Rudolf |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science 2008 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2488396/ https://www.ncbi.nlm.nih.gov/pubmed/18698346 http://dx.doi.org/10.1371/journal.pone.0002925 |
_version_ | 1782158129563172864 |
---|---|
author | Fanshawe, Thomas R. Lynch, Andrew G. Ellis, Ian O. Green, Andrew R. Hanka, Rudolf |
author_facet | Fanshawe, Thomas R. Lynch, Andrew G. Ellis, Ian O. Green, Andrew R. Hanka, Rudolf |
author_sort | Fanshawe, Thomas R. |
collection | PubMed |
description | BACKGROUND: We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24177 grades, on a discrete 1–3 scale, provided by 732 pathologists for 52 samples. METHODOLOGY/PRINCIPAL FINDINGS: We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1–2 and 2–3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively ‘easy’ set of samples. CONCLUSIONS/SIGNIFICANCE: Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the ‘true’ grade of many of the breast cancer tumours, a fact often ignored in clinical studies. |
format | Text |
id | pubmed-2488396 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-2488396 2008-08-13 Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading Fanshawe, Thomas R. Lynch, Andrew G. Ellis, Ian O. Green, Andrew R. Hanka, Rudolf PLoS One Research Article BACKGROUND: We consider the problem of assessing inter-rater agreement when there are missing data and a large number of raters. Previous studies have shown only ‘moderate’ agreement between pathologists in grading breast cancer tumour specimens. We analyse a large but incomplete data-set consisting of 24177 grades, on a discrete 1–3 scale, provided by 732 pathologists for 52 samples. METHODOLOGY/PRINCIPAL FINDINGS: We review existing methods for analysing inter-rater agreement for multiple raters and demonstrate two further methods. Firstly, we examine a simple non-chance-corrected agreement score based on the observed proportion of agreements with the consensus for each sample, which makes no allowance for missing data. Secondly, treating grades as lying on a continuous scale representing tumour severity, we use a Bayesian latent trait method to model cumulative probabilities of assigning grade values as functions of the severity and clarity of the tumour and of rater-specific parameters representing boundaries between grades 1–2 and 2–3. We simulate from the fitted model to estimate, for each rater, the probability of agreement with the majority. Both methods suggest that there are differences between raters in terms of rating behaviour, most often caused by consistent over- or under-estimation of the grade boundaries, and also considerable variability in the distribution of grades assigned to many individual samples. The Bayesian model addresses the tendency of the agreement score to be biased upwards for raters who, by chance, see a relatively ‘easy’ set of samples. CONCLUSIONS/SIGNIFICANCE: Latent trait models can be adapted to provide novel information about the nature of inter-rater agreement when the number of raters is large and there are missing data. In this large study there is substantial variability between pathologists and uncertainty in the identity of the ‘true’ grade of many of the breast cancer tumours, a fact often ignored in clinical studies. Public Library of Science 2008-08-13 /pmc/articles/PMC2488396/ /pubmed/18698346 http://dx.doi.org/10.1371/journal.pone.0002925 Text en Fanshawe et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Fanshawe, Thomas R. Lynch, Andrew G. Ellis, Ian O. Green, Andrew R. Hanka, Rudolf Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading |
title | Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading |
title_full | Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading |
title_fullStr | Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading |
title_full_unstemmed | Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading |
title_short | Assessing Agreement between Multiple Raters with Missing Rating Information, Applied to Breast Cancer Tumour Grading |
title_sort | assessing agreement between multiple raters with missing rating information, applied to breast cancer tumour grading |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2488396/ https://www.ncbi.nlm.nih.gov/pubmed/18698346 http://dx.doi.org/10.1371/journal.pone.0002925 |
work_keys_str_mv | AT fanshawethomasr assessingagreementbetweenmultipleraterswithmissingratinginformationappliedtobreastcancertumourgrading AT lynchandrewg assessingagreementbetweenmultipleraterswithmissingratinginformationappliedtobreastcancertumourgrading AT ellisiano assessingagreementbetweenmultipleraterswithmissingratinginformationappliedtobreastcancertumourgrading AT greenandrewr assessingagreementbetweenmultipleraterswithmissingratinginformationappliedtobreastcancertumourgrading AT hankarudolf assessingagreementbetweenmultipleraterswithmissingratinginformationappliedtobreastcancertumourgrading |
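
The first method summarised in the description (the observed proportion of agreements with the per-sample consensus, with no chance correction) can be illustrated with a minimal Python sketch. This is not the authors' code: the data layout, the function names `consensus_grades` and `agreement_scores`, the inclusion of a rater's own grade when forming the consensus, and the tie-breaking rule are all assumptions made for illustration, and the toy ratings are invented.

```python
from collections import Counter
from typing import Dict, Hashable, Optional

# Assumed layout: ratings[rater][sample] = grade in {1, 2, 3}.
# A missing rating is represented by the sample key simply being absent.
Ratings = Dict[Hashable, Dict[Hashable, int]]


def consensus_grades(ratings: Ratings) -> Dict[Hashable, int]:
    """Modal grade for each sample, using only the ratings actually observed."""
    counts: Dict[Hashable, Counter] = {}
    for grades in ratings.values():
        for sample, grade in grades.items():
            counts.setdefault(sample, Counter())[grade] += 1
    # Assumption: ties are broken in favour of the lower grade.
    return {s: min(c, key=lambda g: (-c[g], g)) for s, c in counts.items()}


def agreement_scores(ratings: Ratings) -> Dict[Hashable, Optional[float]]:
    """Per-rater proportion of rated samples that match the consensus grade.

    No chance correction is applied, and the denominator is the number of
    samples the rater actually graded, so missing ratings are just skipped.
    """
    consensus = consensus_grades(ratings)
    scores: Dict[Hashable, Optional[float]] = {}
    for rater, grades in ratings.items():
        if not grades:
            scores[rater] = None  # rater graded nothing: score undefined
            continue
        hits = sum(1 for s, g in grades.items() if g == consensus[s])
        scores[rater] = hits / len(grades)
    return scores


# Invented toy data: four raters, three samples, some ratings missing.
toy = {
    "A": {"s1": 1, "s2": 2, "s3": 3},
    "B": {"s1": 1, "s2": 3},
    "C": {"s1": 2, "s2": 2, "s3": 3},
    "D": {"s3": 2},
}
consensus = consensus_grades(toy)
scores = agreement_scores(toy)
```

Because each rater's denominator is only the samples that rater happened to see, a rater who by chance graded mostly unambiguous samples scores higher than one who graded harder ones. This is the upward bias the description says the Bayesian latent trait model is meant to address.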