Failure of a numerical quality assessment scale to identify potential risk of bias in a systematic review: a comparison study

Bibliographic Details
Main Authors: O’Connor, Seán R, Tully, Mark A, Ryan, Brigid, Bradley, Judy M, Baxter, George D, McDonough, Suzanne M
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4467625/
https://www.ncbi.nlm.nih.gov/pubmed/26048813
http://dx.doi.org/10.1186/s13104-015-1181-1
collection PubMed
description BACKGROUND: Assessing the methodological quality of primary studies is an essential component of systematic reviews. Following a systematic review which used a domain-based system [United States Preventive Services Task Force (USPSTF)] to assess methodological quality, a commonly used numerical rating scale (Downs and Black) was also used to evaluate the included studies, and comparisons were made between the quality ratings assigned using the two methods. Both tools were used to assess the 20 randomized and quasi-randomized controlled trials examining an exercise intervention for chronic musculoskeletal pain which were included in the review. Inter-rater reliability and levels of agreement were determined using intraclass correlation coefficients (ICC). The influence of quality on pooled effect size was examined by calculating the between-group standardized mean difference (SMD). RESULTS: Inter-rater reliability indicated at least substantial levels of agreement for the USPSTF system (ICC 0.85; 95% CI 0.66, 0.94) and the Downs and Black scale (ICC 0.94; 95% CI 0.84, 0.97). The overall level of agreement between the tools (ICC 0.80; 95% CI 0.57, 0.92) was also good. However, the USPSTF system identified a number of studies (n = 3/20) as “poor” due to potential risks of bias. Analysis revealed substantially greater pooled effect sizes in these studies (SMD −2.51; 95% CI −4.21, −0.82) compared to those rated as “fair” (SMD −0.45; 95% CI −0.65, −0.25) or “good” (SMD −0.38; 95% CI −0.69, −0.08). CONCLUSIONS: In this example, use of a numerical rating scale failed to identify studies at increased risk of bias, and could potentially have led to imprecise estimates of treatment effect. Although based on a small number of included studies within an existing systematic review, we found that the domain-based system provided a more structured framework by which qualitative decisions concerning overall quality could be made, and was useful for detecting potential sources of bias in the available evidence. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13104-015-1181-1) contains supplementary material, which is available to authorized users.
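For context on the effect measure quoted in the abstract, the between-group standardized mean difference is the difference in group means divided by the pooled standard deviation (Cohen's d). A minimal sketch; the group summary values below are hypothetical illustrations, not data from the review:

```python
import math

def standardized_mean_difference(mean_tx, mean_ctrl, sd_tx, sd_ctrl, n_tx, n_ctrl):
    """Between-group SMD (Cohen's d): mean difference over the pooled SD."""
    pooled_sd = math.sqrt(
        ((n_tx - 1) * sd_tx**2 + (n_ctrl - 1) * sd_ctrl**2) / (n_tx + n_ctrl - 2)
    )
    return (mean_tx - mean_ctrl) / pooled_sd

# Hypothetical pain scores (lower = better), 30 participants per arm:
d = standardized_mean_difference(3.0, 5.0, 2.0, 2.0, 30, 30)
# → -1.0 (intervention group one pooled SD below control)
```

A negative SMD here favors the intervention, which is why the pooled estimates in the abstract (e.g. SMD −0.45 for "fair" studies) are negative.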
format Online Article Text
id pubmed-4467625
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4467625 2015-06-16 Failure of a numerical quality assessment scale to identify potential risk of bias in a systematic review: a comparison study O’Connor, Seán R; Tully, Mark A; Ryan, Brigid; Bradley, Judy M; Baxter, George D; McDonough, Suzanne M. BMC Res Notes, Research Article. BioMed Central 2015-06-06 /pmc/articles/PMC4467625/ /pubmed/26048813 http://dx.doi.org/10.1186/s13104-015-1181-1 Text en © O'Connor et al. 2015. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Failure of a numerical quality assessment scale to identify potential risk of bias in a systematic review: a comparison study
topic Research Article