How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs

OBJECTIVES: To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. DESIGN: Vignettes study. SETTING: 200 primary care vignettes. INTERVENTION/COMPARATOR: For eight apps and seven general practitioners...

Bibliographic Details
Main authors: Gilbert, Stephen, Mehl, Alicia, Baluch, Adel, Cawley, Caoimhe, Challiner, Jean, Fraser, Hamish, Millen, Elizabeth, Montazeri, Maryam, Multmeier, Jan, Pick, Fiona, Richter, Claudia, Türk, Ewelina, Upadhyay, Shubhanan, Virani, Vishaal, Vona, Nicola, Wicks, Paul, Novorol, Claire
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2020
Subjects: Diagnostics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7745523/
https://www.ncbi.nlm.nih.gov/pubmed/33328258
http://dx.doi.org/10.1136/bmjopen-2020-040269
author Gilbert, Stephen
Mehl, Alicia
Baluch, Adel
Cawley, Caoimhe
Challiner, Jean
Fraser, Hamish
Millen, Elizabeth
Montazeri, Maryam
Multmeier, Jan
Pick, Fiona
Richter, Claudia
Türk, Ewelina
Upadhyay, Shubhanan
Virani, Vishaal
Vona, Nicola
Wicks, Paul
Novorol, Claire
collection PubMed
description OBJECTIVES: To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. DESIGN: Vignettes study. SETTING: 200 primary care vignettes. INTERVENTION/COMPARATOR: For eight apps and seven general practitioners (GPs): breadth of coverage and condition-suggestion and urgency advice accuracy measured against the vignettes’ gold standard. PRIMARY OUTCOME MEASURES: (1) Proportion of conditions ‘covered’ by an app, that is, not excluded because the user was too young/old or pregnant, or not modelled; (2) proportion of vignettes with the correct primary diagnosis among the top 3 conditions suggested; (3) proportion of ‘safe’ urgency advice (ie, at gold standard level, more conservative, or no more than one level less conservative). RESULTS: Condition-suggestion coverage was highly variable, with some apps not offering a suggestion for many users: in alphabetical order, Ada: 99.0%; Babylon: 51.5%; Buoy: 88.5%; K Health: 74.5%; Mediktor: 80.5%; Symptomate: 61.5%; Your.MD: 64.5%; WebMD: 93.0%. Top-3 suggestion accuracy was GPs (average): 82.1%±5.2%; Ada: 70.5%; Babylon: 32.0%; Buoy: 43.0%; K Health: 36.0%; Mediktor: 36.0%; Symptomate: 27.5%; WebMD: 35.5%; Your.MD: 23.5%. Some apps excluded certain user demographics or conditions and their performance was generally greater with the exclusion of corresponding vignettes. For safe urgency advice, tested GPs had an average of 97.0%±2.5%. For the vignettes with advice provided, only three apps had safety performance within 1 SD of the GPs (Ada: 97.0%; Babylon: 95.1%; Symptomate: 97.8%). One app had a safety performance within 2 SDs of GPs (Your.MD: 92.6%). Three apps had a safety performance outside 2 SDs of GPs (Buoy: 80.0% (p<0.001); K Health: 81.3% (p<0.001); Mediktor: 87.3% (p=1.3×10⁻³)). CONCLUSIONS: The utility of digital symptom assessment apps relies on coverage, accuracy and safety. While no digital tool outperformed GPs, some came close, and the nature of iterative improvements to software offers scalable improvements to care.
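The safety outcome and the SD comparison reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The percentages, the GP mean and the GP SD are taken from the abstract; the function names, the integer urgency scale (higher meaning more urgent) and the use of absolute distance from the GP mean are assumptions made only for illustration.

```python
# Values below are copied from the abstract; everything else is hypothetical.
GP_SAFE_MEAN = 97.0  # GP average rate of safe urgency advice (%)
GP_SAFE_SD = 2.5     # GP standard deviation (percentage points)

APP_SAFE_RATES = {
    "Ada": 97.0, "Babylon": 95.1, "Symptomate": 97.8, "Your.MD": 92.6,
    "Buoy": 80.0, "K Health": 81.3, "Mediktor": 87.3,
}


def is_safe(advice_level: int, gold_level: int) -> bool:
    """Advice is 'safe' if it is at the gold standard level, more conservative,
    or no more than one level less conservative.

    Assumes urgency is coded as an integer where higher means more urgent
    (an illustrative scale, not the study's own coding).
    """
    return advice_level >= gold_level - 1


def sd_band(rate: float, mean: float = GP_SAFE_MEAN, sd: float = GP_SAFE_SD) -> str:
    """Place an app's safe-advice rate in a band relative to the GP average."""
    distance = abs(rate - mean) / sd  # distance from the GP mean in SD units
    if distance <= 1:
        return "within 1 SD"
    if distance <= 2:
        return "within 2 SDs"
    return "outside 2 SDs"


bands = {app: sd_band(rate) for app, rate in APP_SAFE_RATES.items()}
for app, band in bands.items():
    print(f"{app}: {APP_SAFE_RATES[app]:.1f}% ({band})")
```

With the abstract's figures, this grouping reproduces the three bands stated in the results: Ada, Babylon and Symptomate within 1 SD, Your.MD within 2 SDs, and Buoy, K Health and Mediktor outside 2 SDs.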
format Online
Article
Text
id pubmed-7745523
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-7745523 2020-12-28 BMJ Open Diagnostics BMJ Publishing Group 2020-12-16 /pmc/articles/PMC7745523/ /pubmed/33328258 http://dx.doi.org/10.1136/bmjopen-2020-040269 Text en © Author(s) (or their employer(s)) 2020. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
title How accurate are digital symptom assessment apps for suggesting conditions and urgency advice? A clinical vignettes comparison to GPs
topic Diagnostics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7745523/
https://www.ncbi.nlm.nih.gov/pubmed/33328258
http://dx.doi.org/10.1136/bmjopen-2020-040269