
The influence of explainable vs non-explainable clinical decision support systems on rapid triage decisions: a mixed methods study

Bibliographic Details
Main Authors: Laxar, Daniel; Eitenberger, Magdalena; Maleczek, Mathias; Kaider, Alexandra; Hammerle, Fabian Peter; Kimberger, Oliver
Format: Online Article Text
Language: English
Published: BioMed Central, 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10510231/
https://www.ncbi.nlm.nih.gov/pubmed/37726729
http://dx.doi.org/10.1186/s12916-023-03068-2
Description
Summary: BACKGROUND: During the COVID-19 pandemic, a variety of clinical decision support systems (CDSS) were developed to aid patient triage. However, research focusing on the interaction between decision support systems and human experts is lacking.

METHODS: Thirty-two physicians were recruited to rate the survival probability of 59 critically ill patients by means of chart review. Subsequently, one of two artificial intelligence systems advised the physician of a computed survival probability. However, only one of these systems explained the reasons behind its decision-making. In the third step, physicians reviewed the chart once again to determine the final survival probability rating. We hypothesized that an explaining system would exhibit a higher impact on the physicians’ second rating (i.e., a higher weight-on-advice).

RESULTS: The survival probability rating given by the physician after receiving advice from the clinical decision support system was a median of 4 percentage points closer to the advice than the initial rating. Weight-on-advice was not significantly different (p = 0.115) between the two systems (with vs without explanation for its decision). Additionally, weight-on-advice showed no difference according to time of day or between board-qualified and not-yet-board-qualified physicians. Self-reported post-experiment overall trust was awarded a median of 4 out of 10 points. When asked after the conclusion of the experiment, overall trust was 5.5/10 (non-explaining median 4 (IQR 3.5–5.5), explaining median 7 (IQR 5.5–7.5), p = 0.007).

CONCLUSIONS: Although overall trust in the models was low, the median (IQR) weight-on-advice was high (0.33 (0.0–0.56)) and in line with the published literature on expert advice. Contrary to the hypothesis, weight-on-advice was comparable between the explaining and non-explaining systems. In 30% of cases, weight-on-advice was 0, meaning the physician did not change their rating. The median of the remaining weight-on-advice values was 50%, suggesting that physicians either dismissed the recommendation or employed a “meeting halfway” approach. Newer technologies, such as clinical reasoning systems, may be able to augment the decision process rather than simply presenting unexplained bias.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12916-023-03068-2.
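The weight-on-advice values discussed in the abstract (0 when the physician keeps the initial rating, 0.5 when “meeting halfway”, 1 when fully adopting the advice) match the definition commonly used in judge–advisor studies. The sketch below illustrates that standard formula; it is an assumption for illustration, not the authors’ exact implementation, which is not reproduced in this record.

```python
def weight_on_advice(initial, advice, final):
    """Illustrative weight-on-advice (WOA) calculation.

    Standard judge-advisor definition (assumed here):
        WOA = (final - initial) / (advice - initial)
    0 means the initial rating was kept; 1 means the advice was
    fully adopted; WOA is undefined when advice equals the initial rating.
    """
    if advice == initial:
        return None  # no shift toward advice is measurable
    return (final - initial) / (advice - initial)

# Example with survival ratings in percentage points:
# initial 40, advice 60, final 50 -> physician moved halfway toward the advice
print(weight_on_advice(40, 60, 50))  # 0.5
# Final equal to initial -> WOA of 0 (advice dismissed)
print(weight_on_advice(40, 60, 40))  # 0.0
```

Under this definition, the study’s observation that 30% of cases had a WOA of 0 and the remaining cases had a median WOA of 50% corresponds to ratings that were either unchanged or moved roughly halfway toward the system’s advice.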