Cargando…

How machine-learning recommendations influence clinician treatment selections: the example of antidepressant selection

Decision support systems embodying machine learning models offer the promise of an improved standard of care for major depressive disorder, but little is known about how clinicians’ treatment decisions will be influenced by machine learning recommendations and explanations. We used a within-subject...

Descripción completa

Detalles Bibliográficos
Autores principales: Jacobs, Maia, Pradier, Melanie F., McCoy, Thomas H., Perlis, Roy H., Doshi-Velez, Finale, Gajos, Krzysztof Z.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Nature Publishing Group UK 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7862671/
https://www.ncbi.nlm.nih.gov/pubmed/33542191
http://dx.doi.org/10.1038/s41398-021-01224-x
Descripción
Sumario:Decision support systems embodying machine learning models offer the promise of an improved standard of care for major depressive disorder, but little is known about how clinicians’ treatment decisions will be influenced by machine learning recommendations and explanations. We used a within-subject factorial experiment to present 220 clinicians with patient vignettes, each with or without a machine-learning (ML) recommendation and one of the multiple forms of explanation. We found that interacting with ML recommendations did not significantly improve clinicians’ treatment selection accuracy, assessed as concordance with expert psychopharmacologist consensus, compared to baseline scenarios in which clinicians made treatment decisions independently. Interacting with incorrect recommendations paired with explanations that included limited but easily interpretable information did lead to a significant reduction in treatment selection accuracy compared to baseline questions. These results suggest that incorrect ML recommendations may adversely impact clinician treatment selections and that explanations are insufficient for addressing overreliance on imperfect ML algorithms. More generally, our findings challenge the common assumption that clinicians interacting with ML tools will perform better than either clinicians or ML algorithms individually.