
A reinforcement learning model for AI-based decision support in skin cancer


Bibliographic Details
Main authors: Barata, Catarina; Rotemberg, Veronica; Codella, Noel C. F.; Tschandl, Philipp; Rinner, Christoph; Akay, Bengu Nisa; Apalla, Zoe; Argenziano, Giuseppe; Halpern, Allan; Lallas, Aimilios; Longo, Caterina; Malvehy, Josep; Puig, Susana; Rosendahl, Cliff; Soyer, H. Peter; Zalaudek, Iris; Kittler, Harald
Format: Online Article Text
Language: English
Published: Nature Publishing Group US 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10427421/
https://www.ncbi.nlm.nih.gov/pubmed/37501017
http://dx.doi.org/10.1038/s41591-023-02475-5
Description
Summary: We investigated whether human preferences hold the potential to improve diagnostic artificial intelligence (AI)-based decision support using skin cancer diagnosis as a use case. We utilized nonuniform rewards and penalties based on expert-generated tables, balancing the benefits and harms of various diagnostic errors, which were applied using reinforcement learning. Compared with supervised learning, the reinforcement learning model improved the sensitivity for melanoma from 61.4% to 79.5% (95% confidence interval (CI): 73.5–85.6%) and for basal cell carcinoma from 79.4% to 87.1% (95% CI: 80.3–93.9%). AI overconfidence was also reduced while simultaneously maintaining accuracy. Reinforcement learning increased the rate of correct diagnoses made by dermatologists by 12.0% (95% CI: 8.8–15.1%) and improved the rate of optimal management decisions from 57.4% to 65.3% (95% CI: 61.7–68.9%). We further demonstrated that the reward-adjusted reinforcement learning model and a threshold-based model outperformed naïve supervised learning in various clinical scenarios. Our findings suggest the potential for incorporating human preferences into image-based diagnostic algorithms.
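
Illustrative note: the summary refers to nonuniform rewards and penalties taken from expert-generated tables. The Python sketch below only illustrates the general idea of weighting diagnostic errors unequally. It is not the authors' reinforcement learning model: it applies a reward table at decision time by picking the diagnosis with the highest expected reward. The class names, the `REWARD` values, and the example probabilities are all hypothetical and do not come from the paper.

```python
# Hypothetical class labels; not the label set used in the paper.
CLASSES = ["melanoma", "basal_cell_carcinoma", "nevus"]

# Hypothetical reward table, REWARD[true_class][predicted_class].
# Missing a malignancy is penalized more heavily than a false alarm.
REWARD = {
    "melanoma": {"melanoma": 5.0, "basal_cell_carcinoma": -2.0, "nevus": -10.0},
    "basal_cell_carcinoma": {"melanoma": -1.0, "basal_cell_carcinoma": 3.0, "nevus": -5.0},
    "nevus": {"melanoma": -2.0, "basal_cell_carcinoma": -2.0, "nevus": 1.0},
}


def expected_reward(probs, predicted):
    """Expected reward of predicting `predicted`, given class probabilities."""
    return sum(probs[true] * REWARD[true][predicted] for true in CLASSES)


def naive_decision(probs):
    """Pick the most probable class (uniform treatment of all errors)."""
    return max(CLASSES, key=lambda c: probs[c])


def reward_adjusted_decision(probs):
    """Pick the class whose prediction maximizes the expected reward."""
    return max(CLASSES, key=lambda c: expected_reward(probs, c))


# Hypothetical classifier output for a single lesion image.
probs = {"melanoma": 0.30, "basal_cell_carcinoma": 0.10, "nevus": 0.60}

print("naive:", naive_decision(probs))
print("reward-adjusted:", reward_adjusted_decision(probs))
```

With these made-up numbers the most probable class is the benign one, while the reward-weighted rule selects melanoma because a missed melanoma carries the largest penalty. This is the kind of trade-off that could raise sensitivity for malignant classes, at the cost of more false alarms.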