Musculoskeletal radiologist-level performance by using deep learning for detection of scaphoid fractures on conventional multi-view radiographs of hand and wrist


Bibliographic Details
Main authors: Hendrix, Nils; Hendrix, Ward; van Dijke, Kees; Maresch, Bas; Maas, Mario; Bollen, Stijn; Scholtens, Alexander; de Jonge, Milko; Ong, Lee-Ling Sharon; van Ginneken, Bram; Rutten, Matthieu
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9935716/
https://www.ncbi.nlm.nih.gov/pubmed/36380195
http://dx.doi.org/10.1007/s00330-022-09205-4
Description
Summary:
OBJECTIVES: To assess how an artificial intelligence (AI) algorithm performs against five experienced musculoskeletal radiologists in diagnosing scaphoid fractures and whether it aids their diagnosis on conventional multi-view radiographs.
METHODS: Four datasets of conventional hand, wrist, and scaphoid radiographs were retrospectively acquired at two hospitals (hospitals A and B). Dataset 1 (12,990 radiographs from 3353 patients, hospital A) and dataset 2 (1117 radiographs from 394 patients, hospital B) were used for training and testing a scaphoid localization and laterality classification component. Dataset 3 (4316 radiographs from 840 patients, hospital A) and dataset 4 (688 radiographs from 209 patients, hospital B) were used for training and testing the fracture detector. The algorithm was compared with the radiologists in an observer study. Evaluation metrics included sensitivity, specificity, positive predictive value (PPV), area under the receiver operating characteristic curve (AUC), Cohen's kappa coefficient (κ), fracture localization precision, and reading time.
RESULTS: The algorithm detected scaphoid fractures with a sensitivity of 72%, specificity of 93%, PPV of 81%, and AUC of 0.88. The algorithm's AUC did not differ significantly from that of any individual radiologist (radiologists' mean AUC 0.87; p ≥ .05). AI assistance improved five out of ten pairs of inter-observer Cohen's κ agreements (p < .05) and reduced reading time in four radiologists (p < .001), but did not improve other metrics in the majority of radiologists (p ≥ .05).
CONCLUSIONS: The AI algorithm detects scaphoid fractures on conventional multi-view radiographs at the level of five experienced musculoskeletal radiologists and could significantly shorten their reading time.
KEY POINTS:
• An artificial intelligence algorithm automatically detects scaphoid fractures on conventional multi-view radiographs at the same level as five experienced musculoskeletal radiologists.
• There is preliminary evidence that automated scaphoid fracture detection can significantly shorten the reading time of musculoskeletal radiologists.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00330-022-09205-4.
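
The evaluation metrics named under METHODS can be made concrete with a short sketch. The Python below is an illustration only, not the authors' code: the function names are hypothetical, and the labels in the usage note are made-up toy values, not study data. It computes sensitivity, specificity, and PPV from binary fracture labels, computes Cohen's κ between two readers, and shows why five radiologists give ten inter-observer pairs.

```python
from itertools import combinations
from math import nan


def _safe_div(numerator, denominator):
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else nan


def detection_metrics(truth, pred):
    """Sensitivity, specificity, and PPV for binary labels (1 = fracture)."""
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "sensitivity": _safe_div(tp, tp + fn),  # fractures correctly flagged
        "specificity": _safe_div(tn, tn + fp),  # non-fractures correctly cleared
        "ppv": _safe_div(tp, tp + fp),          # flagged cases that are fractures
    }


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers giving binary labels on the same cases."""
    n = len(reader_a)
    observed = sum(1 for a, b in zip(reader_a, reader_b) if a == b) / n
    rate_a = sum(reader_a) / n
    rate_b = sum(reader_b) / n
    # Agreement expected by chance from each reader's positive rate.
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    return _safe_div(observed - expected, 1 - expected)


def reader_pairs(n_readers):
    """All unordered pairs of readers, e.g. for inter-observer agreement."""
    return list(combinations(range(n_readers), 2))
```

For example, with the toy labels `truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]` and `pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]`, `detection_metrics(truth, pred)` gives a sensitivity and PPV of 0.75, and `len(reader_pairs(5))` is 10, matching the ten pairs of inter-observer agreements reported in the RESULTS.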