Cargando…

A Human-Algorithm Integration System for Hip Fracture Detection on Plain Radiography: System Development and Validation Study

BACKGROUND: Hip fracture is the most common type of fracture in elderly individuals. Numerous deep learning (DL) algorithms for plain pelvic radiographs (PXRs) have been applied to improve the accuracy of hip fracture diagnosis. However, their efficacy is still undetermined. OBJECTIVE: The objective...

Descripción completa

Detalles Bibliográficos
Autores principales: Cheng, Chi-Tung, Chen, Chih-Chi, Cheng, Fu-Jen, Chen, Huan-Wu, Su, Yi-Siang, Yeh, Chun-Nan, Chung, I-Fang, Liao, Chien-Hung
Formato: Online Artículo Texto
Lenguaje:English
Publicado: JMIR Publications 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7732715/
https://www.ncbi.nlm.nih.gov/pubmed/33245279
http://dx.doi.org/10.2196/19416
Descripción
Sumario:BACKGROUND: Hip fracture is the most common type of fracture in elderly individuals. Numerous deep learning (DL) algorithms for plain pelvic radiographs (PXRs) have been applied to improve the accuracy of hip fracture diagnosis. However, their efficacy is still undetermined. OBJECTIVE: The objective of this study is to develop and validate a human-algorithm integration (HAI) system to improve the accuracy of hip fracture diagnosis in a real clinical environment. METHODS: The HAI system with hip fracture detection ability was developed using a deep learning algorithm trained on trauma registry data and 3605 PXRs from August 2008 to December 2016. To compare their diagnostic performance before and after HAI system assistance using an independent testing dataset, 34 physicians were recruited. We analyzed the physicians’ accuracy, sensitivity, specificity, and agreement with the algorithm; we also performed subgroup analyses according to physician specialty and experience. Furthermore, we applied the HAI system in the emergency departments of different hospitals to validate its value in the real world. RESULTS: With the support of the algorithm, which achieved 91% accuracy, the diagnostic performance of physicians was significantly improved in the independent testing dataset, as was revealed by the sensitivity (physician alone, median 95%; HAI, median 99%; P<.001), specificity (physician alone, median 90%; HAI, median 95%; P<.001), accuracy (physician alone, median 90%; HAI, median 96%; P<.001), and human-algorithm agreement [physician alone κ, median 0.69 (IQR 0.63-0.74); HAI κ, median 0.80 (IQR 0.76-0.82); P<.001. With the help of the HAI system, the primary physicians showed significant improvement in their diagnostic performance to levels comparable to those of consulting physicians, and both the experienced and less-experienced physicians benefited from the HAI system. After the HAI system had been applied in 3 departments for 5 months, 587 images were examined. The sensitivity, specificity, and accuracy of the HAI system for detecting hip fractures were 97%, 95.7%, and 96.08%, respectively. CONCLUSIONS: HAI currently impacts health care, and integrating this technology into emergency departments is feasible. The developed HAI system can enhance physicians’ hip fracture diagnostic performance.