Performance and comparison of artificial intelligence and human experts in the detection and classification of colonic polyps

Bibliographic Details
Main authors: Li, Ming-De, Huang, Ze-Rong, Shan, Quan-Yuan, Chen, Shu-Ling, Zhang, Ning, Hu, Hang-Tong, Wang, Wei
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9749329/
https://www.ncbi.nlm.nih.gov/pubmed/36513975
http://dx.doi.org/10.1186/s12876-022-02605-2
Description
Summary: OBJECTIVE: The main aim of this study was to analyze the performance of different artificial intelligence (AI) models in endoscopic colonic polyp detection and classification and to compare them with doctors with different levels of experience. METHODS: We searched PubMed, EMBASE, Cochrane, and the citation index of the conference proceedings for studies on Colonoscopy, Colonic Polyps, Artificial Intelligence, Machine Learning, and Deep Learning published before May 2020. The quality of studies was assessed using the QUADAS-2 table of diagnostic test quality evaluation criteria. The random-effects model was calculated using Meta-DISC 1.4 and RevMan 5.3. RESULTS: A total of 16 studies were included in the meta-analysis. Only one study (1/16) presented externally validated results. The areas under the curve (AUC) of the AI group, expert group, and non-expert group for detection and classification of colonic polyps were 0.940, 0.918, and 0.871, respectively. The AI group had slightly lower pooled specificity than the expert group (79% vs. 86%, P < 0.05) but higher pooled sensitivity (88% vs. 80%, P < 0.05). The non-experts had lower pooled specificity in polyp recognition than the experts (81% vs. 86%, P < 0.05) and higher pooled sensitivity (85% vs. 80%, P < 0.05). CONCLUSION: The performance of AI in polyp detection and classification is similar to that of human experts, with high sensitivity and moderate specificity. Different tasks may have an impact on the performance of deep learning models and human experts, especially in terms of sensitivity and specificity.