Rapid development of accurate artificial intelligence scoring for colitis disease activity using applied data science techniques

Bibliographic Details
Main authors: Patel, Mehul; Gulati, Shraddha; Iqbal, Fareed; Hayee, Bu'Hussain
Format: Online Article Text
Language: English
Published: Georg Thieme Verlag KG 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9010092/
https://www.ncbi.nlm.nih.gov/pubmed/35433223
http://dx.doi.org/10.1055/a-1790-6201
Description
Summary:

Background and study aims: Scoring endoscopic disease activity in colitis is a complex task for artificial intelligence (AI), but is seen as a worthwhile goal for clinical and research use cases. To date, development attempts have relied on large datasets, achieving reasonable results when comparing normal mucosa to active inflammation, but not when generating subscores for the Mayo Endoscopic Score (MES) or the ulcerative colitis endoscopic index of severity (UCEIS).

Patients and methods: Using a multi-task learning framework with frame-by-frame analysis, we developed a machine-learning algorithm (MLA) for UCEIS trained on just 38,124 frames (from 73 patients with biopsy-proven ulcerative colitis). Scores generated by the MLA were compared to consensus scores from three independent human reviewers.

Results: Accuracy and agreement (kappa) were calculated for the following differentiation tasks: (1) normal mucosa vs active inflammation (UCEIS 0 vs ≥ 1; accuracy 0.90, κ = 0.90); (2) mild vs moderate-to-severe inflammation (UCEIS 0–3 vs ≥ 4; accuracy 0.98, κ = 0.96); (3) generating the total UCEIS score (κ = 0.92). Agreement for the UCEIS subdomains was also high (κ = 0.80, 0.83 and 0.88 for vascular pattern, bleeding and erosions, respectively).

Conclusions: We have demonstrated that, using modified data science techniques and a relatively small dataset, it is possible to achieve high levels of accuracy and agreement with human reviewers (in some cases near-perfect) for AI in colitis scoring. Further work will focus on refining this technique, but we hope that it can be used in other tasks to facilitate faster development.
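The results above are reported as accuracy and kappa for thresholded UCEIS scores. The following is a minimal sketch of how such figures could be computed, not the authors' code. It assumes unweighted Cohen's kappa (the abstract does not say whether a weighted variant was used), and the function names and the ten per-frame scores are hypothetical values chosen only for illustration.

```python
from collections import Counter


def accuracy(reference, predicted):
    """Fraction of frames on which the two score lists agree."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("score lists must be non-empty and equal in length")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)


def cohen_kappa(reference, predicted):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(reference)
    p_o = accuracy(reference, predicted)  # observed agreement
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    labels = set(ref_counts) | set(pred_counts)
    # Agreement expected by chance, from each rater's label frequencies.
    p_e = sum(ref_counts[c] * pred_counts[c] for c in labels) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (p_o - p_e) / (1 - p_e)


def binarise(scores, threshold):
    """Map total UCEIS scores to 0/1 for a 'score >= threshold' task."""
    return [int(s >= threshold) for s in scores]


# Hypothetical per-frame total UCEIS scores (NOT data from the study).
human = [0, 0, 1, 3, 4, 5, 2, 0, 6, 4]  # stand-in for reviewer consensus
model = [0, 1, 1, 3, 4, 5, 2, 0, 6, 3]  # stand-in for MLA output

# Thresholds taken from the abstract: UCEIS >= 1 and UCEIS >= 4.
for name, threshold in [("normal vs active", 1), ("mild vs moderate-to-severe", 4)]:
    ref = binarise(human, threshold)
    pred = binarise(model, threshold)
    print(f"{name}: accuracy={accuracy(ref, pred):.2f}, "
          f"kappa={cohen_kappa(ref, pred):.2f}")
```

Kappa corrects the raw agreement for the agreement expected by chance, which is why it can sit well below accuracy when one class dominates. For the ordinal total score, a weighted kappa is a common alternative that penalises large disagreements more than small ones.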