Can natural language processing help differentiate inflammatory intestinal diseases in China? Models applying random forest and convolutional neural network approaches

Bibliographic Details
Main authors: Tong, Yuanren; Lu, Keming; Yang, Yingyun; Li, Ji; Lin, Yucong; Wu, Dong; Yang, Aiming; Li, Yue; Yu, Sheng; Qian, Jiaming
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7526202/
https://www.ncbi.nlm.nih.gov/pubmed/32993636
http://dx.doi.org/10.1186/s12911-020-01277-w
Description
Summary: BACKGROUND: Differentiating between ulcerative colitis (UC), Crohn’s disease (CD) and intestinal tuberculosis (ITB) using endoscopy is challenging. We aimed to automate the differential diagnosis of these diseases using machine learning algorithms. METHODS: A total of 6399 consecutive patients (5128 UC, 875 CD and 396 ITB) who had undergone colonoscopy examinations in the Peking Union Medical College Hospital from January 2008 to November 2018 were enrolled. The input was the free-text description of the endoscopic image. Word segmentation and keyword filtering were conducted as data preprocessing. Random forest (RF) and convolutional neural network (CNN) approaches were applied to different disease entities. Three two-class classifiers (UC and CD, UC and ITB, and CD and ITB) and a three-class classifier (UC, CD and ITB) were built. RESULTS: The classifiers built in this research performed well, and the CNN had better performance in general. The RF sensitivities/specificities of UC-CD, UC-ITB, and CD-ITB were 0.89/0.84, 0.83/0.82, and 0.72/0.77, respectively, while the values for the CNN of CD-ITB were 0.90/0.77. The precisions/recalls of UC-CD-ITB when employing RF were 0.97/0.97, 0.65/0.53, and 0.68/0.76, respectively, and when employing the CNN were 0.99/0.97, 0.87/0.83, and 0.52/0.81, respectively. CONCLUSIONS: Classifiers built by RF and CNN approaches had excellent performance when distinguishing UC from CD or ITB. For the differentiation of CD and ITB, high specificity and sensitivity were achieved as well. Artificial intelligence through machine learning is very promising in helping inexperienced endoscopists differentiate inflammatory intestinal diseases. CONFERENCE: The abstract of this article won first prize in the Young Investigator Award at the Asian Pacific Digestive Week (APDW) 2019 held in Kolkata, India.
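
The abstract reports sensitivity/specificity for the two-class classifiers and per-class precision/recall for the three-class classifier. As a minimal sketch of how such figures can be derived from true and predicted labels, the following may help; the function names and the toy labels are illustrative assumptions, not the authors' code or data.

```python
def binary_sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity and specificity for a two-class task.

    `positive` names the class treated as the target (an illustrative choice);
    every other label counts as negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def per_class_precision_recall(y_true, y_pred, classes):
    """Precision and recall for each class of a multi-class task."""
    result = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        predicted = sum(p == c for p in y_pred)  # everything labelled c by the model
        actual = sum(t == c for t in y_true)     # everything that truly is c
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        result[c] = (precision, recall)
    return result


# Toy labels, invented purely for illustration.
two_true = ["CD", "CD", "CD", "ITB", "ITB"]
two_pred = ["CD", "CD", "ITB", "ITB", "CD"]
sens, spec = binary_sensitivity_specificity(two_true, two_pred, positive="CD")

three_true = ["UC", "UC", "UC", "CD", "CD", "ITB", "ITB", "UC"]
three_pred = ["UC", "UC", "CD", "CD", "ITB", "ITB", "CD", "UC"]
scores = per_class_precision_recall(three_true, three_pred, ["UC", "CD", "ITB"])

print(f"CD vs ITB: sensitivity={sens:.2f}, specificity={spec:.2f}")
for name, (precision, recall) in scores.items():
    print(f"{name}: precision={precision:.2f}, recall={recall:.2f}")
```

In a two-class setting, recall for the positive class is the same quantity as sensitivity, which is why the abstract can report the two-class and three-class results with different metric pairs.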