Machine Learning Classification of Inflammatory Bowel Disease in Children Based on a Large Real-World Pediatric Cohort CEDATA-GPGE® Registry

Introduction: The rising incidence of pediatric inflammatory bowel diseases (PIBD) creates a need for new methods to reduce diagnostic latency and improve quality of care and documentation. Machine learning models have been shown to be applicable to classifying PIBD when using histological data or extensive...

Bibliographic Details
Main authors: Schneider, Nicolas; Sohrabi, Keywan; Schneider, Henning; Zimmer, Klaus-Peter; Fischer, Patrick; de Laffolie, Jan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Medicine
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8180568/
https://www.ncbi.nlm.nih.gov/pubmed/34109197
http://dx.doi.org/10.3389/fmed.2021.666190
author Schneider, Nicolas
Sohrabi, Keywan
Schneider, Henning
Zimmer, Klaus-Peter
Fischer, Patrick
de Laffolie, Jan
author_sort Schneider, Nicolas
collection PubMed
description Introduction: The rising incidence of pediatric inflammatory bowel diseases (PIBD) creates a need for new methods to reduce diagnostic latency and improve quality of care and documentation. Machine learning models have been shown to be applicable to classifying PIBD when using histological data or extensive serology. This study aims to evaluate the performance of algorithms based on promptly available data, which are better suited to clinical applications. Methods: Data on inflammatory locations of the bowel from initial and follow-up visits are extracted from the CEDATA-GPGE registry, and two follow-up sets are split off containing only input from 2017 and 2018. Pre-processing excludes patients in remission and encodes the categorical data numerically. For classification of the PIBD diagnosis, a support vector machine (SVM), a random forest algorithm (RF), extreme gradient boosting (XGBoost), a dense neural network (DNN) and a convolutional neural network (CNN) are employed. As the best performer, the convolutional neural network is further improved using grid optimization. Results: The accuracy of the optimized neural network reaches up to 90.57% on data entered into the registry in 2018. The less performant methods range from 88.78% for the DNN down to 83.94% for XGBoost. The prediction accuracy for the 2018 follow-up dataset is higher than for the older datasets. Neural networks yield a higher standard deviation, with 3.45 for the CNN compared to 0.83–0.86 for the support vector machine and ensemble methods. Discussion: The accuracy of the convolutional neural network demonstrates the viability of machine learning classification in PIBD diagnostics using only promptly available data.
format Online
Article
Text
id pubmed-8180568
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8180568 2021-06-08 Machine Learning Classification of Inflammatory Bowel Disease in Children Based on a Large Real-World Pediatric Cohort CEDATA-GPGE® Registry Schneider, Nicolas; Sohrabi, Keywan; Schneider, Henning; Zimmer, Klaus-Peter; Fischer, Patrick; de Laffolie, Jan Front Med (Lausanne) Medicine Frontiers Media S.A. 2021-05-24 /pmc/articles/PMC8180568/ /pubmed/34109197 http://dx.doi.org/10.3389/fmed.2021.666190 Text en Copyright © 2021 Schneider, Sohrabi, Schneider, Zimmer, Fischer, de Laffolie and CEDATA-GPGE Study Group. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Machine Learning Classification of Inflammatory Bowel Disease in Children Based on a Large Real-World Pediatric Cohort CEDATA-GPGE® Registry
topic Medicine
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8180568/
https://www.ncbi.nlm.nih.gov/pubmed/34109197
http://dx.doi.org/10.3389/fmed.2021.666190