Machine Learning Classification of Inflammatory Bowel Disease in Children Based on a Large Real-World Pediatric Cohort CEDATA-GPGE® Registry
Introduction: The rising incidence of pediatric inflammatory bowel diseases (PIBD) creates a need for new methods to reduce diagnostic latency and improve quality of care and documentation. Machine learning models have been shown to be applicable to classifying PIBD when using histological data or extensive...
Main authors: | Schneider, Nicolas; Sohrabi, Keywan; Schneider, Henning; Zimmer, Klaus-Peter; Fischer, Patrick; de Laffolie, Jan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2021 |
Subjects: | Medicine |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8180568/ https://www.ncbi.nlm.nih.gov/pubmed/34109197 http://dx.doi.org/10.3389/fmed.2021.666190 |
_version_ | 1783704018657214464 |
---|---|
author | Schneider, Nicolas; Sohrabi, Keywan; Schneider, Henning; Zimmer, Klaus-Peter; Fischer, Patrick; de Laffolie, Jan |
author_facet | Schneider, Nicolas; Sohrabi, Keywan; Schneider, Henning; Zimmer, Klaus-Peter; Fischer, Patrick; de Laffolie, Jan |
author_sort | Schneider, Nicolas |
collection | PubMed |
description | Introduction: The rising incidence of pediatric inflammatory bowel diseases (PIBD) creates a need for new methods to reduce diagnostic latency and improve quality of care and documentation. Machine learning models have been shown to be applicable to classifying PIBD when using histological data or extensive serology. This study aims to evaluate the performance of algorithms based on promptly available data that are better suited to clinical applications. Methods: Data on the locations of bowel inflammation from initial and follow-up visits are extracted from the CEDATA-GPGE registry, and two follow-up sets containing only entries from 2017 and 2018 are split off. Pre-processing excludes patients in remission and encodes the categorical data numerically. For classification of the PIBD diagnosis, a support vector machine (SVM), a random forest algorithm (RF), extreme gradient boosting (XGBoost), a dense neural network (DNN) and a convolutional neural network (CNN) are employed. As the best performer, the convolutional neural network is further improved using grid optimization. Results: The accuracy of the optimized neural network reaches up to 90.57% on data entered into the registry in 2018. The other methods range from 88.78% for the DNN down to 83.94% for XGBoost. The prediction accuracy for the 2018 follow-up dataset is higher than that for the older datasets. Neural networks yield a higher standard deviation, at 3.45 for the CNN compared to 0.83–0.86 for the support vector machine and ensemble methods. Discussion: The accuracy of the convolutional neural network demonstrates the viability of machine learning classification in PIBD diagnostics using only promptly available data. |
format | Online Article Text |
id | pubmed-8180568 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8180568 2021-06-08 Machine Learning Classification of Inflammatory Bowel Disease in Children Based on a Large Real-World Pediatric Cohort CEDATA-GPGE® Registry Schneider, Nicolas; Sohrabi, Keywan; Schneider, Henning; Zimmer, Klaus-Peter; Fischer, Patrick; de Laffolie, Jan Front Med (Lausanne) Medicine Frontiers Media S.A. 2021-05-24 /pmc/articles/PMC8180568/ /pubmed/34109197 http://dx.doi.org/10.3389/fmed.2021.666190 Text en Copyright © 2021 Schneider, Sohrabi, Schneider, Zimmer, Fischer, de Laffolie and CEDATA-GPGE Study Group. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Medicine Schneider, Nicolas Sohrabi, Keywan Schneider, Henning Zimmer, Klaus-Peter Fischer, Patrick de Laffolie, Jan Machine Learning Classification of Inflammatory Bowel Disease in Children Based on a Large Real-World Pediatric Cohort CEDATA-GPGE® Registry |
title | Machine Learning Classification of Inflammatory Bowel Disease in Children Based on a Large Real-World Pediatric Cohort CEDATA-GPGE® Registry |
topic | Medicine |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8180568/ https://www.ncbi.nlm.nih.gov/pubmed/34109197 http://dx.doi.org/10.3389/fmed.2021.666190 |