Unexpected Actors in Inflammatory Bowel Disease Revealed by Machine Learning from Whole-Blood Transcriptomic Data

Although big data from transcriptomic analyses have helped transform our understanding of inflammatory bowel disease (IBD), they remain underexploited. We hypothesized that the application of machine learning using lasso regression to transcriptomic data from IBD patients and controls can help ident...

Bibliographic Details
Main Authors: Nowak, Jan K., Szymańska, Cyntia J., Glapa-Nowak, Aleksandra, Duclaux-Loras, Rémi, Dybska, Emilia, Ostrowski, Jerzy, Walkowiak, Jarosław, Adams, Alex T.
Format: Online Article Text
Language: English
Published: MDPI 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9498489/
https://www.ncbi.nlm.nih.gov/pubmed/36140740
http://dx.doi.org/10.3390/genes13091570
author Nowak, Jan K.
Szymańska, Cyntia J.
Glapa-Nowak, Aleksandra
Duclaux-Loras, Rémi
Dybska, Emilia
Ostrowski, Jerzy
Walkowiak, Jarosław
Adams, Alex T.
collection PubMed
description Although big data from transcriptomic analyses have helped transform our understanding of inflammatory bowel disease (IBD), they remain underexploited. We hypothesized that the application of machine learning using lasso regression to transcriptomic data from IBD patients and controls can help identify previously overlooked genes. Transcriptomic data provided by Ostrowski et al. (ENA PRJEB28822) were subjected to a two-stage process of feature selection to discriminate between IBD and controls. First, a principal component analysis was used for dimensionality reduction. Second, the least absolute shrinkage and selection operator (lasso) regression was employed to identify genes potentially involved in the pathobiology of IBD. The study included data from 294 participants: 100 with ulcerative colitis (48 adults and 52 children), 99 with Crohn’s disease (45 adults and 54 children), and 95 controls (46 adults and 49 children). IBD patients presented a wide range of disease severity. Lasso regression preceded by principal component analysis successfully selected interesting features in the IBD transcriptomic data and yielded 12 models. The models achieved high discriminatory value (range of the area under the receiver operating characteristic curve 0.61–0.95) and identified over 100 genes as potentially associated with IBD. PURA, GALNT14, and FCGR1A were the most consistently selected, highlighting the role of the cell cycle, glycosylation, and immunoglobulin binding. Several known IBD-related genes were among the results. The results included genes involved in the TGF-beta pathway, expressed in NK cells, and they were enriched in ontology terms related to immunity. Future IBD research should emphasize the TGF-beta pathway, immunoglobulins, NK cells, and the role of glycosylation.
format Online
Article
Text
id pubmed-9498489
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9498489 2022-09-23 Genes (Basel) Article MDPI 2022-09-01 /pmc/articles/PMC9498489/ /pubmed/36140740 http://dx.doi.org/10.3390/genes13091570 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Unexpected Actors in Inflammatory Bowel Disease Revealed by Machine Learning from Whole-Blood Transcriptomic Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9498489/
https://www.ncbi.nlm.nih.gov/pubmed/36140740
http://dx.doi.org/10.3390/genes13091570