Development of a Machine Learning Model to Distinguish between Ulcerative Colitis and Crohn’s Disease Using RNA Sequencing Data

Crohn’s disease (CD) and ulcerative colitis (UC) can be difficult to differentiate. As differential diagnosis is important in establishing a long-term treatment plan for patients, we aimed to develop a machine learning model for the differential diagnosis of the two diseases using RNA sequencing (RN...

Bibliographic Details
Main Authors: Park, Soo-Kyung; Kim, Sangsoo; Lee, Gi-Young; Kim, Sung-Yoon; Kim, Wan; Lee, Chil-Woo; Park, Jong-Lyul; Choi, Chang-Hwan; Kang, Sang-Bum; Kim, Tae-Oh; Bang, Ki-Bae; Chun, Jaeyoung; Cha, Jae-Myung; Im, Jong-Pil; Ahn, Kwang-Sung; Kim, Seon-Young; Park, Dong-Il
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8700628/
https://www.ncbi.nlm.nih.gov/pubmed/34943601
http://dx.doi.org/10.3390/diagnostics11122365
author Park, Soo-Kyung
Kim, Sangsoo
Lee, Gi-Young
Kim, Sung-Yoon
Kim, Wan
Lee, Chil-Woo
Park, Jong-Lyul
Choi, Chang-Hwan
Kang, Sang-Bum
Kim, Tae-Oh
Bang, Ki-Bae
Chun, Jaeyoung
Cha, Jae-Myung
Im, Jong-Pil
Ahn, Kwang-Sung
Kim, Seon-Young
Park, Dong-Il
collection PubMed
description Crohn’s disease (CD) and ulcerative colitis (UC) can be difficult to differentiate. As differential diagnosis is important in establishing a long-term treatment plan for patients, we aimed to develop a machine learning model for the differential diagnosis of the two diseases using RNA sequencing (RNA-seq) data from endoscopic biopsy tissue from patients with inflammatory bowel disease (n = 127; CD, 94; UC, 33). Biopsy samples were taken from inflammatory lesions or normal tissues. The RNA-seq dataset was processed via mapping to the human reference genome (GRCh38) and quantifying the corresponding gene models that comprised 19,596 protein-coding genes. An unsupervised learning model showed distinct clusters of four classes: CD inflammatory, CD normal, UC inflammatory, and UC normal. A supervised learning model based on partial least squares discriminant analysis was able to distinguish inflammatory CD from inflammatory UC after pruning the strong classifiers of normal CD vs. normal UC. The error rate was minimal with only two components, comprising 20 and 50 genes for the first and second components, respectively. The corresponding overall error rate was 0.147. RNA-seq analysis of tissue and the two components revealed in this study may be helpful for distinguishing CD from UC.
format Online
Article
Text
id pubmed-8700628
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8700628 2021-12-24 Diagnostics (Basel) Article MDPI 2021-12-15 /pmc/articles/PMC8700628/ /pubmed/34943601 http://dx.doi.org/10.3390/diagnostics11122365 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Development of a Machine Learning Model to Distinguish between Ulcerative Colitis and Crohn’s Disease Using RNA Sequencing Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8700628/
https://www.ncbi.nlm.nih.gov/pubmed/34943601
http://dx.doi.org/10.3390/diagnostics11122365