Development of a Machine Learning Model to Distinguish between Ulcerative Colitis and Crohn’s Disease Using RNA Sequencing Data
Crohn’s disease (CD) and ulcerative colitis (UC) can be difficult to differentiate. As differential diagnosis is important in establishing a long-term treatment plan for patients, we aimed to develop a machine learning model for the differential diagnosis of the two diseases using RNA sequencing (RN...
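Main authors: | Park, Soo-Kyung; Kim, Sangsoo; Lee, Gi-Young; Kim, Sung-Yoon; Kim, Wan; Lee, Chil-Woo; Park, Jong-Lyul; Choi, Chang-Hwan; Kang, Sang-Bum; Kim, Tae-Oh; Bang, Ki-Bae; Chun, Jaeyoung; Cha, Jae-Myung; Im, Jong-Pil; Ahn, Kwang-Sung; Kim, Seon-Young; Park, Dong-Il |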
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8700628/ https://www.ncbi.nlm.nih.gov/pubmed/34943601 http://dx.doi.org/10.3390/diagnostics11122365 |
_version_ | 1784620802944008192 |
---|---|
author | Park, Soo-Kyung Kim, Sangsoo Lee, Gi-Young Kim, Sung-Yoon Kim, Wan Lee, Chil-Woo Park, Jong-Lyul Choi, Chang-Hwan Kang, Sang-Bum Kim, Tae-Oh Bang, Ki-Bae Chun, Jaeyoung Cha, Jae-Myung Im, Jong-Pil Ahn, Kwang-Sung Kim, Seon-Young Park, Dong-Il |
author_sort | Park, Soo-Kyung |
collection | PubMed |
description | Crohn’s disease (CD) and ulcerative colitis (UC) can be difficult to differentiate. As differential diagnosis is important in establishing a long-term treatment plan for patients, we aimed to develop a machine learning model for the differential diagnosis of the two diseases using RNA sequencing (RNA-seq) data from endoscopic biopsy tissue from patients with inflammatory bowel disease (n = 127; CD, 94; UC, 33). Biopsy samples were taken from inflammatory lesions or normal tissues. The RNA-seq dataset was processed via mapping to the human reference genome (GRCh38) and quantifying the corresponding gene models that comprised 19,596 protein-coding genes. An unsupervised learning model showed distinct clusters of four classes: CD inflammatory, CD normal, UC inflammatory, and UC normal. A supervised learning model based on partial least squares discriminant analysis was able to distinguish inflammatory CD from inflammatory UC after pruning the strong classifiers of normal CD vs. normal UC. The error rate was minimal and affected only two components: 20 and 50 genes for the first and second components, respectively. The corresponding overall error rate was 0.147. RNA-seq analysis of tissue and the two components revealed in this study may be helpful for distinguishing CD from UC. |
format | Online Article Text |
id | pubmed-8700628 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8700628 2021-12-24 Development of a Machine Learning Model to Distinguish between Ulcerative Colitis and Crohn’s Disease Using RNA Sequencing Data Park, Soo-Kyung Kim, Sangsoo Lee, Gi-Young Kim, Sung-Yoon Kim, Wan Lee, Chil-Woo Park, Jong-Lyul Choi, Chang-Hwan Kang, Sang-Bum Kim, Tae-Oh Bang, Ki-Bae Chun, Jaeyoung Cha, Jae-Myung Im, Jong-Pil Ahn, Kwang-Sung Kim, Seon-Young Park, Dong-Il Diagnostics (Basel) Article Crohn’s disease (CD) and ulcerative colitis (UC) can be difficult to differentiate. As differential diagnosis is important in establishing a long-term treatment plan for patients, we aimed to develop a machine learning model for the differential diagnosis of the two diseases using RNA sequencing (RNA-seq) data from endoscopic biopsy tissue from patients with inflammatory bowel disease (n = 127; CD, 94; UC, 33). Biopsy samples were taken from inflammatory lesions or normal tissues. The RNA-seq dataset was processed via mapping to the human reference genome (GRCh38) and quantifying the corresponding gene models that comprised 19,596 protein-coding genes. An unsupervised learning model showed distinct clusters of four classes: CD inflammatory, CD normal, UC inflammatory, and UC normal. A supervised learning model based on partial least squares discriminant analysis was able to distinguish inflammatory CD from inflammatory UC after pruning the strong classifiers of normal CD vs. normal UC. The error rate was minimal and affected only two components: 20 and 50 genes for the first and second components, respectively. The corresponding overall error rate was 0.147. RNA-seq analysis of tissue and the two components revealed in this study may be helpful for distinguishing CD from UC. MDPI 2021-12-15 /pmc/articles/PMC8700628/ /pubmed/34943601 http://dx.doi.org/10.3390/diagnostics11122365 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Park, Soo-Kyung Kim, Sangsoo Lee, Gi-Young Kim, Sung-Yoon Kim, Wan Lee, Chil-Woo Park, Jong-Lyul Choi, Chang-Hwan Kang, Sang-Bum Kim, Tae-Oh Bang, Ki-Bae Chun, Jaeyoung Cha, Jae-Myung Im, Jong-Pil Ahn, Kwang-Sung Kim, Seon-Young Park, Dong-Il Development of a Machine Learning Model to Distinguish between Ulcerative Colitis and Crohn’s Disease Using RNA Sequencing Data |
title | Development of a Machine Learning Model to Distinguish between Ulcerative Colitis and Crohn’s Disease Using RNA Sequencing Data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8700628/ https://www.ncbi.nlm.nih.gov/pubmed/34943601 http://dx.doi.org/10.3390/diagnostics11122365 |