A computational workflow for the detection of candidate diagnostic biomarkers of Kawasaki disease using time-series gene expression data

Unlike autoimmune diseases, systemic autoinflammatory diseases (SAIDs) have no known constitutive, disease-defining biomarker. Kawasaki disease (KD) is one of the “undiagnosed” types of SAIDs, whose pathogenic mechanism and gene mutation remain unknown. To address this issue, we have...

Bibliographic Details
Main Authors: Pezoulas, Vasileios C., Papaloukas, Costas, Veyssiere, Maëva, Goules, Andreas, Tzioufas, Athanasios G., Soumelis, Vassili, Fotiadis, Dimitrios I.
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8178098/
https://www.ncbi.nlm.nih.gov/pubmed/34136104
http://dx.doi.org/10.1016/j.csbj.2021.05.036
author Pezoulas, Vasileios C.
Papaloukas, Costas
Veyssiere, Maëva
Goules, Andreas
Tzioufas, Athanasios G.
Soumelis, Vassili
Fotiadis, Dimitrios I.
author_sort Pezoulas, Vasileios C.
collection PubMed
description Unlike autoimmune diseases, systemic autoinflammatory diseases (SAIDs) have no known constitutive, disease-defining biomarker. Kawasaki disease (KD) is one of the “undiagnosed” types of SAIDs, whose pathogenic mechanism and gene mutation remain unknown. To address this issue, we developed a sequential computational workflow that clusters KD patients with similar gene expression profiles across the three KD phases (Acute, Subacute and Convalescent) and uses the resulting clustermap to detect prominent genes that can serve as diagnostic biomarkers for KD. Self-Organizing Maps (SOMs) were employed to cluster patients with similar gene expression across the three phases through inter-phase and intra-phase clustering. False discovery rate (FDR)-based feature selection was then applied to detect genes that deviate significantly across the per-phase clusters. The results revealed five genes as candidate biomarkers for KD diagnosis: HLA-DQB1, HLA-DRA, ZBTB48, TNFRSF13C, and CASD1. To our knowledge, these five genes are reported for the first time in the literature. The value of the discovered genes for KD diagnosis relative to the known ones was demonstrated by training boosting ensembles (AdaBoost and XGBoost) for KD classification on common-platform and cross-platform datasets. Classifiers trained on the proposed genes from the common-platform data yielded average increases of 4.40% in accuracy, 5.52% in sensitivity, and 3.57% in specificity over the known genes in the Acute and Subacute phases, and increases of 2.30% in accuracy, 2.20% in sensitivity, and 4.70% in specificity in the cross-platform analysis.
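The FDR-based feature selection step mentioned in the description can be illustrated with a minimal sketch. The record does not say which FDR procedure the authors used, so the Benjamini-Hochberg adjustment below is an assumption, and the function names, the gene labels (`GENE_A` to `GENE_E`), the p-values, and the 0.05 threshold are all hypothetical placeholders rather than data from the paper.

```python
from typing import Dict, List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_genes(gene_p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Keep genes whose adjusted p-value is at or below alpha (assumed cutoff)."""
    genes = list(gene_p_values)
    adjusted = benjamini_hochberg([gene_p_values[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q <= alpha]


# Hypothetical per-gene p-values, e.g. from a test comparing per-phase clusters.
example_p_values = {
    "GENE_A": 0.001,
    "GENE_B": 0.008,
    "GENE_C": 0.039,
    "GENE_D": 0.041,
    "GENE_E": 0.60,
}
selected = select_genes(example_p_values)
print(selected)
```

In this toy input, `GENE_C` and `GENE_D` have raw p-values below 0.05 but are dropped after adjustment, which is the behaviour that distinguishes FDR control from a plain per-gene threshold.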
format Online
Article
Text
id pubmed-8178098
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-8178098 2021-06-15 Comput Struct Biotechnol J Research Article Research Network of Computational and Structural Biotechnology 2021-05-24 /pmc/articles/PMC8178098/ /pubmed/34136104 http://dx.doi.org/10.1016/j.csbj.2021.05.036 Text en © 2021 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title A computational workflow for the detection of candidate diagnostic biomarkers of Kawasaki disease using time-series gene expression data
topic Research Article