Comparison of machine-learning methodologies for accurate diagnosis of sepsis using microarray gene expression data

We investigate the feasibility of molecular-level sample classification of sepsis using microarray gene expression data merged by in silico meta-analysis. Publicly available data series were extracted from NCBI Gene Expression Omnibus and EMBL-EBI ArrayExpress to create a comprehensive meta-analysis...

Bibliographic Details
Main authors: Schaack, Dominik, Weigand, Markus A., Uhle, Florian
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8128240/
https://www.ncbi.nlm.nih.gov/pubmed/33999966
http://dx.doi.org/10.1371/journal.pone.0251800
author Schaack, Dominik
Weigand, Markus A.
Uhle, Florian
collection PubMed
description We investigate the feasibility of molecular-level sample classification of sepsis using microarray gene expression data merged by in silico meta-analysis. Publicly available data series were extracted from NCBI Gene Expression Omnibus and EMBL-EBI ArrayExpress to create a comprehensive meta-analysis microarray expression set (meta-expression set). Measurements had to be obtained by microarray technique from whole blood samples of adult or pediatric patients with sepsis, diagnosed according to the international consensus definition immediately after admission to the intensive care unit. We aggregate trauma patients, systemic inflammatory response syndrome (SIRS) patients, and healthy controls into a single non-septic entity. Differential expression (DE) analysis is compared with machine-learning-based solutions: decision tree (DT), random forest (RF), support vector machine (SVM), and deep-learning neural networks (DNNs). We evaluated classifier training and discrimination performance in 100 independent iterations. To test diagnostic resilience, we gradually degraded the expression data at multiple levels. Clustering of expression values based on DE genes results in only partial identification of sepsis samples. In contrast, RF, SVM, and DNN provide excellent diagnostic performance measured in terms of accuracy and area under the curve (>0.96 and >0.99, respectively). DNNs prove to be the most resilient methodology, virtually unaffected by targeted removal of DE genes. By surpassing most other published solutions, the presented approach substantially augments current diagnostic capability in intensive care medicine.
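The evaluation scheme summarized in the description (repeated independent train/test iterations, accuracy and area under the curve as metrics, and targeted removal of informative genes) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the data are synthetic, a nearest-centroid rule stands in for the RF, SVM, and DNN classifiers, "removal of DE genes" is imitated by dropping the leading columns (which are the informative ones in this synthetic set), and all function and parameter names are hypothetical.

```python
import random
import statistics


def make_synthetic(n_per_class, n_genes, n_informative, shift, seed):
    """Synthetic 'expression' matrix: class-1 rows are shifted on the first n_informative genes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in range(n_informative):
                    row[g] += shift
            X.append(row)
            y.append(label)
    return X, y


def centroid(rows):
    """Column-wise mean of a list of rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def score(row, c0, c1):
    """Positive when the row is closer to the class-1 centroid than to the class-0 centroid."""
    d0 = sum((a - b) ** 2 for a, b in zip(row, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(row, c1))
    return d0 - d1


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(X, y, iterations=100, test_fraction=0.3, n_removed=0, seed=0):
    """Mean accuracy and mean AUC over repeated stratified random splits.

    n_removed drops the first n_removed columns before training and testing
    (a stand-in for targeted removal of informative genes).
    """
    rng = random.Random(seed)
    X = [row[n_removed:] for row in X]
    by_class = {c: [i for i, lab in enumerate(y) if lab == c] for c in (0, 1)}
    accs, aucs = [], []
    for _ in range(iterations):
        train, test = [], []
        for c in (0, 1):  # stratified split: sample each class separately
            idx = by_class[c][:]
            rng.shuffle(idx)
            n_test = max(1, int(len(idx) * test_fraction))
            test += idx[:n_test]
            train += idx[n_test:]
        c0 = centroid([X[i] for i in train if y[i] == 0])
        c1 = centroid([X[i] for i in train if y[i] == 1])
        scores = [score(X[i], c0, c1) for i in test]
        labels = [y[i] for i in test]
        preds = [1 if s > 0 else 0 for s in scores]
        accs.append(sum(p == t for p, t in zip(preds, labels)) / len(labels))
        aucs.append(auc(scores, labels))
    return statistics.mean(accs), statistics.mean(aucs)


if __name__ == "__main__":
    X, y = make_synthetic(n_per_class=40, n_genes=50, n_informative=10, shift=2.0, seed=1)
    for k in (0, 5, 10):
        acc, roc = evaluate(X, y, n_removed=k)
        print(f"removed={k:2d}  accuracy={acc:.3f}  AUC={roc:.3f}")
```

Averaging over many random splits, rather than reporting a single split, reduces the dependence of the reported metrics on one particular partition of the samples. Removing all informative columns should push this simple classifier toward chance level, which is the kind of sensitivity the resilience test in the description is designed to expose.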
format Online
Article
Text
id pubmed-8128240
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8128240 2021-05-27 Comparison of machine-learning methodologies for accurate diagnosis of sepsis using microarray gene expression data Schaack, Dominik Weigand, Markus A. Uhle, Florian PLoS One Research Article
Public Library of Science 2021-05-17 /pmc/articles/PMC8128240/ /pubmed/33999966 http://dx.doi.org/10.1371/journal.pone.0251800 Text en © 2021 Schaack et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Comparison of machine-learning methodologies for accurate diagnosis of sepsis using microarray gene expression data
topic Research Article