An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers
Pulmonary fibrosing diseases are at the epicenter of biomedical research due both to their increasing prevalence and to their association with SARS-CoV-2 infections. Research on idiopathic pulmonary fibrosis, the most lethal of the interstitial lung diseases, is in need of new biomarkers and p...
Main authors: | Fanidis, Dionysios; Pezoulas, Vasileios C.; Fotiadis, Dimitrios Ι.; Aidinis, Vassilis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology, 2023 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10049879/ https://www.ncbi.nlm.nih.gov/pubmed/37007651 http://dx.doi.org/10.1016/j.csbj.2023.03.043 |
_version_ | 1785014554279804928 |
---|---|
author | Fanidis, Dionysios; Pezoulas, Vasileios C.; Fotiadis, Dimitrios Ι.; Aidinis, Vassilis |
author_facet | Fanidis, Dionysios; Pezoulas, Vasileios C.; Fotiadis, Dimitrios Ι.; Aidinis, Vassilis |
author_sort | Fanidis, Dionysios |
collection | PubMed |
description | Pulmonary fibrosing diseases are at the epicenter of biomedical research due both to their increasing prevalence and to their association with SARS-CoV-2 infections. Research on idiopathic pulmonary fibrosis, the most lethal of the interstitial lung diseases, is in need of new biomarkers and potential disease targets, a goal that could be accelerated using machine learning techniques. In this study, we used Shapley values to explain the decisions made by an ensemble learning model trained to classify samples as either pulmonary fibrosis or steady state based on the expression values of deregulated genes. This process resulted in a full and a laconic set of features capable of separating phenotypes at least as well as previously published marker sets. Indicatively, maximum increases of 6% in specificity and 5% in the Matthews correlation coefficient were achieved. Evaluation with an additional independent dataset showed that our feature set has greater generalization potential than the others. Ultimately, the proposed gene lists are expected to serve not only as new sets of diagnostic marker elements, but also as a target pool for future research initiatives. |
format | Online Article Text |
id | pubmed-10049879 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-10049879 2023-03-29 An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers Fanidis, Dionysios Pezoulas, Vasileios C. Fotiadis, Dimitrios Ι. Aidinis, Vassilis Comput Struct Biotechnol J Research Article Pulmonary fibrosing diseases are at the epicenter of biomedical research due both to their increasing prevalence and to their association with SARS-CoV-2 infections. Research on idiopathic pulmonary fibrosis, the most lethal of the interstitial lung diseases, is in need of new biomarkers and potential disease targets, a goal that could be accelerated using machine learning techniques. In this study, we used Shapley values to explain the decisions made by an ensemble learning model trained to classify samples as either pulmonary fibrosis or steady state based on the expression values of deregulated genes. This process resulted in a full and a laconic set of features capable of separating phenotypes at least as well as previously published marker sets. Indicatively, maximum increases of 6% in specificity and 5% in the Matthews correlation coefficient were achieved. Evaluation with an additional independent dataset showed that our feature set has greater generalization potential than the others. Ultimately, the proposed gene lists are expected to serve not only as new sets of diagnostic marker elements, but also as a target pool for future research initiatives. Research Network of Computational and Structural Biotechnology 2023-03-25 /pmc/articles/PMC10049879/ /pubmed/37007651 http://dx.doi.org/10.1016/j.csbj.2023.03.043 Text en © 2023 The Authors |
spellingShingle | Research Article Fanidis, Dionysios Pezoulas, Vasileios C. Fotiadis, Dimitrios Ι. Aidinis, Vassilis An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers |
title | An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers |
title_full | An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers |
title_fullStr | An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers |
title_full_unstemmed | An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers |
title_short | An explainable machine learning-driven proposal of pulmonary fibrosis biomarkers |
title_sort | explainable machine learning-driven proposal of pulmonary fibrosis biomarkers |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10049879/ https://www.ncbi.nlm.nih.gov/pubmed/37007651 http://dx.doi.org/10.1016/j.csbj.2023.03.043 |
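The description above reports gains in specificity and in the Matthews correlation coefficient (MCC). As a reading aid, the following is a minimal sketch of how these two metrics are computed from binary confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the study; the convention of returning 0.0 when the MCC denominator is zero is also an assumption made here.

```python
from math import sqrt


def specificity(tn: int, fp: int) -> float:
    """True-negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero (an assumed convention
    for degenerate confusion matrices).
    """
    numerator = tp * tn - fp * fn
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not data from the study).
    tp, tn, fp, fn = 45, 40, 10, 5
    print(f"specificity = {specificity(tn, fp):.3f}")
    print(f"MCC         = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which is why it is often reported alongside specificity when the two classes are imbalanced.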