Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers
Non-alcoholic fatty liver disease (NAFLD) is a chronic liver disease that presents a great challenge for treatment and prevention. This study aims to implement a machine learning approach that employs such datasets to identify potential biomarker targets. We developed a pipeline to identify potenti...
Main authors: | Shafiha, Roshan; Bahcivanci, Basak; Gkoutos, Georgios V.; Acharjee, Animesh |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8615894/ https://www.ncbi.nlm.nih.gov/pubmed/34829865 http://dx.doi.org/10.3390/biomedicines9111636 |
_version_ | 1784604216066572288 |
---|---|
author | Shafiha, Roshan; Bahcivanci, Basak; Gkoutos, Georgios V.; Acharjee, Animesh |
author_facet | Shafiha, Roshan; Bahcivanci, Basak; Gkoutos, Georgios V.; Acharjee, Animesh |
author_sort | Shafiha, Roshan |
collection | PubMed |
description | Non-alcoholic fatty liver disease (NAFLD) is a chronic liver disease that presents a great challenge for treatment and prevention. This study aims to implement a machine learning approach that employs such datasets to identify potential biomarker targets. We developed a pipeline to identify potential biomarkers for NAFLD that comprises five major processes: a pre-processing step, feature selection, generation of a random forest model, downstream feature analysis and, finally, provision of a potential biological interpretation. The pre-processing step includes data normalisation and variable extraction accompanied by appropriate annotations. Feature selection based on a differential gene expression analysis is then conducted to identify significant features, which are used to generate a random forest model whose performance is assessed with a receiver operating characteristic curve. Next, the features are subjected to downstream analyses, such as univariate analysis, pathway enrichment analysis, network analysis and the generation of correlation plots, boxplots and heatmaps. Once the results are obtained, biological interpretation and literature validation are conducted over the identified features and results. We applied this pipeline to transcriptomic and lipidomic datasets and concluded that the C4BPA gene could play a role in the development of NAFLD. The activation of the complement pathway, due to the downregulation of the C4BPA gene, leads to an increase in triglyceride content, which might in turn affect lipid metabolism. This approach identified the C4BPA gene, an inhibitor of the complement pathway, as a potential biomarker for the development of NAFLD. |
format | Online Article Text |
id | pubmed-8615894 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8615894 2021-11-26 Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers Shafiha, Roshan; Bahcivanci, Basak; Gkoutos, Georgios V.; Acharjee, Animesh Biomedicines Article MDPI 2021-11-07 /pmc/articles/PMC8615894/ /pubmed/34829865 http://dx.doi.org/10.3390/biomedicines9111636 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Shafiha, Roshan; Bahcivanci, Basak; Gkoutos, Georgios V.; Acharjee, Animesh Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers |
title | Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers |
title_full | Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers |
title_fullStr | Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers |
title_full_unstemmed | Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers |
title_short | Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers |
title_sort | machine learning-based identification of potentially novel non-alcoholic fatty liver disease biomarkers |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8615894/ https://www.ncbi.nlm.nih.gov/pubmed/34829865 http://dx.doi.org/10.3390/biomedicines9111636 |
work_keys_str_mv | AT shafiharoshan machinelearningbasedidentificationofpotentiallynovelnonalcoholicfattyliverdiseasebiomarkers AT bahcivancibasak machinelearningbasedidentificationofpotentiallynovelnonalcoholicfattyliverdiseasebiomarkers AT gkoutosgeorgiosv machinelearningbasedidentificationofpotentiallynovelnonalcoholicfattyliverdiseasebiomarkers AT acharjeeanimesh machinelearningbasedidentificationofpotentiallynovelnonalcoholicfattyliverdiseasebiomarkers |
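
The core of the pipeline summarised in the description field (feature selection by differential expression, a random forest model, assessment with a receiver operating characteristic curve) can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation. The synthetic data, the Welch t-statistic used for ranking, the bagged depth-1 trees standing in for a full random forest, and every name and parameter below (`SIGNAL`, `top_k`, `max_features`, `n_trees`) are assumptions made for illustration only.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic comparing the means of two samples."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def select_features(X, y, top_k):
    """Return the indices of the top_k features ranked by |t| between classes."""
    ranked = []
    for j in range(len(X[0])):
        cases = [row[j] for row, label in zip(X, y) if label == 1]
        controls = [row[j] for row, label in zip(X, y) if label == 0]
        ranked.append((abs(welch_t(cases, controls)), j))
    ranked.sort(reverse=True)
    return [j for _, j in ranked[:top_k]]


def fit_stump(X, y, feature):
    """Depth-1 tree: threshold and direction on one feature with fewest training errors."""
    best = None
    for threshold in sorted({row[feature] for row in X}):
        for sign in (1, -1):
            preds = [1 if sign * (row[feature] - threshold) > 0 else 0 for row in X]
            errors = sum(p != label for p, label in zip(preds, y))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, sign)
    return best


def fit_forest(X, y, features, n_trees, max_features, rng):
    """Bagged stumps: each tree sees a bootstrap sample and a random feature subset."""
    forest = []
    n = len(X)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        candidates = rng.sample(features, max_features)
        forest.append(min((fit_stump(Xb, yb, f) for f in candidates), key=lambda s: s[0]))
    return forest


def predict_score(forest, row):
    """Fraction of trees voting for the case class (label 1)."""
    votes = sum(1 for _, f, t, s in forest if s * (row[f] - t) > 0)
    return votes / len(forest)


def roc_auc(scores, labels):
    """Area under the ROC curve: chance that a case outscores a control (ties count 0.5)."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic expression matrix (assumption): 60 samples, 20 features, and one
# feature (index SIGNAL) that is lower in cases than in controls.
rng = random.Random(0)
N_SAMPLES, N_FEATURES, SIGNAL = 60, 20, 3
y = [i % 2 for i in range(N_SAMPLES)]
X = []
for label in y:
    row = [rng.gauss(0, 1) for _ in range(N_FEATURES)]
    row[SIGNAL] -= 4.0 * label
    X.append(row)

# Select features and fit on the first 40 samples, assess on the remaining 20.
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]
selected = select_features(X_train, y_train, top_k=3)
forest = fit_forest(X_train, y_train, selected, n_trees=50, max_features=2, rng=rng)
auc = roc_auc([predict_score(forest, row) for row in X_test], y_test)
print("selected features:", selected)
print("test AUC:", round(auc, 3))
```

Feature selection is run on the training samples only, so the held-out AUC is not inflated by information from the test set. In practice a library implementation of random forests with deeper trees would replace the stumps used here.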