
Exploiting machine learning models to identify novel Alzheimer’s disease biomarkers and potential targets

We still do not have an effective treatment for Alzheimer's disease (AD) despite it being the most common cause of dementia and impaired cognitive function. Thus, research endeavors are directed toward identifying AD biomarkers and targets. In this regard, we designed a computational method tha...


Bibliographic Details
Main Authors: Alamro, Hind, Thafar, Maha A., Albaradei, Somayah, Gojobori, Takashi, Essack, Magbubah, Gao, Xin
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10043000/
https://www.ncbi.nlm.nih.gov/pubmed/36973386
http://dx.doi.org/10.1038/s41598-023-30904-5
_version_ 1784913055534022656
author Alamro, Hind
Thafar, Maha A.
Albaradei, Somayah
Gojobori, Takashi
Essack, Magbubah
Gao, Xin
author_facet Alamro, Hind
Thafar, Maha A.
Albaradei, Somayah
Gojobori, Takashi
Essack, Magbubah
Gao, Xin
author_sort Alamro, Hind
collection PubMed
description We still do not have an effective treatment for Alzheimer's disease (AD), despite it being the most common cause of dementia and impaired cognitive function. Thus, research endeavors are directed toward identifying AD biomarkers and targets. In this regard, we designed a computational method that exploits multiple hub gene ranking methods and feature selection methods with machine learning and deep learning to identify biomarkers and targets. First, we used three AD gene expression datasets to identify (1) hub genes based on six ranking algorithms (Degree, Maximum Neighborhood Component (MNC), Maximal Clique Centrality (MCC), Betweenness Centrality (BC), Closeness Centrality, and Stress Centrality) and (2) gene subsets based on two feature selection methods (LASSO and Ridge). Then, we developed machine learning and deep learning models to determine the gene subset that best distinguishes AD samples from healthy controls. This work shows that the gene subsets chosen by feature selection methods achieve better prediction performance than the hub gene sets. Beyond this, the five genes identified by both feature selection methods (LASSO and Ridge) achieved an AUC of 0.979. We further show that 70% of the upregulated hub genes (among the 28 overlapping hub genes) are AD targets based on a literature review, and that six miRNAs (hsa-mir-16-5p, hsa-mir-34a-5p, hsa-mir-1-3p, hsa-mir-26a-5p, hsa-mir-93-5p, hsa-mir-155-5p) and one transcription factor, JUN, are associated with the upregulated hub genes. Furthermore, since 2020, four of the six miRNAs have also been shown to be potential AD targets. To our knowledge, this is the first work showing that such a small number of genes can distinguish AD samples from healthy controls with high accuracy and that overlapping upregulated hub genes can narrow the search space for potential novel targets.
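The idea of ranking hub genes with several algorithms and keeping only the genes they agree on can be illustrated with a minimal sketch. It is not the authors' implementation: it covers just two of the six ranking algorithms named in the abstract (Degree and Closeness Centrality), the interaction network and the gene labels "A" to "G" are hypothetical placeholders, and the function names are invented for this illustration.

```python
from collections import deque

# Hypothetical toy interaction network (undirected edges).
# The labels "A".."G" are placeholders, not genes from the study.
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
    ("B", "C"), ("D", "F"), ("F", "G"),
]


def build_adjacency(edges):
    """Turn an edge list into a dict mapping each node to its neighbor set."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_scores(adj):
    """Degree: the number of direct neighbors of each node."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def closeness_scores(adj):
    """Closeness: (reachable nodes) / (sum of shortest-path distances).

    Distances come from a breadth-first search, so edges are unweighted.
    """
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def top_k(scores, k):
    """Return the k best-scoring nodes; ties are broken alphabetically."""
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]


def overlapping_hubs(adj, k):
    """Nodes that appear in the top k of both rankings."""
    by_degree = set(top_k(degree_scores(adj), k))
    by_closeness = set(top_k(closeness_scores(adj), k))
    return sorted(by_degree & by_closeness)


if __name__ == "__main__":
    network = build_adjacency(EDGES)
    print("Top 2 by degree:   ", top_k(degree_scores(network), 2))
    print("Top 2 by closeness:", top_k(closeness_scores(network), 2))
    print("Overlapping hubs:  ", overlapping_hubs(network, 2))
```

Taking the intersection of the rankings is what shrinks the candidate list: a node counts as an overlapping hub only when every ranking places it near the top. The cutoff `k` and the alphabetical tie-breaking rule are choices made for this sketch only.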
format Online
Article
Text
id pubmed-10043000
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10043000 2023-03-29 Exploiting machine learning models to identify novel Alzheimer’s disease biomarkers and potential targets Alamro, Hind Thafar, Maha A. Albaradei, Somayah Gojobori, Takashi Essack, Magbubah Gao, Xin Sci Rep Article We still do not have an effective treatment for Alzheimer's disease (AD), despite it being the most common cause of dementia and impaired cognitive function. Thus, research endeavors are directed toward identifying AD biomarkers and targets. In this regard, we designed a computational method that exploits multiple hub gene ranking methods and feature selection methods with machine learning and deep learning to identify biomarkers and targets. First, we used three AD gene expression datasets to identify (1) hub genes based on six ranking algorithms (Degree, Maximum Neighborhood Component (MNC), Maximal Clique Centrality (MCC), Betweenness Centrality (BC), Closeness Centrality, and Stress Centrality) and (2) gene subsets based on two feature selection methods (LASSO and Ridge). Then, we developed machine learning and deep learning models to determine the gene subset that best distinguishes AD samples from healthy controls. This work shows that the gene subsets chosen by feature selection methods achieve better prediction performance than the hub gene sets. Beyond this, the five genes identified by both feature selection methods (LASSO and Ridge) achieved an AUC of 0.979. We further show that 70% of the upregulated hub genes (among the 28 overlapping hub genes) are AD targets based on a literature review, and that six miRNAs (hsa-mir-16-5p, hsa-mir-34a-5p, hsa-mir-1-3p, hsa-mir-26a-5p, hsa-mir-93-5p, hsa-mir-155-5p) and one transcription factor, JUN, are associated with the upregulated hub genes. Furthermore, since 2020, four of the six miRNAs have also been shown to be potential AD targets. To our knowledge, this is the first work showing that such a small number of genes can distinguish AD samples from healthy controls with high accuracy and that overlapping upregulated hub genes can narrow the search space for potential novel targets. Nature Publishing Group UK 2023-03-27 /pmc/articles/PMC10043000/ /pubmed/36973386 http://dx.doi.org/10.1038/s41598-023-30904-5 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Alamro, Hind
Thafar, Maha A.
Albaradei, Somayah
Gojobori, Takashi
Essack, Magbubah
Gao, Xin
Exploiting machine learning models to identify novel Alzheimer’s disease biomarkers and potential targets
title Exploiting machine learning models to identify novel Alzheimer’s disease biomarkers and potential targets
title_full Exploiting machine learning models to identify novel Alzheimer’s disease biomarkers and potential targets
title_fullStr Exploiting machine learning models to identify novel Alzheimer’s disease biomarkers and potential targets
title_full_unstemmed Exploiting machine learning models to identify novel Alzheimer’s disease biomarkers and potential targets
title_short Exploiting machine learning models to identify novel Alzheimer’s disease biomarkers and potential targets
title_sort exploiting machine learning models to identify novel alzheimer’s disease biomarkers and potential targets
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10043000/
https://www.ncbi.nlm.nih.gov/pubmed/36973386
http://dx.doi.org/10.1038/s41598-023-30904-5
work_keys_str_mv AT alamrohind exploitingmachinelearningmodelstoidentifynovelalzheimersdiseasebiomarkersandpotentialtargets
AT thafarmahaa exploitingmachinelearningmodelstoidentifynovelalzheimersdiseasebiomarkersandpotentialtargets
AT albaradeisomayah exploitingmachinelearningmodelstoidentifynovelalzheimersdiseasebiomarkersandpotentialtargets
AT gojoboritakashi exploitingmachinelearningmodelstoidentifynovelalzheimersdiseasebiomarkersandpotentialtargets
AT essackmagbubah exploitingmachinelearningmodelstoidentifynovelalzheimersdiseasebiomarkersandpotentialtargets
AT gaoxin exploitingmachinelearningmodelstoidentifynovelalzheimersdiseasebiomarkersandpotentialtargets