HEC-ASD: a hybrid ensemble-based classification model for predicting autism spectrum disorder disease genes
PURPOSE: Autism spectrum disorder (ASD) is among the most prevalent disorders today. Its causes are attributed roughly 80% to genetic factors and 20% to environmental factors. Despite this, most current research concerns environmental causes, and only a small proportion...
Main Authors: | Ismail, Eman; Gad, Walaa; Hashem, Mohamed |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9768984/ https://www.ncbi.nlm.nih.gov/pubmed/36544099 http://dx.doi.org/10.1186/s12859-022-05099-7 |
_version_ | 1784854287989342208 |
---|---|
author | Ismail, Eman; Gad, Walaa; Hashem, Mohamed |
author_sort | Ismail, Eman |
collection | PubMed |
description | PURPOSE: Autism spectrum disorder (ASD) is among the most prevalent disorders today. Its causes are attributed roughly 80% to genetic factors and 20% to environmental factors. Despite this, most current research concerns environmental causes, and only a small proportion addresses the genetic causes. ASD is a complex disorder, which makes it difficult to identify the genes involved. METHODS: A hybrid ensemble-based classification (HEC-ASD) model for predicting ASD genes using gradient boosting machines is proposed. The model uses the Gene Ontology (GO) to construct a gene functional similarity matrix with the hybrid gene similarity (HGS) method. HGS measures the semantic similarity between genes by combining a graph-based method, the Wang method, with the number of direct child nodes of each gene's GO terms. An ensemble gradient boosting classifier is then adapted to enhance gene prediction, forming a robust classification model. RESULTS: The proposed model is evaluated on the Simons Foundation Autism Research Initiative (SFARI) gene database. The experimental results are promising: they improve classification performance for predicting ASD genes. The results are compared with other approaches that use a gene regulatory network (GRN), a protein-protein interaction (PPI) network, or GO. The HEC-ASD model reaches the highest prediction accuracy, 0.88, using ensemble learning classifiers. CONCLUSION: The proposed model demonstrates that ensemble learning with gradient boosting is effective in predicting ASD genes. Moreover, the HEC-ASD model relies on GO rather than on a PPI network or GRN. |
format | Online Article Text |
id | pubmed-9768984 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9768984 2022-12-22 HEC-ASD: a hybrid ensemble-based classification model for predicting autism spectrum disorder disease genes Ismail, Eman; Gad, Walaa; Hashem, Mohamed BMC Bioinformatics Research BioMed Central 2022-12-21 /pmc/articles/PMC9768984/ /pubmed/36544099 http://dx.doi.org/10.1186/s12859-022-05099-7 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | HEC-ASD: a hybrid ensemble-based classification model for predicting autism spectrum disorder disease genes |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9768984/ https://www.ncbi.nlm.nih.gov/pubmed/36544099 http://dx.doi.org/10.1186/s12859-022-05099-7 |
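The description names the Wang method and the count of direct child nodes of a GO term as the two ingredients of HGS. The sketch below illustrates only those two ingredients on a made-up five-term ontology. It is not the authors' implementation: the toy graph, the term names, and the function names are hypothetical, the edge weights 0.8 (is_a) and 0.6 (part_of) are the values commonly used with the Wang method, and the way HGS combines the two quantities is not stated in this record, so no combination is attempted.

```python
from typing import Dict, List, Tuple

# Hypothetical toy ontology (not real GO): child term -> [(parent term, relation)].
TOY_GO: Dict[str, List[Tuple[str, str]]] = {
    "root": [],
    "A": [("root", "is_a")],
    "B": [("root", "is_a")],
    "C": [("A", "is_a"), ("B", "part_of")],
    "D": [("A", "is_a")],
}

# Semantic contribution factors commonly used with the Wang method (assumed here).
WEIGHTS = {"is_a": 0.8, "part_of": 0.6}


def s_values(term: str) -> Dict[str, float]:
    """Return the S-value of every ancestor of `term` (and of `term` itself).

    S(term) = 1, and each ancestor receives the largest weighted
    contribution reachable along any path from `term`.
    """
    s = {term: 1.0}
    stack = [term]
    while stack:
        child = stack.pop()
        for parent, relation in TOY_GO[child]:
            candidate = WEIGHTS[relation] * s[child]
            if candidate > s.get(parent, 0.0):
                s[parent] = candidate
                stack.append(parent)  # revisit: the parent's value improved
    return s


def wang_similarity(a: str, b: str) -> float:
    """Wang semantic similarity between two terms of the toy ontology."""
    sa, sb = s_values(a), s_values(b)
    shared = sa.keys() & sb.keys()
    numerator = sum(sa[t] + sb[t] for t in shared)
    return numerator / (sum(sa.values()) + sum(sb.values()))


def direct_children(term: str) -> List[str]:
    """Terms that list `term` as an immediate parent."""
    return sorted(
        child
        for child, parents in TOY_GO.items()
        if any(parent == term for parent, _ in parents)
    )


if __name__ == "__main__":
    print(round(wang_similarity("C", "D"), 4))
    print(direct_children("A"))
```

In a real pipeline the same functions would run over GO terms annotated to each gene, and the resulting gene-by-gene similarity matrix would feed the classifier; the aggregation from term-level to gene-level similarity is likewise left out because the record does not describe it.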