Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms
Objectives: The identification of subgroups of autism spectrum disorder (ASD) may partially remedy the problems of clinical heterogeneity to facilitate the improvement of clinical management. The current study aims to use machine learning algorithms to analyze microarray data to identify clusters wi...
Main authors: | Lin, Ping-I; Moni, Mohammad Ali; Gau, Susan Shur-Fen; Eapen, Valsamma |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Psychiatry |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8149626/ https://www.ncbi.nlm.nih.gov/pubmed/34054599 http://dx.doi.org/10.3389/fpsyt.2021.637022 |
_version_ | 1783697986413395968 |
---|---|
author | Lin, Ping-I Moni, Mohammad Ali Gau, Susan Shur-Fen Eapen, Valsamma |
author_facet | Lin, Ping-I Moni, Mohammad Ali Gau, Susan Shur-Fen Eapen, Valsamma |
author_sort | Lin, Ping-I |
collection | PubMed |
description | Objectives: The identification of subgroups of autism spectrum disorder (ASD) may partially remedy the problems of clinical heterogeneity to facilitate the improvement of clinical management. The current study aims to use machine learning algorithms to analyze microarray data to identify clusters with relatively homogeneous clinical features. Methods: The whole-genome gene expression microarray data were used to predict communication quotient (SCQ) scores against all probes to select differential expression regions (DERs). Gene set enrichment analysis was performed for DERs with a fold-change >2 to identify hub pathways that play a role in the severity of social communication deficits inherent to ASD. We then used two machine learning methods, random forest classification (RF) and support vector machine (SVM), to identify two clusters using DERs. Finally, we evaluated how accurately the clusters predicted language impairment. Results: A total of 191 DERs were initially identified, and 54 of them with a fold-change >2 were selected for the pathway analysis. Cholesterol biosynthesis and metabolism pathways appear to act as hubs that connect other trait-associated pathways to influence the severity of social communication deficits inherent to ASD. Both RF and SVM algorithms can yield a classification accuracy level >90% when all 191 DERs were analyzed. The ASD subtypes defined by the presence of language impairment, a strong indicator for prognosis, can be predicted by transcriptomic profiles associated with social communication deficits and cholesterol biosynthesis and metabolism. Conclusion: The results suggest that both RF and SVM are acceptable options for machine learning algorithms to identify ASD subgroups characterized by clinical homogeneity related to prognosis. |
format | Online Article Text |
id | pubmed-8149626 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8149626 2021-05-27 Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms Lin, Ping-I Moni, Mohammad Ali Gau, Susan Shur-Fen Eapen, Valsamma Front Psychiatry Psychiatry Objectives: The identification of subgroups of autism spectrum disorder (ASD) may partially remedy the problems of clinical heterogeneity to facilitate the improvement of clinical management. The current study aims to use machine learning algorithms to analyze microarray data to identify clusters with relatively homogeneous clinical features. Methods: The whole-genome gene expression microarray data were used to predict communication quotient (SCQ) scores against all probes to select differential expression regions (DERs). Gene set enrichment analysis was performed for DERs with a fold-change >2 to identify hub pathways that play a role in the severity of social communication deficits inherent to ASD. We then used two machine learning methods, random forest classification (RF) and support vector machine (SVM), to identify two clusters using DERs. Finally, we evaluated how accurately the clusters predicted language impairment. Results: A total of 191 DERs were initially identified, and 54 of them with a fold-change >2 were selected for the pathway analysis. Cholesterol biosynthesis and metabolism pathways appear to act as hubs that connect other trait-associated pathways to influence the severity of social communication deficits inherent to ASD. Both RF and SVM algorithms can yield a classification accuracy level >90% when all 191 DERs were analyzed. The ASD subtypes defined by the presence of language impairment, a strong indicator for prognosis, can be predicted by transcriptomic profiles associated with social communication deficits and cholesterol biosynthesis and metabolism. Conclusion: The results suggest that both RF and SVM are acceptable options for machine learning algorithms to identify ASD subgroups characterized by clinical homogeneity related to prognosis. Frontiers Media S.A. 2021-05-12 /pmc/articles/PMC8149626/ /pubmed/34054599 http://dx.doi.org/10.3389/fpsyt.2021.637022 Text en Copyright © 2021 Lin, Moni, Gau and Eapen. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Psychiatry Lin, Ping-I Moni, Mohammad Ali Gau, Susan Shur-Fen Eapen, Valsamma Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms |
title | Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms |
title_full | Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms |
title_fullStr | Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms |
title_full_unstemmed | Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms |
title_short | Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms |
title_sort | identifying subgroups of patients with autism by gene expression profiles using machine learning algorithms |
topic | Psychiatry |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8149626/ https://www.ncbi.nlm.nih.gov/pubmed/34054599 http://dx.doi.org/10.3389/fpsyt.2021.637022 |
work_keys_str_mv | AT linpingi identifyingsubgroupsofpatientswithautismbygeneexpressionprofilesusingmachinelearningalgorithms AT monimohammadali identifyingsubgroupsofpatientswithautismbygeneexpressionprofilesusingmachinelearningalgorithms AT gaususanshurfen identifyingsubgroupsofpatientswithautismbygeneexpressionprofilesusingmachinelearningalgorithms AT eapenvalsamma identifyingsubgroupsofpatientswithautismbygeneexpressionprofilesusingmachinelearningalgorithms |