Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms

Objectives: The identification of subgroups of autism spectrum disorder (ASD) may partially remedy the problems of clinical heterogeneity to facilitate the improvement of clinical management. The current study aims to use machine learning algorithms to analyze microarray data to identify clusters wi...

Bibliographic Details
Main Authors: Lin, Ping-I, Moni, Mohammad Ali, Gau, Susan Shur-Fen, Eapen, Valsamma
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Psychiatry
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8149626/
https://www.ncbi.nlm.nih.gov/pubmed/34054599
http://dx.doi.org/10.3389/fpsyt.2021.637022
author Lin, Ping-I
Moni, Mohammad Ali
Gau, Susan Shur-Fen
Eapen, Valsamma
collection PubMed
description Objectives: The identification of subgroups of autism spectrum disorder (ASD) may partially remedy the problems of clinical heterogeneity to facilitate the improvement of clinical management. The current study aims to use machine learning algorithms to analyze microarray data to identify clusters with relatively homogeneous clinical features. Methods: Whole-genome gene expression microarray data were used to predict communication quotient (SCQ) scores against all probes to select differential expression regions (DERs). Gene set enrichment analysis was performed for DERs with a fold-change >2 to identify hub pathways that play a role in the severity of social communication deficits inherent to ASD. We then used two machine learning methods, random forest classification (RF) and support vector machine (SVM), to identify two clusters using DERs. Finally, we evaluated how accurately the clusters predicted language impairment. Results: A total of 191 DERs were initially identified, and 54 of them with a fold-change >2 were selected for the pathway analysis. Cholesterol biosynthesis and metabolism pathways appear to act as hubs that connect other trait-associated pathways to influence the severity of social communication deficits inherent to ASD. Both RF and SVM algorithms yielded a classification accuracy >90% when all 191 DERs were analyzed. The ASD subtypes defined by the presence of language impairment, a strong indicator for prognosis, can be predicted by transcriptomic profiles associated with social communication deficits and cholesterol biosynthesis and metabolism. Conclusion: The results suggest that both RF and SVM are acceptable options for machine learning algorithms to identify ASD subgroups characterized by clinical homogeneity related to prognosis.
format Online
Article
Text
id pubmed-8149626
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8149626 2021-05-27 Front Psychiatry Psychiatry Frontiers Media S.A. 2021-05-12 /pmc/articles/PMC8149626/ /pubmed/34054599 http://dx.doi.org/10.3389/fpsyt.2021.637022 Text en Copyright © 2021 Lin, Moni, Gau and Eapen. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identifying Subgroups of Patients With Autism by Gene Expression Profiles Using Machine Learning Algorithms
topic Psychiatry
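
The abstract in this record mentions keeping only DERs with a fold-change >2 before pathway analysis. The sketch below illustrates that kind of filter and is not the authors' code. It rests on several assumptions that the record does not confirm: fold-change is taken as the ratio of mean expression between two groups, values are positive and on a linear scale, and changes in either direction count. The probe names, the data, and the function names are hypothetical.

```python
from statistics import mean


def fold_change(group_a, group_b):
    """Ratio of mean expression between two groups, reported as a value >= 1.

    Assumption: values are positive and on a linear (not log) scale.
    """
    ratio = mean(group_a) / mean(group_b)
    # Treat up- and down-regulation symmetrically (an assumption of this sketch).
    return ratio if ratio >= 1 else 1 / ratio


def select_by_fold_change(expression, threshold=2.0):
    """Return the probe IDs whose fold-change exceeds the threshold."""
    return [
        probe
        for probe, (group_a, group_b) in expression.items()
        if fold_change(group_a, group_b) > threshold
    ]


# Hypothetical toy data: probe ID -> (values in group A, values in group B)
toy_expression = {
    "probe_1": ([8.0, 10.0, 12.0], [4.0, 5.0, 3.0]),  # group means 10 and 4
    "probe_2": ([5.0, 6.0, 7.0], [5.0, 5.0, 5.0]),    # group means 6 and 5
    "probe_3": ([1.0, 2.0, 3.0], [6.0, 7.0, 8.0]),    # group means 2 and 7
}

selected = select_by_fold_change(toy_expression)
print(selected)
```

With this toy input, only the probes whose group means differ by more than a factor of two are kept. A real analysis would also need the statistical test that defines the DERs in the first place, which this sketch does not attempt.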