
A Machine Learning Approach to Predicting Autism Risk Genes: Validation of Known Genes and Discovery of New Candidates


Bibliographic Details
Main Authors: Lin, Ying; Afshar, Shiva; Rajadhyaksha, Anjali M.; Potash, James B.; Han, Shizhong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7513695/
https://www.ncbi.nlm.nih.gov/pubmed/33133139
http://dx.doi.org/10.3389/fgene.2020.500064
collection PubMed
description Autism spectrum disorder (ASD) is a complex neurodevelopmental condition with a strong genetic basis. The role of de novo mutations in ASD has been well established, but the set of genes implicated to date is still far from complete. The current study employs a machine learning-based approach to predict ASD risk genes using features from spatiotemporal gene expression patterns in the human brain, gene-level constraint metrics, and other gene variation features. The genes identified through our prediction model were enriched for independent sets of ASD risk genes and tended to be down-expressed in ASD brains, especially in the frontal and parietal cortex. The highest-ranked genes not only included those with strong prior evidence for involvement in ASD (for example, NBEA, HERC1, and TCF20), but also indicated potentially novel candidates, such as MYCBP2 and CAND1, which are involved in protein ubiquitination. We also showed that our method outperformed state-of-the-art scoring systems for ranking curated ASD candidate genes. Gene ontology enrichment analysis of our predicted risk genes revealed biological processes clearly relevant to ASD, including neuronal signaling, neurogenesis, and chromatin remodeling, but also highlighted other potential mechanisms that might underlie ASD, such as regulation of RNA alternative splicing and the ubiquitination pathway related to protein degradation. Our study demonstrates that human brain spatiotemporal gene expression patterns and gene-level constraint metrics can help predict ASD risk genes. Our gene ranking system provides a useful resource for prioritizing ASD candidate genes.
id pubmed-7513695
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7513695 2020-10-30 Front Genet Genetics Frontiers Media S.A. 2020-09-10 /pmc/articles/PMC7513695/ /pubmed/33133139 http://dx.doi.org/10.3389/fgene.2020.500064 Text en Copyright © 2020 Lin, Afshar, Rajadhyaksha, Potash and Han. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
topic Genetics