
Identification of biological mechanisms underlying a multidimensional ASD phenotype using machine learning

The complex genetic architecture of Autism Spectrum Disorder (ASD) and its heterogeneous phenotype make molecular diagnosis and patient prognosis challenging tasks. To establish more precise genotype–phenotype correlations in ASD, we developed a novel machine-learning integrative approach, which se...

Bibliographic Details
Main Authors: Asif, Muhammad, Martiniano, Hugo F. M. C., Marques, Ana Rita, Santos, João Xavier, Vilela, Joana, Rasga, Celia, Oliveira, Guiomar, Couto, Francisco M., Vicente, Astrid M.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7026098/
https://www.ncbi.nlm.nih.gov/pubmed/32066720
http://dx.doi.org/10.1038/s41398-020-0721-1
_version_ 1783498619436924928
author Asif, Muhammad
Martiniano, Hugo F. M. C.
Marques, Ana Rita
Santos, João Xavier
Vilela, Joana
Rasga, Celia
Oliveira, Guiomar
Couto, Francisco M.
Vicente, Astrid M.
collection PubMed
description The complex genetic architecture of Autism Spectrum Disorder (ASD) and its heterogeneous phenotype make molecular diagnosis and patient prognosis challenging tasks. To establish more precise genotype–phenotype correlations in ASD, we developed a novel machine-learning integrative approach, which seeks to delineate associations between patients’ clinical profiles and disrupted biological processes, inferred from their copy number variants (CNVs) that span brain genes. Clustering analysis of the relevant clinical measures from 2446 ASD cases in the Autism Genome Project identified two distinct phenotypic subgroups. Patients in these clusters differed significantly in ADOS-defined severity, adaptive behavior profiles, intellectual ability, and verbal status, the latter contributing the most to cluster stability and cohesion. Functional enrichment analysis of brain genes disrupted by CNVs in these ASD cases identified 15 statistically significant biological processes, including cell adhesion, neural development, cognition, and polyubiquitination, in line with previous ASD findings. A Naive Bayes classifier, generated to predict the ASD phenotypic clusters from disrupted biological processes, achieved high precision (0.82) but low recall (0.39) for a subset of patients with higher biological Information Content scores. This study shows that milder and more severe clinical presentations can have distinct underlying biological mechanisms. It further highlights how machine-learning approaches can reduce clinical heterogeneity by using multidimensional clinical measures, and establishes genotype–phenotype correlations in ASD. However, predictions are strongly dependent on each patient’s information content. The findings are therefore a first step toward the translation of genetic information into clinically useful applications, and emphasize the need for larger datasets with very complete clinical and biological information.
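The description above reports a classifier with high precision but low recall. As a minimal sketch of what that pattern means, the Python snippet below computes both metrics from predicted and true cluster labels. The labels are hypothetical and chosen only for illustration; they are not data from the study, and the function name `precision_recall` is an assumption, not part of the authors' pipeline.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels (NOT from the study): 20 true positives, 10 true negatives.
y_true = [1] * 20 + [0] * 10
# The classifier flags only 8 of the 20 positives, plus 2 false alarms.
y_pred = [1] * 8 + [0] * 12 + [1] * 2 + [0] * 8

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these illustrative labels, most positive predictions are correct, but most true positives are missed. That is the same qualitative trade-off as the reported 0.82 precision and 0.39 recall.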
format Online
Article
Text
id pubmed-7026098
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7026098 2020-03-03
Identification of biological mechanisms underlying a multidimensional ASD phenotype using machine learning
Asif, Muhammad; Martiniano, Hugo F. M. C.; Marques, Ana Rita; Santos, João Xavier; Vilela, Joana; Rasga, Celia; Oliveira, Guiomar; Couto, Francisco M.; Vicente, Astrid M.
Transl Psychiatry
Article
Nature Publishing Group UK 2020-01-28
/pmc/articles/PMC7026098/ /pubmed/32066720 http://dx.doi.org/10.1038/s41398-020-0721-1
Text en
© The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Identification of biological mechanisms underlying a multidimensional ASD phenotype using machine learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7026098/
https://www.ncbi.nlm.nih.gov/pubmed/32066720
http://dx.doi.org/10.1038/s41398-020-0721-1