
Machine learning based analysis for intellectual disability in Down syndrome

Down syndrome (DS), or trisomy 21, is the most common genetic cause of intellectual disability (ID), but a pathogenic mechanism has not yet been identified. When studying a complex, non-monogenic condition such as DS, a clear correlation between cause and effect may be difficult to find through classi...

Bibliographic Details
Main authors: Baldo, Federico; Piovesan, Allison; Rakvin, Marijana; Ramacieri, Giuseppe; Locatelli, Chiara; Lanfranchi, Silvia; Onnivello, Sara; Pulina, Francesca; Caracausi, Maria; Antonaros, Francesca; Lombardi, Michele; Pelleri, Maria Chiara
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10558609/
https://www.ncbi.nlm.nih.gov/pubmed/37810082
http://dx.doi.org/10.1016/j.heliyon.2023.e19444
author Baldo, Federico
Piovesan, Allison
Rakvin, Marijana
Ramacieri, Giuseppe
Locatelli, Chiara
Lanfranchi, Silvia
Onnivello, Sara
Pulina, Francesca
Caracausi, Maria
Antonaros, Francesca
Lombardi, Michele
Pelleri, Maria Chiara
collection PubMed
description Down syndrome (DS), or trisomy 21, is the most common genetic cause of intellectual disability (ID), but a pathogenic mechanism has not yet been identified. When studying a complex, non-monogenic condition such as DS, a clear correlation between cause and effect may be difficult to find through classical analysis methods, so different approaches are needed. The increased availability of big data has made it possible to use artificial intelligence (AI), and in particular machine learning (ML), in the medical field. The purpose of this work is to apply ML techniques to analyze clinical records obtained from subjects with DS and to study their association with ID. We applied two tree-based ML models (random forest and gradient boosting machine) to the research question of how to identify key features likely associated with ID in DS. We analyzed 109 features (or variables) in 106 DS subjects. The outcome of the analysis was the age equivalent (AE) score as an indicator of intellectual functioning, which is impaired in ID. We applied several methods to configure the models: feature selection through the Boruta framework to minimize random correlation; data augmentation to overcome the issue of a small dataset; and age effect mitigation to take into account the chronological age of the subjects. The results show that ML algorithms can be applied with good accuracy to identify variables likely involved in cognitive impairment in DS. In particular, we show that random forest and gradient boosting machine produce results with low error (MSE < 0.12) and an acceptable R² (0.70 and 0.93). Interestingly, the ranking of the variables points to several features of interest related to hearing, gastrointestinal alterations, thyroid state, immune system and vitamin B12 that can be considered with particular attention for improving care pathways for people with DS.
In conclusion, ML-based models may assist researchers in identifying key features likely correlated with ID in DS and, ultimately, may improve research efforts focused on the identification of possible therapeutic targets and new care pathways. We believe this study can be the basis for further testing and validation of our algorithms with multiple and larger datasets.
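The abstract reports model quality as mean squared error (MSE) and the coefficient of determination (R²). As a reading aid only, the following minimal Python sketch shows how these two metrics are conventionally computed for a regression model. It is not the authors' code; the function names and the sample values are hypothetical and are not taken from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Return the average of the squared residuals."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared) / len(squared)


def r_squared(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observed values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted scores (illustration only, not study data).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(f"MSE = {mse:.3f}, R^2 = {r2:.3f}")
```

An MSE close to zero and an R² close to 1 both indicate predictions that track the observed values closely. MSE is expressed in the squared units of the outcome, so it is only comparable across models fitted to the same target scale.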
format Online
Article
Text
id pubmed-10558609
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10558609 2023-10-08 Heliyon Research Article Elsevier 2023-08-27 /pmc/articles/PMC10558609/ /pubmed/37810082 http://dx.doi.org/10.1016/j.heliyon.2023.e19444 Text en © 2023 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Machine learning based analysis for intellectual disability in Down syndrome
topic Research Article