
Identification of newborns at risk for autism using electronic medical records and machine learning

BACKGROUND. Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and in...


Bibliographic Details
Main Authors: Rahman, Rayees; Kodesh, Arad; Levine, Stephen Z.; Sandin, Sven; Reichenberg, Abraham; Schlessinger, Avner
Format: Online Article Text
Language: English
Published: Cambridge University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7315872/
https://www.ncbi.nlm.nih.gov/pubmed/32100657
http://dx.doi.org/10.1192/j.eurpsy.2020.17
_version_ 1783550333708926976
author Rahman, Rayees
Kodesh, Arad
Levine, Stephen Z.
Sandin, Sven
Reichenberg, Abraham
Schlessinger, Avner
author_facet Rahman, Rayees
Kodesh, Arad
Levine, Stephen Z.
Sandin, Sven
Reichenberg, Abraham
Schlessinger, Avner
author_sort Rahman, Rayees
collection PubMed
description BACKGROUND. Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improves developmental course and outcome. The aim of the current study was to test the ability of machine learning (ML) models applied to electronic medical records (EMRs) to predict ASD early in life, in a general population sample. METHODS. We used EMR data from a single Israeli Health Maintenance Organization, including EMR information for parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1st, 1997 and December 31st, 2008. Routinely available parental sociodemographic information, parental medical histories, and prescribed medications data were used to generate features to train various ML algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross-validation by computing the area under the receiver operating characteristic curve (AUC; C-statistic), sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value [PPV]). RESULTS. All ML models tested had similar performance. The average performance across all models had C-statistic of 0.709, sensitivity of 29.93%, specificity of 98.18%, accuracy of 95.62%, false positive rate of 1.81%, and PPV of 43.35% for predicting ASD in this dataset. CONCLUSIONS. We conclude that ML algorithms combined with EMR capture early life ASD risk as well as reveal previously unknown features to be associated with ASD-risk. Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children.
format Online
Article
Text
id pubmed-7315872
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Cambridge University Press
record_format MEDLINE/PubMed
spelling pubmed-7315872 2020-07-07 Identification of newborns at risk for autism using electronic medical records and machine learning Rahman, Rayees Kodesh, Arad Levine, Stephen Z. Sandin, Sven Reichenberg, Abraham Schlessinger, Avner Eur Psychiatry Research Article BACKGROUND. Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improves developmental course and outcome. The aim of the current study was to test the ability of machine learning (ML) models applied to electronic medical records (EMRs) to predict ASD early in life, in a general population sample. METHODS. We used EMR data from a single Israeli Health Maintenance Organization, including EMR information for parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1st, 1997 and December 31st, 2008. Routinely available parental sociodemographic information, parental medical histories, and prescribed medications data were used to generate features to train various ML algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross-validation by computing the area under the receiver operating characteristic curve (AUC; C-statistic), sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value [PPV]). RESULTS. All ML models tested had similar performance. The average performance across all models had C-statistic of 0.709, sensitivity of 29.93%, specificity of 98.18%, accuracy of 95.62%, false positive rate of 1.81%, and PPV of 43.35% for predicting ASD in this dataset. CONCLUSIONS. We conclude that ML algorithms combined with EMR capture early life ASD risk as well as reveal previously unknown features to be associated with ASD-risk. Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children. Cambridge University Press 2020-02-26 /pmc/articles/PMC7315872/ /pubmed/32100657 http://dx.doi.org/10.1192/j.eurpsy.2020.17 Text en © The Author(s) 2020 http://creativecommons.org/licenses/by/4.0/ http://creativecommons.org/licenses/by/4.0/ This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Rahman, Rayees
Kodesh, Arad
Levine, Stephen Z.
Sandin, Sven
Reichenberg, Abraham
Schlessinger, Avner
Identification of newborns at risk for autism using electronic medical records and machine learning
title Identification of newborns at risk for autism using electronic medical records and machine learning
title_full Identification of newborns at risk for autism using electronic medical records and machine learning
title_fullStr Identification of newborns at risk for autism using electronic medical records and machine learning
title_full_unstemmed Identification of newborns at risk for autism using electronic medical records and machine learning
title_short Identification of newborns at risk for autism using electronic medical records and machine learning
title_sort identification of newborns at risk for autism using electronic medical records and machine learning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7315872/
https://www.ncbi.nlm.nih.gov/pubmed/32100657
http://dx.doi.org/10.1192/j.eurpsy.2020.17
work_keys_str_mv AT rahmanrayees identificationofnewbornsatriskforautismusingelectronicmedicalrecordsandmachinelearning
AT kodesharad identificationofnewbornsatriskforautismusingelectronicmedicalrecordsandmachinelearning
AT levinestephenz identificationofnewbornsatriskforautismusingelectronicmedicalrecordsandmachinelearning
AT sandinsven identificationofnewbornsatriskforautismusingelectronicmedicalrecordsandmachinelearning
AT reichenbergabraham identificationofnewbornsatriskforautismusingelectronicmedicalrecordsandmachinelearning
AT schlessingeravner identificationofnewbornsatriskforautismusingelectronicmedicalrecordsandmachinelearning
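
Note on the evaluation metrics named in the description field: the abstract reports the C-statistic (AUC), sensitivity, specificity, accuracy, false positive rate, and PPV, but the record does not show how each is defined. The sketch below illustrates the standard definitions only. It is not the authors' code, the names `ConfusionCounts`, `binary_metrics`, and `c_statistic` are hypothetical, and the counts and scores are made-up illustration values, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing binary predictions with true labels."""
    tp: int  # cases correctly flagged as high risk
    fp: int  # non-cases incorrectly flagged as high risk
    tn: int  # non-cases correctly left unflagged
    fn: int  # cases the model missed


def binary_metrics(c: ConfusionCounts) -> dict:
    """Standard threshold-based metrics; assumes every denominator is nonzero."""
    positives = c.tp + c.fn   # all true cases
    negatives = c.tn + c.fp   # all true non-cases
    flagged = c.tp + c.fp     # everyone the model flagged
    return {
        "sensitivity": c.tp / positives,
        "specificity": c.tn / negatives,
        "accuracy": (c.tp + c.tn) / (positives + negatives),
        "false_positive_rate": c.fp / negatives,  # equals 1 - specificity
        "ppv": c.tp / flagged,                    # precision
    }


def c_statistic(scores, labels):
    """AUC as the share of (case, non-case) pairs in which the case
    receives the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical illustration values, NOT taken from the study.
metrics = binary_metrics(ConfusionCounts(tp=30, fp=20, tn=980, fn=70))
auc = c_statistic(scores=[0.9, 0.4, 0.35, 0.8, 0.1], labels=[1, 0, 1, 0, 0])

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
print(f"c_statistic: {auc:.4f}")
```

PPV depends on how common cases are in the evaluated sample, whereas sensitivity and specificity do not, so a PPV figure is best read together with the composition of the sample it was computed on.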