Classifying Autism From Crowdsourced Semistructured Speech Recordings: Machine Learning Model Comparison Study

Bibliographic Details
Main authors: Chi, Nathan A; Washington, Peter; Kline, Aaron; Husic, Arman; Hou, Cathy; He, Chloe; Dunlap, Kaitlyn; Wall, Dennis P
Format: Online Article Text
Language: English
Published: JMIR Publications 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9052034/
https://www.ncbi.nlm.nih.gov/pubmed/35436234
http://dx.doi.org/10.2196/35406
_version_ 1784696697631277056
author Chi, Nathan A
Washington, Peter
Kline, Aaron
Husic, Arman
Hou, Cathy
He, Chloe
Dunlap, Kaitlyn
Wall, Dennis P
author_sort Chi, Nathan A
collection PubMed
description BACKGROUND: Autism spectrum disorder (ASD) is a neurodevelopmental disorder that results in altered behavior, social development, and communication patterns. In recent years, autism prevalence has tripled, with 1 in 44 children now affected. Given that traditional diagnosis is a lengthy, labor-intensive process that requires the work of trained physicians, significant attention has been given to developing systems that automatically detect autism. We work toward this goal by analyzing audio data, as prosody abnormalities are a signal of autism, with affected children displaying speech idiosyncrasies such as echolalia, monotonous intonation, atypical pitch, and irregular linguistic stress patterns.
OBJECTIVE: We aimed to test the ability for machine learning approaches to aid in detection of autism in self-recorded speech audio captured from children with ASD and neurotypical (NT) children in their home environments.
METHODS: We considered three methods to detect autism in child speech: (1) random forests trained on extracted audio features (including Mel-frequency cepstral coefficients); (2) convolutional neural networks trained on spectrograms; and (3) fine-tuned wav2vec 2.0, a state-of-the-art transformer-based speech recognition model. We trained our classifiers on our novel data set of cellphone-recorded child speech audio curated from the Guess What? mobile game, an app designed to crowdsource videos of children with ASD and NT children in a natural home environment.
RESULTS: The random forest classifier achieved 70% accuracy, the fine-tuned wav2vec 2.0 model achieved 77% accuracy, and the convolutional neural network achieved 79% accuracy when classifying children’s audio as either ASD or NT. We used 5-fold cross-validation to evaluate model performance.
CONCLUSIONS: Our models were able to predict autism status when trained on a varied selection of home audio clips with inconsistent recording qualities, which may be more representative of real-world conditions. The results demonstrate that machine learning methods offer promise in detecting autism automatically from speech without specialized equipment.
format Online
Article
Text
id pubmed-9052034
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-9052034 2022-04-30 Classifying Autism From Crowdsourced Semistructured Speech Recordings: Machine Learning Model Comparison Study Chi, Nathan A Washington, Peter Kline, Aaron Husic, Arman Hou, Cathy He, Chloe Dunlap, Kaitlyn Wall, Dennis P JMIR Pediatr Parent Original Paper
JMIR Publications 2022-04-14 /pmc/articles/PMC9052034/ /pubmed/35436234 http://dx.doi.org/10.2196/35406 Text en
©Nathan A Chi, Peter Washington, Aaron Kline, Arman Husic, Cathy Hou, Chloe He, Kaitlyn Dunlap, Dennis P Wall. Originally published in JMIR Pediatrics and Parenting (https://pediatrics.jmir.org), 14.04.2022.
https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Pediatrics and Parenting, is properly cited. The complete bibliographic information, a link to the original publication on https://pediatrics.jmir.org, as well as this copyright and license information must be included.
title Classifying Autism From Crowdsourced Semistructured Speech Recordings: Machine Learning Model Comparison Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9052034/
https://www.ncbi.nlm.nih.gov/pubmed/35436234
http://dx.doi.org/10.2196/35406