Objectively Quantifying Pediatric Psychiatric Severity Using Artificial Intelligence, Voice Recognition Technology, and Universal Emotions: Pilot Study for Artificial Intelligence-Enabled Innovation to Address Youth Mental Health Crisis


Bibliographic Details
Main Authors: Caulley, Desmond, Alemu, Yared, Burson, Sedara, Cárdenas Bautista, Elizabeth, Abebe Tadesse, Girmaw, Kottmyer, Christopher, Aeschbach, Laurent, Cheungvivatpant, Bryan, Sezgin, Emre
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10628686/
https://www.ncbi.nlm.nih.gov/pubmed/37870890
http://dx.doi.org/10.2196/51912
author Caulley, Desmond
Alemu, Yared
Burson, Sedara
Cárdenas Bautista, Elizabeth
Abebe Tadesse, Girmaw
Kottmyer, Christopher
Aeschbach, Laurent
Cheungvivatpant, Bryan
Sezgin, Emre
author_sort Caulley, Desmond
collection PubMed
description BACKGROUND: Providing psychotherapy, particularly for youth, is a pressing challenge in the health care system. Traditional methods are resource-intensive, and there is a need for objective benchmarks to guide therapeutic interventions. Automated emotion detection from speech, using artificial intelligence, presents an emerging approach to address these challenges. Speech can carry vital information about emotional states, which can be used to improve mental health care services, especially when the person is suffering. OBJECTIVE: This study aims to develop and evaluate automated methods for detecting the intensity of emotions (anger, fear, sadness, and happiness) in audio recordings of patients' speech. We also demonstrate the viability of deploying the models. Our model was validated in a previous publication by Alemu et al with limited voice samples; this follow-up study used significantly more voice samples to validate the previous model. METHODS: We used audio recordings of patients, specifically children with high adverse childhood experience (ACE) scores; the average ACE score was 5 or higher, placing these children at the highest risk for chronic disease and social or emotional problems (in the general population, only 1 in 6 people has a score of 4 or above). Structured voice samples were collected by having patients read a fixed script. In total, 4 highly trained therapists classified audio segments, scoring the intensity level of each of the 4 emotions. We experimented with various preprocessing methods, including denoising, voice-activity detection, and diarization. Additionally, we explored various model architectures, including convolutional neural networks (CNNs) and transformers. We trained emotion-specific transformer-based models and a generalized CNN-based model to predict emotion intensities.
RESULTS: The emotion-specific transformer-based model achieved a test-set precision and recall of 86% and 79%, respectively, for binary emotional intensity classification (high or low). In contrast, the CNN-based model, generalized to predict the intensity of 4 different emotions, achieved a test-set precision and recall of 83% for each. CONCLUSIONS: Automated emotion detection from patients' speech using artificial intelligence models is feasible and achieves a high level of accuracy. The transformer-based model exhibited better performance in emotion-specific detection, while the CNN-based model showed promise in generalized emotion detection. These models can serve as valuable decision-support tools for pediatricians and mental health providers to triage youth to appropriate levels of mental health care services. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR1-10.2196/51912
format Online
Article
Text
id pubmed-10628686
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10628686 2023-11-08 JMIR Res Protoc Protocol JMIR Publications 2023-10-23 /pmc/articles/PMC10628686/ /pubmed/37870890 http://dx.doi.org/10.2196/51912 Text en ©Desmond Caulley, Yared Alemu, Sedara Burson, Elizabeth Cárdenas Bautista, Girmaw Abebe Tadesse, Christopher Kottmyer, Laurent Aeschbach, Bryan Cheungvivatpant, Emre Sezgin. Originally published in JMIR Research Protocols (https://www.researchprotocols.org), 23.10.2023.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on https://www.researchprotocols.org, as well as this copyright and license information must be included.
title Objectively Quantifying Pediatric Psychiatric Severity Using Artificial Intelligence, Voice Recognition Technology, and Universal Emotions: Pilot Study for Artificial Intelligence-Enabled Innovation to Address Youth Mental Health Crisis
topic Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10628686/
https://www.ncbi.nlm.nih.gov/pubmed/37870890
http://dx.doi.org/10.2196/51912