Depression Speech Recognition With a Three-Dimensional Convolutional Network
Depression has become one of the main afflictions that threaten people's mental health. However, the current traditional diagnosis methods have certain limitations, so it is necessary to find a method of objective evaluation of depression based on intelligent technology to assist in the early d...
Main authors: | Wang, Hongbo; Liu, Yu; Zhen, Xiaoxiao; Tu, Xuyan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Human Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8514878/ https://www.ncbi.nlm.nih.gov/pubmed/34658815 http://dx.doi.org/10.3389/fnhum.2021.713823 |
_version_ | 1784583492243292160 |
---|---|
author | Wang, Hongbo; Liu, Yu; Zhen, Xiaoxiao; Tu, Xuyan |
author_facet | Wang, Hongbo; Liu, Yu; Zhen, Xiaoxiao; Tu, Xuyan |
author_sort | Wang, Hongbo |
collection | PubMed |
description | Depression has become one of the main afflictions threatening people's mental health. Traditional diagnostic methods have limitations, so an objective, technology-based method of evaluating depression is needed to assist in the early diagnosis and treatment of patients. Because the abnormal speech features of patients with depression are related, to some extent, to their mental state, acoustic speech features are valuable as objective indicators for diagnosing depression. To address the complexity of depressed speech and the limited performance of traditional feature extraction methods for speech signals, this article proposes a Three-Dimensional Convolutional filter bank with Highway Networks and a Bidirectional GRU (Gated Recurrent Unit) with an Attention mechanism (3D-CBHGA for short), which relies on two key strategies. (1) Three-dimensional feature extraction from the speech signal captures the expressive characteristics of depression signals in a timely manner. (2) An attention mechanism in the GRU network weights the frame-level vectors to produce a hidden emotion vector through self-learning. Experiments show that the proposed 3D-CBHGA establishes a good mapping from speech signals to depression-related features and improves the accuracy of depression detection from speech signals. |
format | Online Article Text |
id | pubmed-8514878 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8514878 2021-10-15 Depression Speech Recognition With a Three-Dimensional Convolutional Network. Wang, Hongbo; Liu, Yu; Zhen, Xiaoxiao; Tu, Xuyan. Front Hum Neurosci. Human Neuroscience. Frontiers Media S.A. 2021-09-30 /pmc/articles/PMC8514878/ /pubmed/34658815 http://dx.doi.org/10.3389/fnhum.2021.713823 Text en. Copyright © 2021 Wang, Liu, Zhen and Tu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Human Neuroscience; Wang, Hongbo; Liu, Yu; Zhen, Xiaoxiao; Tu, Xuyan; Depression Speech Recognition With a Three-Dimensional Convolutional Network |
title | Depression Speech Recognition With a Three-Dimensional Convolutional Network |
title_sort | depression speech recognition with a three-dimensional convolutional network |
topic | Human Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8514878/ https://www.ncbi.nlm.nih.gov/pubmed/34658815 http://dx.doi.org/10.3389/fnhum.2021.713823 |
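
The description above says that an attention mechanism weights the frame-level vectors to obtain a single hidden vector. As a rough illustration only, and not the authors' implementation, this kind of attention pooling might be sketched in NumPy as follows. The function name `attention_pool`, the scoring vector `w`, and the array sizes are all hypothetical, and random numbers stand in for the recurrent network's per-frame outputs.

```python
import numpy as np

def attention_pool(frames, w):
    """Collapse frame-level vectors of shape (T, D) into one vector of shape (D,).

    `w` is a hypothetical scoring vector of shape (D,); in a trained model it
    would be learned together with the rest of the network.
    """
    scores = frames @ w                    # (T,) one relevance score per frame
    scores = scores - scores.max()         # shift for a numerically stable softmax
    weights = np.exp(scores) / np.exp(scores).sum()  # (T,) non-negative, sums to 1
    pooled = weights @ frames              # (D,) weighted sum of the frame vectors
    return pooled, weights

# Stand-in data: 50 frames of 8-dimensional features (assumed sizes).
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 8))
w = rng.normal(size=8)

pooled, weights = attention_pool(frames, w)
```

With an all-zero scoring vector every frame receives the same weight, so the pooled vector reduces to the plain mean over frames; a learned `w` lets the model emphasize the frames that carry more of the relevant signal.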