Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN
Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve...
Main Authors: | Zhu, Lianzhang; Chen, Leiming; Zhao, Dehai; Zhou, Jiehan; Zhang, Weishan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5539696/ https://www.ncbi.nlm.nih.gov/pubmed/28737705 http://dx.doi.org/10.3390/s17071694 |
_version_ | 1783254530941517824 |
---|---|
author | Zhu, Lianzhang Chen, Leiming Zhao, Dehai Zhou, Jiehan Zhang, Weishan |
author_sort | Zhu, Lianzhang |
collection | PubMed |
description | Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, including speech signal feature extraction and emotion classification methods. Five types of features are extracted from a speech sample: mel frequency cepstrum coefficient (MFCC), pitch, formant, short-term zero-crossing rate and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the best features to identify the emotion status for speech. We propose a novel classification method that combines DBN and SVM (support vector machine) instead of using only one of them. In addition, a conjugate gradient method is applied to train DBN in order to speed up the training process. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features can reflect emotion status better than artificial features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM separately. Results also show that DBN can work very well for small training databases if it is properly designed. |
format | Online Article Text |
id | pubmed-5539696 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-5539696 2017-08-11 Sensors (Basel) Article MDPI 2017-07-24 /pmc/articles/PMC5539696/ /pubmed/28737705 http://dx.doi.org/10.3390/s17071694 Text en © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN |
topic | Article |
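
Two of the five feature types named in the description, short-term energy and short-term zero-crossing rate, are simple enough to illustrate directly. The following is a minimal sketch only, not the authors' implementation: the function names, the frame length of 400 samples, and the hop of 160 samples are assumptions chosen for illustration, and the record does not state which framing parameters the paper uses.

```python
import math


def frames(signal, frame_len, hop):
    """Split a sample sequence into overlapping frames.

    A trailing partial frame is dropped (an illustrative choice).
    """
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def short_term_energy(frame):
    """Sum of squared sample values within one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_features(signal, frame_len=400, hop=160):
    """Return one (energy, zero-crossing rate) pair per frame.

    The defaults are hypothetical values (25 ms frames with a 10 ms hop
    at a 16 kHz sampling rate), not settings taken from the paper.
    """
    return [
        (short_term_energy(f), zero_crossing_rate(f))
        for f in frames(signal, frame_len, hop)
    ]


if __name__ == "__main__":
    # Synthetic 200 Hz tone sampled at 16 kHz, used only as stand-in input.
    rate = 16000
    tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate // 10)]
    for energy, zcr in frame_features(tone)[:3]:
        print(f"energy={energy:.2f} zcr={zcr:.4f}")
```

Per-frame values like these would typically be summarized (for example by mean and variance) or passed to a learned model before classification; how the paper combines them with the DBN and SVM is not specified in this record.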