
Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network

Every human being attaches emotions to the things that matter to them, and a customer's emotional state can help a customer representative understand their requirements, so speech emotion recognition plays an important role in human interaction. Intelligent systems can improve this task; to that end, we design a convolutional neural network (CNN) based model that classifies emotions into broad categories such as positive and negative, or into more specific classes. In this paper, we use audio recordings from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Log-Mel spectrograms and Mel-frequency cepstral coefficients (MFCCs) were extracted as features from the raw audio files. These features were used to classify emotions with techniques such as Long Short-Term Memory (LSTM) networks, CNNs, Hidden Markov Models (HMMs), and Deep Neural Networks (DNNs). We divide the emotions into three classification tasks, for both male and female speakers: in the first, two classes (positive and negative); in the second, three classes (positive, negative, and neutral); and in the third, eight classes (happy, sad, angry, fearful, surprised, disgusted, calm, and neutral). For all three tasks we propose a model consisting of eight consecutive 2D convolutional layers. The proposed model outperforms previously reported models, so the emotion of a consumer can now be identified more accurately.
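
The abstract above describes extracting log-Mel spectrograms and MFCCs from the RAVDESS recordings as inputs to the classifier. As a rough, hedged illustration (not code from the article), the sketch below shows how such features could be computed with the librosa library; the file path, sampling rate, and parameter values (n_mels, n_mfcc) are illustrative assumptions rather than the authors' settings.

# Illustrative sketch (not from the article): computing the two feature types
# named in the abstract -- a log-Mel spectrogram and MFCCs -- for one
# RAVDESS-style WAV file using librosa. Parameter values are assumptions.
import librosa
import numpy as np

def extract_features(path, sr=22050, n_mels=128, n_mfcc=40):
    # Load the raw audio file (mono) at the target sampling rate.
    y, sr = librosa.load(path, sr=sr)

    # Mel-scaled power spectrogram, converted to decibels (log-Mel spectrogram).
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel, ref=np.max)

    # Mel-frequency cepstral coefficients.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

    # Both are 2D arrays (features x time frames), suitable as input
    # "images" for a 2D convolutional network.
    return log_mel, mfcc

if __name__ == "__main__":
    # Hypothetical RAVDESS file path, used only for illustration.
    log_mel, mfcc = extract_features("Actor_01/03-01-01-01-01-01-01.wav")
    print(log_mel.shape, mfcc.shape)
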


Bibliographic Details
Main Authors: Puri, Tanvi, Soni, Mukesh, Dhiman, Gaurav, Ibrahim Khalaf, Osamah, alazzam, Malik, Raza Khan, Ihtiram
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8898841/
https://www.ncbi.nlm.nih.gov/pubmed/35265307
http://dx.doi.org/10.1155/2022/8472947
_version_ 1784663758164983808
author Puri, Tanvi
Soni, Mukesh
Dhiman, Gaurav
Ibrahim Khalaf, Osamah
alazzam, Malik
Raza Khan, Ihtiram
author_facet Puri, Tanvi
Soni, Mukesh
Dhiman, Gaurav
Ibrahim Khalaf, Osamah
alazzam, Malik
Raza Khan, Ihtiram
author_sort Puri, Tanvi
collection PubMed
description Every human being attaches emotions to the things that matter to them, and a customer's emotional state can help a customer representative understand their requirements, so speech emotion recognition plays an important role in human interaction. Intelligent systems can improve this task; to that end, we design a convolutional neural network (CNN) based model that classifies emotions into broad categories such as positive and negative, or into more specific classes. In this paper, we use audio recordings from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Log-Mel spectrograms and Mel-frequency cepstral coefficients (MFCCs) were extracted as features from the raw audio files. These features were used to classify emotions with techniques such as Long Short-Term Memory (LSTM) networks, CNNs, Hidden Markov Models (HMMs), and Deep Neural Networks (DNNs). We divide the emotions into three classification tasks, for both male and female speakers: in the first, two classes (positive and negative); in the second, three classes (positive, negative, and neutral); and in the third, eight classes (happy, sad, angry, fearful, surprised, disgusted, calm, and neutral). For all three tasks we propose a model consisting of eight consecutive 2D convolutional layers. The proposed model outperforms previously reported models, so the emotion of a consumer can now be identified more accurately.
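
The description above mentions a model built from eight consecutive 2D convolutional layers, trained for the 2-, 3-, and 8-class settings. The following is a hedged PyTorch sketch of what such a stack could look like, not the authors' exact architecture: the channel widths, kernel sizes, pooling schedule, and classifier head are assumptions chosen only to illustrate the eight-convolution design.

# Hedged sketch (not the authors' exact model): an eight-layer 2D CNN that
# could classify log-Mel/MFCC "images" into 2, 3, or 8 emotion classes.
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    def __init__(self, num_classes=8, in_channels=1):
        super().__init__()
        layers = []
        channels = [in_channels, 16, 16, 32, 32, 64, 64, 128, 128]  # assumed widths
        for i in range(8):  # eight consecutive Conv2d layers
            layers += [
                nn.Conv2d(channels[i], channels[i + 1], kernel_size=3, padding=1),
                nn.BatchNorm2d(channels[i + 1]),
                nn.ReLU(inplace=True),
            ]
            if i % 2 == 1:  # downsample after every second layer (assumption)
                layers.append(nn.MaxPool2d(2))
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)          # collapse frequency/time axes
        self.classifier = nn.Linear(channels[-1], num_classes)

    def forward(self, x):  # x: (batch, 1, n_mels, time_frames)
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.classifier(x)

# Example: a batch of 128x256 log-Mel patches under the 8-class setting.
model = EmotionCNN(num_classes=8)
logits = model(torch.randn(4, 1, 128, 256))
print(logits.shape)  # torch.Size([4, 8])
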
format Online
Article
Text
id pubmed-8898841
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8898841 2022-03-08 Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network Puri, Tanvi Soni, Mukesh Dhiman, Gaurav Ibrahim Khalaf, Osamah alazzam, Malik Raza Khan, Ihtiram J Healthc Eng Research Article Every human being attaches emotions to the things that matter to them, and a customer's emotional state can help a customer representative understand their requirements, so speech emotion recognition plays an important role in human interaction. Intelligent systems can improve this task; to that end, we design a convolutional neural network (CNN) based model that classifies emotions into broad categories such as positive and negative, or into more specific classes. In this paper, we use audio recordings from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Log-Mel spectrograms and Mel-frequency cepstral coefficients (MFCCs) were extracted as features from the raw audio files. These features were used to classify emotions with techniques such as Long Short-Term Memory (LSTM) networks, CNNs, Hidden Markov Models (HMMs), and Deep Neural Networks (DNNs). We divide the emotions into three classification tasks, for both male and female speakers: in the first, two classes (positive and negative); in the second, three classes (positive, negative, and neutral); and in the third, eight classes (happy, sad, angry, fearful, surprised, disgusted, calm, and neutral). For all three tasks we propose a model consisting of eight consecutive 2D convolutional layers. The proposed model outperforms previously reported models, so the emotion of a consumer can now be identified more accurately. Hindawi 2022-02-27 /pmc/articles/PMC8898841/ /pubmed/35265307 http://dx.doi.org/10.1155/2022/8472947 Text en Copyright © 2022 Tanvi Puri et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Puri, Tanvi
Soni, Mukesh
Dhiman, Gaurav
Ibrahim Khalaf, Osamah
alazzam, Malik
Raza Khan, Ihtiram
Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network
title Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network
title_full Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network
title_fullStr Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network
title_full_unstemmed Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network
title_short Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network
title_sort detection of emotion of speech for ravdess audio using hybrid convolution neural network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8898841/
https://www.ncbi.nlm.nih.gov/pubmed/35265307
http://dx.doi.org/10.1155/2022/8472947
work_keys_str_mv AT puritanvi detectionofemotionofspeechforravdessaudiousinghybridconvolutionneuralnetwork
AT sonimukesh detectionofemotionofspeechforravdessaudiousinghybridconvolutionneuralnetwork
AT dhimangaurav detectionofemotionofspeechforravdessaudiousinghybridconvolutionneuralnetwork
AT ibrahimkhalafosamah detectionofemotionofspeechforravdessaudiousinghybridconvolutionneuralnetwork
AT alazzammalik detectionofemotionofspeechforravdessaudiousinghybridconvolutionneuralnetwork
AT razakhanihtiram detectionofemotionofspeechforravdessaudiousinghybridconvolutionneuralnetwork