
A Multimodal Approach to Improve Performance Evaluation of Call Center Agent

The paper proposes three modeling techniques to improve the performance evaluation of call center agents. The first technique is speech processing, supported by an attention layer, applied to the agent’s recorded calls. The speech comprises 65 features for the ultimate determination of the context of the...


Bibliographic Details
Main authors: Ahmed, Abdelrahman, Shaalan, Khaled, Toral, Sergio, Hifny, Yasser
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8069216/
https://www.ncbi.nlm.nih.gov/pubmed/33921549
http://dx.doi.org/10.3390/s21082720
author Ahmed, Abdelrahman
Shaalan, Khaled
Toral, Sergio
Hifny, Yasser
collection PubMed
description The paper proposes three modeling techniques to improve the performance evaluation of call center agents. The first technique is speech processing, supported by an attention layer, applied to the agent’s recorded calls. The speech is represented by 65 features, obtained using the Open-Smile toolkit, for the ultimate determination of the context of the call. The second technique uses the Max Weights Similarity (MWS) approach instead of the Softmax function in the attention layer to improve classification accuracy. The MWS function replaces the Softmax function to fine-tune the output of the attention layer when processing text. It is formed by determining the similarity in distance between the input weights of the attention layer and the weights of the max vectors. The third technique combines the agent’s recorded call speech with the corresponding transcribed text for binary classification. Both the speech model and the text model are based on combinations of Convolutional Neural Networks (CNNs) and Bidirectional Long Short-Term Memory networks (BiLSTMs). The classification results for each single-modality model (text and speech) are presented and compared with the results of the multimodal approach. The multimodal classification provided an improvement of 0.22% compared with the acoustic model and 1.7% compared with the text model.
format Online
Article
Text
id pubmed-8069216
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8069216 2021-04-26 A Multimodal Approach to Improve Performance Evaluation of Call Center Agent Ahmed, Abdelrahman Shaalan, Khaled Toral, Sergio Hifny, Yasser Sensors (Basel) Communication MDPI 2021-04-12 /pmc/articles/PMC8069216/ /pubmed/33921549 http://dx.doi.org/10.3390/s21082720 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Multimodal Approach to Improve Performance Evaluation of Call Center Agent
topic Communication
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8069216/
https://www.ncbi.nlm.nih.gov/pubmed/33921549
http://dx.doi.org/10.3390/s21082720