A Multimodal Approach to Improve Performance Evaluation of Call Center Agent
The paper proposes three modeling techniques to improve the performance evaluation of the call center agent. The first technique is speech processing supported by an attention layer for the agent’s recorded calls. The speech comprises 65 features for the ultimate determination of the context of the...
Main Authors: | Ahmed, Abdelrahman; Shaalan, Khaled; Toral, Sergio; Hifny, Yasser |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8069216/ https://www.ncbi.nlm.nih.gov/pubmed/33921549 http://dx.doi.org/10.3390/s21082720 |
_version_ | 1783683185114087424 |
---|---|
author | Ahmed, Abdelrahman Shaalan, Khaled Toral, Sergio Hifny, Yasser |
author_facet | Ahmed, Abdelrahman Shaalan, Khaled Toral, Sergio Hifny, Yasser |
author_sort | Ahmed, Abdelrahman |
collection | PubMed |
description | The paper proposes three modeling techniques to improve the performance evaluation of the call center agent. The first technique is speech processing supported by an attention layer for the agent’s recorded calls. Each call's speech is represented by 65 features, extracted with the Open-Smile toolkit, which are used to determine the context of the call. The second technique uses the Max Weights Similarity (MWS) approach instead of the Softmax function in the attention layer to improve classification accuracy. The MWS function replaces the Softmax function to fine-tune the output of the attention layer when processing text. It is formed by measuring the similarity, in terms of distance, between the input weights of the attention layer and the weights of the max vectors. The third technique combines the agent’s recorded call speech with the corresponding transcribed text for binary classification. The speech and text models are based on combinations of Convolutional Neural Networks (CNNs) and Bidirectional Long Short-Term Memory networks (BiLSTMs). The classification results for each single-modality model (text and speech) are presented and compared with the results of the multimodal approach. The multimodal classification improved on the acoustic model by 0.22% and on the text model by 1.7%. |
format | Online Article Text |
id | pubmed-8069216 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8069216 2021-04-26 A Multimodal Approach to Improve Performance Evaluation of Call Center Agent Ahmed, Abdelrahman Shaalan, Khaled Toral, Sergio Hifny, Yasser Sensors (Basel) Communication The paper proposes three modeling techniques to improve the performance evaluation of the call center agent. The first technique is speech processing supported by an attention layer for the agent’s recorded calls. Each call's speech is represented by 65 features, extracted with the Open-Smile toolkit, which are used to determine the context of the call. The second technique uses the Max Weights Similarity (MWS) approach instead of the Softmax function in the attention layer to improve classification accuracy. The MWS function replaces the Softmax function to fine-tune the output of the attention layer when processing text. It is formed by measuring the similarity, in terms of distance, between the input weights of the attention layer and the weights of the max vectors. The third technique combines the agent’s recorded call speech with the corresponding transcribed text for binary classification. The speech and text models are based on combinations of Convolutional Neural Networks (CNNs) and Bidirectional Long Short-Term Memory networks (BiLSTMs). The classification results for each single-modality model (text and speech) are presented and compared with the results of the multimodal approach. The multimodal classification improved on the acoustic model by 0.22% and on the text model by 1.7%. MDPI 2021-04-12 /pmc/articles/PMC8069216/ /pubmed/33921549 http://dx.doi.org/10.3390/s21082720 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Communication Ahmed, Abdelrahman Shaalan, Khaled Toral, Sergio Hifny, Yasser A Multimodal Approach to Improve Performance Evaluation of Call Center Agent |
title | A Multimodal Approach to Improve Performance Evaluation of Call Center Agent |
title_full | A Multimodal Approach to Improve Performance Evaluation of Call Center Agent |
title_fullStr | A Multimodal Approach to Improve Performance Evaluation of Call Center Agent |
title_full_unstemmed | A Multimodal Approach to Improve Performance Evaluation of Call Center Agent |
title_short | A Multimodal Approach to Improve Performance Evaluation of Call Center Agent |
title_sort | multimodal approach to improve performance evaluation of call center agent |
topic | Communication |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8069216/ https://www.ncbi.nlm.nih.gov/pubmed/33921549 http://dx.doi.org/10.3390/s21082720 |
work_keys_str_mv | AT ahmedabdelrahman amultimodalapproachtoimproveperformanceevaluationofcallcenteragent AT shaalankhaled amultimodalapproachtoimproveperformanceevaluationofcallcenteragent AT toralsergio amultimodalapproachtoimproveperformanceevaluationofcallcenteragent AT hifnyyasser amultimodalapproachtoimproveperformanceevaluationofcallcenteragent AT ahmedabdelrahman multimodalapproachtoimproveperformanceevaluationofcallcenteragent AT shaalankhaled multimodalapproachtoimproveperformanceevaluationofcallcenteragent AT toralsergio multimodalapproachtoimproveperformanceevaluationofcallcenteragent AT hifnyyasser multimodalapproachtoimproveperformanceevaluationofcallcenteragent |
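
The description field mentions an attention layer that uses Softmax and a multimodal step that combines speech with transcribed text for binary classification. The sketch below illustrates only those two generic ideas: standard softmax attention pooling, and fusion by concatenation. It is not the authors' implementation. The CNN and BiLSTM encoders are omitted, the MWS function is not reproduced because the record does not define it precisely, and every name, dimension (apart from the 65 acoustic features), and random value is a hypothetical placeholder.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Collapse a sequence of feature vectors into a single vector.

    Each frame is scored by its dot product with `query` (a hypothetical
    learned vector); softmax turns the scores into weights that sum to 1.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights


def fuse_and_classify(speech_vec, text_vec, coeffs, bias):
    """Concatenate both modality vectors and apply one logistic unit."""
    joint = speech_vec + text_vec  # list concatenation, not addition
    logit = sum(c * x for c, x in zip(coeffs, joint)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
N_ACOUSTIC = 65  # per-frame acoustic feature count quoted in the description
N_TEXT = 8       # hypothetical text embedding size

# Random stand-ins for encoder outputs (assumption: no real data here).
speech_frames = [[rng.gauss(0, 1) for _ in range(N_ACOUSTIC)] for _ in range(20)]
text_tokens = [[rng.gauss(0, 1) for _ in range(N_TEXT)] for _ in range(12)]
speech_query = [rng.gauss(0, 0.1) for _ in range(N_ACOUSTIC)]
text_query = [rng.gauss(0, 0.1) for _ in range(N_TEXT)]

speech_vec, speech_weights = attention_pool(speech_frames, speech_query)
text_vec, text_weights = attention_pool(text_tokens, text_query)

coeffs = [rng.gauss(0, 0.1) for _ in range(N_ACOUSTIC + N_TEXT)]
probability = fuse_and_classify(speech_vec, text_vec, coeffs, bias=0.0)
print(f"P(positive class) = {probability:.3f}")
```

In a trained system the query vectors and classifier coefficients would be learned rather than sampled. Concatenation is only one possible fusion strategy and was chosen here for brevity.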