
Detecting sarcasm in multi-domain datasets using convolutional neural networks and long short term memory network model



Bibliographic Details
Main authors: Jamil, Ramish; Ashraf, Imran; Rustam, Furqan; Saad, Eysha; Mehmood, Arif; Choi, Gyu Sang
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Artificial Intelligence
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8409330/
https://www.ncbi.nlm.nih.gov/pubmed/34541306
http://dx.doi.org/10.7717/peerj-cs.645
author Jamil, Ramish
Ashraf, Imran
Rustam, Furqan
Saad, Eysha
Mehmood, Arif
Choi, Gyu Sang
collection PubMed
description Sarcasm is common across social networking sites because people express negative thoughts, hatred and opinions using positive vocabulary, which makes sarcasm challenging to detect. Although various studies have investigated sarcasm detection on baseline datasets, this work is the first to detect sarcasm from a multi-domain dataset constructed by combining Twitter and News Headlines datasets. This study proposes a hybrid approach in which a convolutional neural network (CNN) is used for feature extraction while a long short-term memory (LSTM) network is trained and tested on those features. For performance analysis, several machine learning algorithms such as random forest, support vector classifier, extra tree classifier and decision tree are used. The performance of both the proposed model and the machine learning algorithms is analyzed using term frequency-inverse document frequency, the bag-of-words approach, and global vectors for word representation. Experimental results indicate that the proposed model surpasses the traditional machine learning algorithms with an accuracy of 91.60%. Several state-of-the-art approaches for sarcasm detection are compared with the proposed model, and the results suggest that the proposed model outperforms these approaches in terms of precision, recall and F1 scores. The proposed model is accurate, robust, and performs sarcasm detection on a multi-domain dataset.
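The bag-of-words and term frequency-inverse document frequency representations named in the description can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: the three toy sentences, the whitespace tokenizer, the helper names, and the plain weighting tf = count / document length, idf = ln(N / df) are all assumptions chosen for illustration. Libraries such as scikit-learn use smoothed variants that give different numbers.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real preprocessing.
    return text.lower().split()


def bag_of_words(docs):
    """Return a sorted vocabulary and one raw term-count row per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    counts = [Counter(doc) for doc in tokenized]
    matrix = [[c[tok] for tok in vocab] for c in counts]
    return vocab, matrix


def tf_idf(docs):
    """Weight each count by tf = count / doc length and idf = ln(N / df)."""
    vocab, bow = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each vocabulary term occurs.
    df = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weights = []
    for row in bow:
        total = sum(row)
        weights.append(
            [(c / total) * w if total else 0.0 for c, w in zip(row, idf)]
        )
    return vocab, weights


# Hypothetical toy corpus, not drawn from the Twitter or News Headlines data.
toy_docs = [
    "oh great another monday",
    "great news for the city",
    "oh what a great great idea",
]

vocab, bow = bag_of_words(toy_docs)
_, weights = tf_idf(toy_docs)
```

Under this weighting a term that appears in every document, such as "great" here, receives weight zero, while a term confined to one document keeps a positive weight. Either matrix could then be passed to a classifier such as those listed in the description.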
format Online
Article
Text
id pubmed-8409330
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8409330 2021-09-17 Jamil, Ramish; Ashraf, Imran; Rustam, Furqan; Saad, Eysha; Mehmood, Arif; Choi, Gyu Sang. PeerJ Comput Sci. Artificial Intelligence. PeerJ Inc. 2021-08-25 /pmc/articles/PMC8409330/ /pubmed/34541306 http://dx.doi.org/10.7717/peerj-cs.645 Text en © 2021 Jamil et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
topic Artificial Intelligence