
A Performance Comparison of Unsupervised Techniques for Event Detection from Oscar Tweets

People's lives are influenced by social media. It is an essential source for sharing news, awareness, detecting events, people's interests, etc. Social media covers a wide range of topics and events to be discussed. Extensive work has been published to capture the interesting events and in...


Bibliographic Details
Main Authors: Malik, Muzamil, Aslam, Waqar, Aslam, Zahid, Alharbi, Abdullah, Alouffi, Bader, Rauf, Hafiz Tayyab
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9155953/
https://www.ncbi.nlm.nih.gov/pubmed/35655515
http://dx.doi.org/10.1155/2022/5980043
author Malik, Muzamil
Aslam, Waqar
Aslam, Zahid
Alharbi, Abdullah
Alouffi, Bader
Rauf, Hafiz Tayyab
collection PubMed
description People's lives are influenced by social media. It is an essential source for sharing news, awareness, detecting events, people's interests, etc. Social media covers a wide range of topics and events to be discussed. Extensive work has been published to capture the interesting events and insights from datasets. Many techniques are presented to detect events from social media networks like Twitter. In text mining, most of the work is done on a specific dataset, and there is a need to present some new datasets to analyse the performance and generic nature of Topic Detection and Tracking methods. Therefore, this paper publishes a dataset of a real-life event, the Oscars 2018, gathered from Twitter and makes a comparison of soft frequent pattern mining (SFPM), singular value decomposition and k-means (K-SVD), feature-pivot (Feat-p), document-pivot (Doc-p), and latent Dirichlet allocation (LDA). The dataset contains 2,160,738 tweets collected using some seed words. Only English tweets are considered. All of the methods applied in this paper are unsupervised. This area needs to be explored on different datasets. The Oscars 2018 dataset is evaluated using keyword precision (K-Prec), keyword recall (K-Rec), and topic recall (T-Rec) for detecting events of greater interest. The highest K-Prec, K-Rec, and T-Rec were achieved by SFPM, but they started to decrease as the number of clusters increased. The lowest performance was achieved by Feat-p in terms of all three metrics. Experiments on the Oscars 2018 dataset demonstrated that all the methods are generic in nature and produce meaningful clusters.
format Online
Article
Text
id pubmed-9155953
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9155953 2022-06-01 Comput Intell Neurosci Research Article
Hindawi 2022-05-24 /pmc/articles/PMC9155953/ /pubmed/35655515 http://dx.doi.org/10.1155/2022/5980043 Text en Copyright © 2022 Muzamil Malik et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Performance Comparison of Unsupervised Techniques for Event Detection from Oscar Tweets
topic Research Article