
A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services

INTRODUCTION: Streaming services are highly popular today. Millions of people watch live streams or videos and listen to music. METHODS: One of the most popular streaming platforms is Twitch, and data from this type of service can be a good example for applying the parallel DBSCAN algorithm proposed...


Bibliographic Details
Main authors: Mochurad, Lesia, Sydor, Andrii, Ratinskiy, Oleh
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10644222/
https://www.ncbi.nlm.nih.gov/pubmed/38025944
http://dx.doi.org/10.3389/fdata.2023.1292923
_version_ 1785134507949555712
author Mochurad, Lesia
Sydor, Andrii
Ratinskiy, Oleh
author_facet Mochurad, Lesia
Sydor, Andrii
Ratinskiy, Oleh
author_sort Mochurad, Lesia
collection PubMed
description INTRODUCTION: Streaming services are highly popular today. Millions of people watch live streams or videos and listen to music. METHODS: Twitch is one of the most popular streaming platforms, and data from this type of service is a good test case for the parallel DBSCAN algorithm proposed in this paper. Unlike the classical approach to neighbor search, the proposed approach avoids redundancy, i.e., repeating the same calculations. The algorithm is nevertheless based on the classical DBSCAN method with a full search for all neighbors, combined with parallelization by subtasks using OpenMP. RESULTS: Without reducing accuracy, we sped up DBSCAN-based clustering of medium-sized data. The speed-up tends to the number of cores of the multicore computer system, and the efficiency tends to one. DISCUSSION: Theoretical estimates of speed-up and efficiency were obtained before the numerical experiments, and the experimental results agreed with them, confirming their validity. The quality of the clustering was verified using the silhouette value. All experiments were conducted on different percentages of medium-sized datasets. The proposed algorithm has potential applications in fields such as advertising, marketing, cybersecurity, and sociology. Datasets of this kind are often used for detecting fraud on the Internet, which makes an algorithm capable of considering all neighbors a useful tool for such research.
format Online
Article
Text
id pubmed-10644222
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10644222 2023-10-31 Front Big Data Frontiers Media S.A. /pmc/articles/PMC10644222/ /pubmed/38025944 http://dx.doi.org/10.3389/fdata.2023.1292923 Text en Copyright © 2023 Mochurad, Sydor and Ratinskiy.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services
title_sort fast parallelized dbscan algorithm based on openmp for detection of criminals on streaming services
topic Big Data
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10644222/
https://www.ncbi.nlm.nih.gov/pubmed/38025944
http://dx.doi.org/10.3389/fdata.2023.1292923
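
The abstract describes a DBSCAN variant that keeps the full search over all neighbors, avoids repeating the same distance calculations, and parallelizes the work with OpenMP. The record does not include the authors' code, so the following is only a minimal sketch of that idea under stated assumptions: points are two-dimensional, each unordered pair of points is evaluated once and mirrored into a symmetric neighborhood matrix, and only the neighbor search is parallelized. The names `Point`, `build_adjacency`, and `dbscan` are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D point type; the real data would have more features.
struct Point { double x, y; };

// Build a symmetric eps-neighborhood matrix (row-major, n * n entries).
// Each unordered pair (i, j) with i < j is evaluated exactly once and the
// result is stored in both (i, j) and (j, i), so no distance is recomputed.
std::vector<char> build_adjacency(const std::vector<Point>& pts, double eps) {
    const long n = static_cast<long>(pts.size());
    const double eps2 = eps * eps;
    std::vector<char> adj(pts.size() * pts.size(), 0);
    // Cell (i, j) and its mirror are written only by iteration min(i, j),
    // so iterations do not conflict. The dynamic schedule balances the
    // triangular loop. Without OpenMP enabled the pragma is ignored and
    // the loop runs sequentially with the same result.
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i) {
        adj[i * n + i] = 1;  // a point belongs to its own neighborhood
        for (long j = i + 1; j < n; ++j) {
            const double dx = pts[i].x - pts[j].x;
            const double dy = pts[i].y - pts[j].y;
            const char close = (dx * dx + dy * dy <= eps2) ? 1 : 0;
            adj[i * n + j] = close;
            adj[j * n + i] = close;
        }
    }
    return adj;
}

// Classical DBSCAN over the precomputed matrix. Returns one label per
// point: -1 for noise, otherwise a cluster id starting at 0.
std::vector<int> dbscan(const std::vector<Point>& pts, double eps, int min_pts) {
    const long n = static_cast<long>(pts.size());
    const std::vector<char> adj = build_adjacency(pts, eps);

    // Neighbor counts (self included) decide which points are core points.
    std::vector<int> degree(pts.size(), 0);
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        int count = 0;
        for (long j = 0; j < n; ++j) count += adj[i * n + j];
        degree[i] = count;
    }

    const int UNVISITED = -2;
    const int NOISE = -1;
    std::vector<int> label(pts.size(), UNVISITED);
    int cluster = 0;
    // Cluster expansion is kept sequential in this sketch.
    for (long s = 0; s < n; ++s) {
        if (label[s] != UNVISITED) continue;
        if (degree[s] < min_pts) { label[s] = NOISE; continue; }
        label[s] = cluster;
        std::vector<long> frontier{s};
        while (!frontier.empty()) {
            const long p = frontier.back();
            frontier.pop_back();
            if (degree[p] < min_pts) continue;  // border point: do not expand
            for (long q = 0; q < n; ++q) {
                if (!adj[p * n + q]) continue;
                if (label[q] == UNVISITED || label[q] == NOISE) {
                    label[q] = cluster;
                    frontier.push_back(q);
                }
            }
        }
        ++cluster;
    }
    return label;
}
```

Calling `dbscan(points, eps, min_pts)` returns a cluster id per point. Compiling with an OpenMP flag such as `-fopenmp` on GCC or Clang enables the parallel loops. The full matrix costs O(n²) memory, which is one plausible reason to restrict such an approach to medium-sized data, as the abstract does.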