A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services
INTRODUCTION: Streaming services are highly popular today. Millions of people watch live streams or videos and listen to music. METHODS: One of the most popular streaming platforms is Twitch, and data from this type of service can be a good example for applying the parallel DBSCAN algorithm proposed...
Main authors: | Mochurad, Lesia; Sydor, Andrii; Ratinskiy, Oleh |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Big Data |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10644222/ https://www.ncbi.nlm.nih.gov/pubmed/38025944 http://dx.doi.org/10.3389/fdata.2023.1292923 |
_version_ | 1785134507949555712 |
---|---|
author | Mochurad, Lesia Sydor, Andrii Ratinskiy, Oleh |
author_facet | Mochurad, Lesia Sydor, Andrii Ratinskiy, Oleh |
author_sort | Mochurad, Lesia |
collection | PubMed |
description | INTRODUCTION: Streaming services are highly popular today. Millions of people watch live streams or videos and listen to music. METHODS: One of the most popular streaming platforms is Twitch, and data from this type of service can be a good example for applying the parallel DBSCAN algorithm proposed in this paper. Unlike the classical approach to neighbor search, the proposed one avoids redundancy, i.e., the repetition of the same calculations. At the same time, this algorithm is based on the classical DBSCAN method with a full search for all neighbors, parallelization by subtasks, and OpenMP parallel computing technology. RESULTS: In this work, without reducing the accuracy, we managed to speed up the solution based on the DBSCAN algorithm when analyzing medium-sized data. As a result, the acceleration rate tends to the number of cores of a multicore computer system and the efficiency to one. DISCUSSION: Before conducting numerical experiments, theoretical estimates of speed-up and efficiency were obtained, and they aligned with the results obtained, confirming their validity. The quality of the performed clustering was verified using the silhouette value. All experiments were conducted using different percentages of medium-sized datasets. The prospects of applying the proposed algorithm can be obtained in various fields such as advertising, marketing, cybersecurity, and sociology. It is worth mentioning that datasets of this kind are often used for detecting fraud on the Internet, making an algorithm capable of considering all neighbors a useful tool for such research. |
format | Online Article Text |
id | pubmed-10644222 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10644222 2023-10-31 A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services Mochurad, Lesia Sydor, Andrii Ratinskiy, Oleh Front Big Data Big Data INTRODUCTION: Streaming services are highly popular today. Millions of people watch live streams or videos and listen to music. METHODS: One of the most popular streaming platforms is Twitch, and data from this type of service can be a good example for applying the parallel DBSCAN algorithm proposed in this paper. Unlike the classical approach to neighbor search, the proposed one avoids redundancy, i.e., the repetition of the same calculations. At the same time, this algorithm is based on the classical DBSCAN method with a full search for all neighbors, parallelization by subtasks, and OpenMP parallel computing technology. RESULTS: In this work, without reducing the accuracy, we managed to speed up the solution based on the DBSCAN algorithm when analyzing medium-sized data. As a result, the acceleration rate tends to the number of cores of a multicore computer system and the efficiency to one. DISCUSSION: Before conducting numerical experiments, theoretical estimates of speed-up and efficiency were obtained, and they aligned with the results obtained, confirming their validity. The quality of the performed clustering was verified using the silhouette value. All experiments were conducted using different percentages of medium-sized datasets. The prospects of applying the proposed algorithm can be obtained in various fields such as advertising, marketing, cybersecurity, and sociology. It is worth mentioning that datasets of this kind are often used for detecting fraud on the Internet, making an algorithm capable of considering all neighbors a useful tool for such research. Frontiers Media S.A. 2023-10-31 /pmc/articles/PMC10644222/ /pubmed/38025944 http://dx.doi.org/10.3389/fdata.2023.1292923 Text en Copyright © 2023 Mochurad, Sydor and Ratinskiy.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Big Data Mochurad, Lesia Sydor, Andrii Ratinskiy, Oleh A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services |
title | A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services |
title_full | A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services |
title_fullStr | A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services |
title_full_unstemmed | A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services |
title_short | A fast parallelized DBSCAN algorithm based on OpenMp for detection of criminals on streaming services |
title_sort | fast parallelized dbscan algorithm based on openmp for detection of criminals on streaming services |
topic | Big Data |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10644222/ https://www.ncbi.nlm.nih.gov/pubmed/38025944 http://dx.doi.org/10.3389/fdata.2023.1292923 |
work_keys_str_mv | AT mochuradlesia afastparallelizeddbscanalgorithmbasedonopenmpfordetectionofcriminalsonstreamingservices AT sydorandrii afastparallelizeddbscanalgorithmbasedonopenmpfordetectionofcriminalsonstreamingservices AT ratinskiyoleh afastparallelizeddbscanalgorithmbasedonopenmpfordetectionofcriminalsonstreamingservices AT mochuradlesia fastparallelizeddbscanalgorithmbasedonopenmpfordetectionofcriminalsonstreamingservices AT sydorandrii fastparallelizeddbscanalgorithmbasedonopenmpfordetectionofcriminalsonstreamingservices AT ratinskiyoleh fastparallelizeddbscanalgorithmbasedonopenmpfordetectionofcriminalsonstreamingservices |
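
The abstract in this record mentions a full neighbor search that avoids repeating the same calculations, and reports speed-up and efficiency figures. The following is a minimal, sequential Python sketch of those ideas, not the authors' implementation: each pair of points is compared once and the result is recorded for both points, and speed-up and efficiency are computed from their standard definitions. The function names, the sample points, and the `eps` and `min_pts` values are all assumptions made for illustration. In an OpenMP version, the outer loop of `neighbor_lists` would be a natural candidate for distribution across threads, though shared writes to the neighbor lists would need care; that too is an assumption rather than a detail taken from the paper.

```python
from math import dist

NOISE = -1  # label used for points that belong to no cluster


def neighbor_lists(points, eps):
    """Full neighbor search that evaluates each pair (i, j) only once."""
    n = len(points)
    neighbors = [[i] for i in range(n)]  # a point counts as its own neighbor
    for i in range(n):
        for j in range(i + 1, n):  # j > i, so no pair is computed twice
            if dist(points[i], points[j]) <= eps:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def dbscan(points, eps, min_pts):
    """Classical DBSCAN labelling on top of the precomputed neighbor lists."""
    neighbors = neighbor_lists(points, eps)
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        stack = list(neighbors[i])
        while stack:
            j = stack.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # core point: keep expanding
                stack.extend(neighbors[j])
        cluster += 1
    return labels


def speedup(t_serial, t_parallel):
    """Speed-up: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, cores):
    """Efficiency: speed-up divided by the number of cores."""
    return speedup(t_serial, t_parallel) / cores


if __name__ == "__main__":
    # Hypothetical sample data: two tight groups and one isolated point.
    sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(sample, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

Under these definitions, a speed-up that tends to the number of cores corresponds to an efficiency that tends to one, which is the behavior the abstract describes.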