Application of interval type-2 fuzzy logic and type-1 fuzzy logic-based approaches to social networks for spam detection with combined feature capabilities

BACKGROUND: Social networks are large platforms that allow their users to interact with each other on the Internet. Today, the widespread use of social networks has made them vulnerable to malicious use through different methods such as fake accounts and spam. As a result, many social network users...

Bibliographic Details
Main Authors: Atacak, İsmail; Çıtlak, Oğuzhan; Doğru, İbrahim Alper
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Algorithms and Analysis of Algorithms
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280623/
https://www.ncbi.nlm.nih.gov/pubmed/37346510
http://dx.doi.org/10.7717/peerj-cs.1316
author Atacak, İsmail
Çıtlak, Oğuzhan
Doğru, İbrahim Alper
collection PubMed
description BACKGROUND: Social networks are large platforms that allow their users to interact with each other on the Internet. Their widespread use has made them a target for malicious activity such as fake accounts and spam, and many users are exposed to the harmful effects of spam accounts created by malicious actors. Although Twitter, one of the most popular social networking platforms, uses spam filters to protect its users, these filters are insufficient to detect spam accounts that exhibit new methods and behaviours. Social networking platforms like Twitter therefore need more robust and dynamic methods for detecting spam accounts.
METHODS: Because fuzzy logic (FL) based approaches generate results by interpreting data heuristically in light of past experience, they can provide robust and dynamic solutions for spam detection, as they do in many other application areas. This study proposes FL-based classification approaches for spam detection and evaluates them on a data set collected from Twitter. In the first stage of the proposed method, a data set with extracted attributes was obtained by applying normalization and crowdsourcing approaches to the raw Twitter data. In the second stage, as part of data preprocessing, six binary attributes in the data set were subjected to a rating-based transformation and combined with the other real-valued attribute to create the database used for spam detection. Next, the classification inputs were obtained by applying the Fisher score method, a commonly used filter-based feature selection method, to the data set from the second stage. In the last stage, the data were classified with FL-based approaches using these inputs. As FL approaches, four different Mamdani and Sugeno fuzzy inference systems based on type-1 and interval type-2 fuzzy logic were used. Finally, four machine learning (ML) approaches, namely support vector machine (SVM), Bayesian point machine (BPM), logistic regression (LR) and averaged perceptron (Avr Prc), were used for comparison, to test the effectiveness of the FL-based approaches in detecting spam.
RESULTS: Experimental results were obtained by applying the different FL- and ML-based approaches to the data set created in the study. The interval type-2 Mamdani fuzzy inference system (IT2M-FIS) provided the highest performance, with an accuracy of 0.955, a recall of 0.967, an F-score of 0.962 and an area under the curve (AUC) of 0.971. Overall, the FL-based spam models outperformed the ML-based spam models in terms of accuracy, recall, F-score and AUC.
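The abstract names the Fisher score as the filter used to select classification inputs but does not give its formula. The sketch below uses one common formulation (class-size-weighted spread of the class means divided by class-size-weighted within-class variance); whether the authors used exactly this variant is an assumption. The function name `fisher_scores` and the toy feature values are hypothetical and are not taken from the study's data set.

```python
from collections import defaultdict


def fisher_scores(rows, labels):
    """Return one Fisher score per feature column (higher = more discriminative).

    Assumed formulation: sum_k n_k * (mean_k - mean)^2 / sum_k n_k * var_k,
    where k ranges over the classes.
    """
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        overall_mean = sum(column) / len(column)
        between = 0.0  # class-size-weighted spread of the class means
        within = 0.0   # class-size-weighted variance inside each class
        for members in by_class.values():
            values = [m[j] for m in members]
            n_k = len(values)
            mean_k = sum(values) / n_k
            var_k = sum((v - mean_k) ** 2 for v in values) / n_k
            between += n_k * (mean_k - overall_mean) ** 2
            within += n_k * var_k
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


# Hypothetical toy data: two features per account, label 1 = spam, 0 = not spam.
rows = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.3], [0.2, 0.6]]
labels = [1, 1, 0, 0]

scores = fisher_scores(rows, labels)
# Feature indices ordered from most to least discriminative.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

In this toy example the first feature separates the two classes well and the second does not, so a filter of this kind would keep the first feature as a classifier input and rank the second last.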
format Online
Article
Text
id pubmed-10280623
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-10280623 2023-06-21 Application of interval type-2 fuzzy logic and type-1 fuzzy logic-based approaches to social networks for spam detection with combined feature capabilities Atacak, İsmail Çıtlak, Oğuzhan Doğru, İbrahim Alper PeerJ Comput Sci Algorithms and Analysis of Algorithms PeerJ Inc. 2023-04-21 /pmc/articles/PMC10280623/ /pubmed/37346510 http://dx.doi.org/10.7717/peerj-cs.1316 Text en ©2023 Atacak et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Application of interval type-2 fuzzy logic and type-1 fuzzy logic-based approaches to social networks for spam detection with combined feature capabilities
topic Algorithms and Analysis of Algorithms
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280623/
https://www.ncbi.nlm.nih.gov/pubmed/37346510
http://dx.doi.org/10.7717/peerj-cs.1316