
Android Spyware Detection Using Machine Learning: A Novel Dataset

Smartphones are an essential part of all aspects of our lives. Socially, politically, and commercially, there is almost complete reliance on smartphones as a communication tool, a source of information, and for entertainment. Rapid developments in the world of information and cyber security have nec...


Bibliographic Details
Main authors: Qabalin, Majdi K., Naser, Muawya, Alkasassbeh, Mouhammd
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9371186/
https://www.ncbi.nlm.nih.gov/pubmed/35957337
http://dx.doi.org/10.3390/s22155765
author Qabalin, Majdi K.
Naser, Muawya
Alkasassbeh, Mouhammd
collection PubMed
description Smartphones are an essential part of all aspects of our lives. Socially, politically, and commercially, there is almost complete reliance on smartphones as a communication tool, a source of information, and for entertainment. Rapid developments in the world of information and cyber security have necessitated close attention to the privacy and protection of smartphone data. Spyware detection systems have recently been developed as a promising and encouraging solution for smartphone users’ privacy protection. The Android operating system is the most widely used worldwide, making it a significant target for many parties interested in targeting smartphone users’ privacy. This paper introduces a novel dataset collected in a realistic environment, obtained through a novel data collection methodology based on a unified activity list. The data are divided into three main classes: the first class represents normal smartphone traffic; the second class represents traffic data for the spyware installation process; finally, the third class represents spyware operation traffic data. The random forest classification algorithm was adopted to validate this dataset and the proposed model. Two methodologies were adopted for data classification: binary-class and multi-class classification. Good results were achieved in terms of accuracy. The overall average accuracy was 79% for the binary-class classification, and 77% for the multi-class classification. In the multi-class approach, the detection accuracy for spyware systems (UMobix, TheWiSPY, MobileSPY, FlexiSPY, and mSPY) was 90%, 83.7%, 69.3%, 69.2%, and 73.4%, respectively; in binary-class classification, the detection accuracy for spyware systems (UMobix, TheWiSPY, MobileSPY, FlexiSPY, and mSPY) was 93.9%, 85.63%, 71%, 72.3%, and 75.96%, respectively.
format Online
Article
Text
id pubmed-9371186
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9371186 2022-08-12 Android Spyware Detection Using Machine Learning: A Novel Dataset Qabalin, Majdi K. Naser, Muawya Alkasassbeh, Mouhammd Sensors (Basel) Article
MDPI 2022-08-02 /pmc/articles/PMC9371186/ /pubmed/35957337 http://dx.doi.org/10.3390/s22155765 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Android Spyware Detection Using Machine Learning: A Novel Dataset
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9371186/
https://www.ncbi.nlm.nih.gov/pubmed/35957337
http://dx.doi.org/10.3390/s22155765