
A detection method for android application security based on TF-IDF and machine learning

Android is the most widely used mobile operating system (OS). A large number of third-party Android application (app) markets have emerged. The absence of third-party market regulation has prompted research institutions to propose different malware detection techniques. However, due to improvements...


Bibliographic Details
Main Authors:	Yuan, Hongli, Tang, Yongchuan, Sun, Wenjuan, Liu, Li
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2020
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7485785/ https://www.ncbi.nlm.nih.gov/pubmed/32915836 http://dx.doi.org/10.1371/journal.pone.0238694

_version_	1783581214798512128
author	Yuan, Hongli Tang, Yongchuan Sun, Wenjuan Liu, Li
author_facet	Yuan, Hongli Tang, Yongchuan Sun, Wenjuan Liu, Li
author_sort	Yuan, Hongli
collection	PubMed
description	Android is the most widely used mobile operating system (OS). A large number of third-party Android application (app) markets have emerged. The absence of third-party market regulation has prompted research institutions to propose different malware detection techniques. However, because both malware and the Android system keep evolving, it is difficult to design a detection method that remains efficient and effective at detecting malicious apps over a long period. Meanwhile, adopting more features increases the complexity of the model and the computational cost of the system. Permissions play a vital role in the security of Android apps. Term Frequency-Inverse Document Frequency (TF-IDF) is used to assess the importance of a word to a document within a corpus. Static analysis does not need to run the app, and it can efficiently and accurately extract the permissions from an app. Building on these observations, this paper proposes a new static detection method based on TF-IDF and machine learning. System permissions are extracted from the manifest file of the Android application package (APK). The TF-IDF algorithm is used to calculate the permission value (PV) of each permission and the sensitivity value of APK (SVOA) of each app. The SVOA and the number of used permissions are then learned and tested by machine learning. The proposed approach is evaluated on 6070 benign apps and 9419 malware samples. The experimental results show that using only dangerous permissions, or only the number of used permissions, cannot accurately distinguish whether an app is malicious or benign. For malware detection, the proposed approach achieves up to 99.5% accuracy, and training takes only 0.05 s. For malware family detection, the accuracy is 99.6%. For unknown/new samples, the detection accuracy is 92.71%. Compared against other state-of-the-art approaches, the proposed approach is more effective at detecting malware and malware families.
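The description above mentions computing a TF-IDF weight per permission and combining it with the permission count as features. The record does not give the paper's exact PV and SVOA formulas, so the following is only a minimal sketch using textbook TF-IDF, treating each app as a document and each permission as a term. The function names, the summed per-app score, and the toy permission sets are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def inverse_document_frequency(apps):
    """Return {permission: log(N / df)} over a list of permission sets.

    Textbook IDF; the paper's permission value (PV) may be defined differently.
    """
    n_apps = len(apps)
    doc_freq = Counter(perm for perms in apps for perm in set(perms))
    return {perm: math.log(n_apps / df) for perm, df in doc_freq.items()}


def tfidf_scores(perms, idf):
    """TF-IDF per permission for one app.

    A permission is declared at most once per app, so TF is 1 / (number of
    distinct permissions the app requests).
    """
    unique = set(perms)
    if not unique:
        return {}
    tf = 1.0 / len(unique)
    return {perm: tf * idf.get(perm, 0.0) for perm in unique}


def app_features(perms, idf):
    """Return (summed TF-IDF score, permission count) for one app.

    The summed score is a hypothetical stand-in for the SVOA.
    """
    scores = tfidf_scores(perms, idf)
    return (sum(scores.values()), len(scores))


# Toy corpus (invented): each set is the permission list of one app.
apps = [
    {"INTERNET", "SEND_SMS", "READ_CONTACTS"},
    {"INTERNET"},
    {"INTERNET", "CAMERA"},
]
idf = inverse_document_frequency(apps)
features = [app_features(perms, idf) for perms in apps]
```

In this sketch a permission requested by every app (here `INTERNET`) gets an IDF of zero and contributes nothing to the score, while rarer permissions raise it. The resulting two-column feature vectors could then be passed to any standard classifier.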
format	Online Article Text
id	pubmed-7485785
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-74857852020-09-21 A detection method for android application security based on TF-IDF and machine learning Yuan, Hongli Tang, Yongchuan Sun, Wenjuan Liu, Li PLoS One Research Article Android is the most widely used mobile operating system (OS). A large number of third-party Android application (app) markets have emerged. The absence of third-party market regulation has prompted research institutions to propose different malware detection techniques. However, due to improvements of malware itself and Android system, it is difficult to design a detection method that can efficiently and effectively detect malicious apps for a long time. Meanwhile, adopting more features will increase the complexity of the model and the computational cost of the system. Permissions play a vital role in the security of the Android apps. Term Frequency—Inverse Document Frequency (TF-IDF) is used to assess the importance of a word for a file set in a corpus. The static analysis method does not need to run the app. It can efficiently and accurately extract the permissions from an app. Based on this cognition and perspective, in this paper, a new static detection method based on TF-IDF and Machine Learning is proposed. The system permissions are extracted in Android application package’s (Apk’s) manifest file. TF-IDF algorithm is used to calculate the permission value (PV) of each permission and the sensitivity value of apk (SVOA) of each app. The SVOA and the number of the used permissions are learned and tested by machine learning. 6070 benign apps and 9419 malware are used to evaluate the proposed approach. The experiment results show that only use dangerous permissions or the number of used permissions can’t accurately distinguish whether an app is malicious or benign. For malware detection, the proposed approach achieve up to 99.5% accuracy and the learning and training time only needs 0.05s. For malware families detection, the accuracy is 99.6%. 
The results indicate that the method for unknown/new sample’s detection accuracy is 92.71%. Compared against other state-of-the-art approaches, the proposed approach is more effective by detecting malware and malware families. Public Library of Science 2020-09-11 /pmc/articles/PMC7485785/ /pubmed/32915836 http://dx.doi.org/10.1371/journal.pone.0238694 Text en © 2020 Yuan et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article Yuan, Hongli Tang, Yongchuan Sun, Wenjuan Liu, Li A detection method for android application security based on TF-IDF and machine learning
title	A detection method for android application security based on TF-IDF and machine learning
title_full	A detection method for android application security based on TF-IDF and machine learning
title_fullStr	A detection method for android application security based on TF-IDF and machine learning
title_full_unstemmed	A detection method for android application security based on TF-IDF and machine learning
title_short	A detection method for android application security based on TF-IDF and machine learning
title_sort	detection method for android application security based on tf-idf and machine learning
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7485785/ https://www.ncbi.nlm.nih.gov/pubmed/32915836 http://dx.doi.org/10.1371/journal.pone.0238694
work_keys_str_mv	AT yuanhongli adetectionmethodforandroidapplicationsecuritybasedontfidfandmachinelearning AT tangyongchuan adetectionmethodforandroidapplicationsecuritybasedontfidfandmachinelearning AT sunwenjuan adetectionmethodforandroidapplicationsecuritybasedontfidfandmachinelearning AT liuli adetectionmethodforandroidapplicationsecuritybasedontfidfandmachinelearning AT yuanhongli detectionmethodforandroidapplicationsecuritybasedontfidfandmachinelearning AT tangyongchuan detectionmethodforandroidapplicationsecuritybasedontfidfandmachinelearning AT sunwenjuan detectionmethodforandroidapplicationsecuritybasedontfidfandmachinelearning AT liuli detectionmethodforandroidapplicationsecuritybasedontfidfandmachinelearning
