
Multi-semantic feature fusion attention network for binary code similarity detection

Binary code similarity detection (BCSD) plays an important role in binary application security testing. It can be applied in several fields, such as software plagiarism detection, malware analysis, and vulnerability detection. Most research is based on recurrent neural networks, which is difficult...


Bibliographic Details
Main Authors: Li, Bangling, Zhang, Yuting, Peng, Huaxi, Fan, Qiguang, He, Shen, Zhang, Yan, Shi, Songquan, Zhang, Yang, Ma, Ailiang
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10008825/
https://www.ncbi.nlm.nih.gov/pubmed/36907937
http://dx.doi.org/10.1038/s41598-023-31280-w
_version_ 1784905843771179008
author Li, Bangling
Zhang, Yuting
Peng, Huaxi
Fan, Qiguang
He, Shen
Zhang, Yan
Shi, Songquan
Zhang, Yang
Ma, Ailiang
author_sort Li, Bangling
collection PubMed
description Binary code similarity detection (BCSD) plays an important role in binary application security testing. It can be applied in several fields, such as software plagiarism detection, malware analysis, and vulnerability detection. Most existing research is based on recurrent neural networks, which have difficulty capturing the overall or long-distance semantic information of functions. In addition, existing works simply extract high-level semantic features and lack in-depth investigation of mechanisms for fusing low-level and high-level semantic features. In this paper we propose a multi-semantic feature fusion attention network (MFFA-Net) for BCSD. MFFA-Net contains two critical modules: semantic feature fusion (SFF) and attention feature fusion (AFF). The SFF module concatenates multiple semantic features to represent the semantics of the function, which helps to obtain the overall semantic information of the function. The AFF module is designed to find useful information among the various features; it assigns an attention matrix to model the relationships between features. To evaluate the proposed method, we conducted extensive experiments on two datasets. MFFA-Net achieves AUC values of 99.6% and 98.3%, respectively, on the two datasets. The experimental results show that MFFA-Net has better performance for BCSD.
format Online
Article
Text
id pubmed-10008825
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10008825 2023-03-14 Multi-semantic feature fusion attention network for binary code similarity detection Li, Bangling Zhang, Yuting Peng, Huaxi Fan, Qiguang He, Shen Zhang, Yan Shi, Songquan Zhang, Yang Ma, Ailiang Sci Rep Article
Nature Publishing Group UK 2023-03-12 /pmc/articles/PMC10008825/ /pubmed/36907937 http://dx.doi.org/10.1038/s41598-023-31280-w Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Multi-semantic feature fusion attention network for binary code similarity detection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10008825/
https://www.ncbi.nlm.nih.gov/pubmed/36907937
http://dx.doi.org/10.1038/s41598-023-31280-w