
Malicious source code detection using a translation model

Modern software development often relies on open-source code sharing. Open-source code reuse, however, allows hackers to access wide developer communities, thereby potentially affecting many products. An increasing number of such “supply chain attacks” have occurred in recent years, taking advantage...


Bibliographic Details
Main authors:	Tsfaty, Chen; Fire, Michael
Format:	Online Article Text
Language:	English
Published:	Elsevier 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10382987/ https://www.ncbi.nlm.nih.gov/pubmed/37521045 http://dx.doi.org/10.1016/j.patter.2023.100773

_version_	1785080795821506560
author	Tsfaty, Chen; Fire, Michael
author_facet	Tsfaty, Chen; Fire, Michael
author_sort	Tsfaty, Chen
collection	PubMed
description	Modern software development often relies on open-source code sharing. Open-source code reuse, however, allows hackers to access wide developer communities, thereby potentially affecting many products. An increasing number of such “supply chain attacks” have occurred in recent years, taking advantage of open-source software development practices. Here, we introduce the Malicious Source code Detection using a Translation model (MSDT) algorithm. MSDT is a novel deep-learning-based analysis method that detects real-world code injections into source code packages. We have tested MSDT by embedding examples from a dataset of over 600,000 different functions and then applying a clustering algorithm to the resulting embedding vectors to identify malicious functions by detecting outliers. We evaluated MSDT’s performance with extensive experiments and demonstrated that MSDT could detect malicious code injections with precision@k values of up to 0.909.
format	Online Article Text
id	pubmed-10382987
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Elsevier
record_format	MEDLINE/PubMed
spelling	pubmed-10382987 2023-07-30 Malicious source code detection using a translation model Tsfaty, Chen; Fire, Michael Patterns (N Y) Article Elsevier 2023-06-06 /pmc/articles/PMC10382987/ /pubmed/37521045 http://dx.doi.org/10.1016/j.patter.2023.100773 Text en © 2023 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title	Malicious source code detection using a translation model
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10382987/ https://www.ncbi.nlm.nih.gov/pubmed/37521045 http://dx.doi.org/10.1016/j.patter.2023.100773


Similar items