
Graph neural networks and cross-protocol analysis for detecting malicious IP addresses

An internet protocol (IP) address is the foundation of the Internet, allowing connectivity between people, servers, Internet of Things, and services across the globe. Knowing what is connecting to what and where connections are initiated is crucial to accurately assess a company’s or individual’s se...


Bibliographic Details
Main authors:	Huang, Yonghong; Negrete, Joanna; Wagener, John; Fralick, Celeste; Rodriguez, Armando; Peterson, Eric; Wosotowsky, Adam
Format:	Online Article Text
Language:	English
Published:	Springer International Publishing 2022
Subjects:	Original Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9471032/ https://www.ncbi.nlm.nih.gov/pubmed/36120119 http://dx.doi.org/10.1007/s40747-022-00838-y

author	Huang, Yonghong; Negrete, Joanna; Wagener, John; Fralick, Celeste; Rodriguez, Armando; Peterson, Eric; Wosotowsky, Adam
author_sort	Huang, Yonghong
collection	PubMed
description	An internet protocol (IP) address is the foundation of the Internet, allowing connectivity between people, servers, Internet of Things devices, and services across the globe. Knowing what is connecting to what and where connections are initiated is crucial to accurately assessing a company’s or individual’s security posture. IP reputation assessment can be quite complex because of the numerous services that may be hosted on a single IP address. For example, an IP might be serving millions of websites from millions of different companies, as web hosting companies often do, or it could be a large email system sending and receiving emails for millions of independent entities. The heterogeneous nature of an IP address typically makes it challenging to interpret the security risk. To make matters worse, adversaries understand this complexity and leverage the ambiguous nature of IP reputation to further exploit unsuspecting Internet users or devices connected to the Internet. In addition, traditional techniques like dirty-listing cannot react quickly enough to changes in the security climate, nor can they scale well enough to detect new exploits that may be created and disappear in minutes. In this paper, we introduce the use of cross-protocol analysis and graph neural networks (GNNs) in semi-supervised learning to address the speed and scalability of assessing IP reputation. In the cross-protocol supervised approach, we combine features from the web, email, and domain name system (DNS) protocols to identify those that are most useful in discriminating between suspicious and benign IPs. In our second experiment, we take the most discriminant features and incorporate them into the graph as node features. We use GNNs to pass messages from node to node, propagating the signal to the neighbors while also gaining the benefit of having the originating nodes influenced by neighboring nodes. 
Thanks to the relational graph structure, we can use only a small portion of labeled data and train the algorithm in a semi-supervised approach. Our dataset represents real-world data that is sparse: only a small percentage of the IPs have verified clean or suspicious labels, but the IPs are connected to one another. The experimental results demonstrate that the system can achieve [Formula: see text] accuracy in detecting malicious IP addresses at scale with only [Formula: see text] of labeled data.
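
The idea in the abstract of spreading a few verified labels across a connected graph can be illustrated with a very small sketch. The code below is not the authors' model and not a trained GNN: it swaps in plain label propagation by neighbor averaging, a much simpler technique, only to show how unlabeled nodes can pick up a signal from labeled neighbors. The addresses, edges, labels, and the names `build_neighbors` and `propagate` are all hypothetical.

```python
# Toy illustration only: every address, edge, and label below is made up.
# Addresses come from the 192.0.2.0/24 documentation range.
edges = [
    ("192.0.2.1", "192.0.2.2"),
    ("192.0.2.2", "192.0.2.3"),
    ("192.0.2.3", "192.0.2.4"),
]
# Only two of the four nodes are labeled: 1.0 = suspicious, 0.0 = benign.
labels = {"192.0.2.1": 1.0, "192.0.2.4": 0.0}


def build_neighbors(edge_list):
    """Build an undirected adjacency map from a list of (node, node) pairs."""
    neighbors = {}
    for a, b in edge_list:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def propagate(neighbors, known, rounds=50, prior=0.5):
    """Repeatedly replace each unlabeled node's score with the mean of its
    neighbors' scores, while keeping labeled nodes fixed at their labels."""
    scores = {node: known.get(node, prior) for node in neighbors}
    for _ in range(rounds):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in known:
                updated[node] = known[node]  # labeled nodes stay clamped
            else:
                messages = [scores[m] for m in nbrs]
                updated[node] = sum(messages) / len(messages)
        scores = updated
    return scores


scores = propagate(build_neighbors(edges), labels)
for ip in sorted(scores):
    print(ip, round(scores[ip], 3))
```

On this four-node chain, the two unlabeled addresses settle near 2/3 and 1/3, so the node closer to the suspicious address ends up with the higher score. A GNN as described in the abstract would additionally learn weights over per-node features rather than averaging raw labels.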
format	Online Article Text
id	pubmed-9471032
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Springer International Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-9471032 2022-09-14 Graph neural networks and cross-protocol analysis for detecting malicious IP addresses Huang, Yonghong; Negrete, Joanna; Wagener, John; Fralick, Celeste; Rodriguez, Armando; Peterson, Eric; Wosotowsky, Adam Complex Intell Systems Original Article Springer International Publishing 2022-09-14 /pmc/articles/PMC9471032/ /pubmed/36120119 http://dx.doi.org/10.1007/s40747-022-00838-y Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title	Graph neural networks and cross-protocol analysis for detecting malicious IP addresses
topic	Original Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9471032/ https://www.ncbi.nlm.nih.gov/pubmed/36120119 http://dx.doi.org/10.1007/s40747-022-00838-y


Similar items