
Knowledge mining of unstructured information: application to cyber domain

Information on cyber-related crimes, incidents, and conflicts is abundantly available in numerous open online sources. However, processing large volumes and streams of data is a challenging task for the analysts and experts, and entails the need for newer methods and techniques. In this article we p...


Bibliographic Details
Main Authors: Takko, Tuomas; Bhattacharya, Kunal; Lehto, Martti; Jalasvirta, Pertti; Cederberg, Aapo; Kaski, Kimmo
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9889742/
https://www.ncbi.nlm.nih.gov/pubmed/36720897
http://dx.doi.org/10.1038/s41598-023-28796-6
author Takko, Tuomas
Bhattacharya, Kunal
Lehto, Martti
Jalasvirta, Pertti
Cederberg, Aapo
Kaski, Kimmo
collection PubMed
description Information on cyber-related crimes, incidents, and conflicts is abundantly available in numerous open online sources. However, processing large volumes and streams of data is a challenging task for the analysts and experts, and entails the need for newer methods and techniques. In this article we present and implement a novel knowledge graph and knowledge mining framework for extracting the relevant information from free-form text about incidents in the cyber domain. The computational framework includes a machine learning-based pipeline for generating graphs of organizations, countries, industries, products and attackers with a non-technical cyber-ontology. The extracted knowledge graph is utilized to estimate the incidence of cyberattacks within a given graph configuration. We use publicly available collections of real cyber-incident reports to test the efficacy of our methods. The knowledge extraction is found to be sufficiently accurate, and the graph-based threat estimation demonstrates a level of correlation with the actual records of attacks. In practical use, an analyst utilizing the presented framework can infer additional information from the current cyber-landscape in terms of the risk to various entities and its propagation between industries and countries.
format Online
Article
Text
id pubmed-9889742
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9889742 2023-02-02 Knowledge mining of unstructured information: application to cyber domain Takko, Tuomas Bhattacharya, Kunal Lehto, Martti Jalasvirta, Pertti Cederberg, Aapo Kaski, Kimmo Sci Rep Article
Nature Publishing Group UK 2023-01-31 /pmc/articles/PMC9889742/ /pubmed/36720897 http://dx.doi.org/10.1038/s41598-023-28796-6 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Knowledge mining of unstructured information: application to cyber domain
topic Article