Ridge count thresholding to uncover coordinated networks during onset of the Covid-19 pandemic
In order to combat information operations (IO) and disinformation campaigns, one must look at the behaviors of the accounts pushing specific narratives and stories through social media, not at the content itself. In this work, we present a new process for extracting tweet storms and uncovering netwo...
Main Authors: | Kirn, Spencer Lee; Hinders, Mark K. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Vienna 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8956151/ https://www.ncbi.nlm.nih.gov/pubmed/35368620 http://dx.doi.org/10.1007/s13278-022-00873-0 |
_version_ | 1784676506387087360 |
---|---|
author | Kirn, Spencer Lee Hinders, Mark K. |
author_facet | Kirn, Spencer Lee Hinders, Mark K. |
author_sort | Kirn, Spencer Lee |
collection | PubMed |
description | In order to combat information operations (IO) and disinformation campaigns, one must look at the behaviors of the accounts pushing specific narratives and stories through social media, not at the content itself. In this work, we present a new process for extracting tweet storms and uncovering networks of accounts that are working in a coordinated fashion using ridge count thresholding (RCT). To do this, we started with a dataset of 60 million individual tweets from the early weeks of the Covid-19 pandemic. Coherent topics are extracted from this data by testing three different preprocessing pipelines and applying Orthogonal Nonnegative Matrix Factorization (ONMF). The most effective preprocessing pipeline used hashtag preclustering to downselect the total dataset to the 7 million tweets that included the top hashtags. Each topic identified by ONMF is described by a topic-tweet signal, crafted using the time stamp included in each tweet’s metadata. These signals were broken down into tweet storms using RCT, which is calculated from the Dynamic Wavelet Fingerprint transform of each topic-tweet signal. Each tweet storm described a time of increased activity around a topic. Tweet storms identified in this way each represent some behavior in the underlying network. In total, we identified 39,817 tweet storms that included about 2 million unique tweets. These tweet storms were used to identify networks of accounts that commonly co-occur within tweet storms to isolate those communities most responsible for driving narratives and pushing stories through social media. Through this process, we were able to identify 22 unique networks of accounts that were densely connected based on RCT tweet storm identification. Many of the identified networks exhibit obvious inauthentic behaviors that are potentially a part of an IO campaign. |
format | Online Article Text |
id | pubmed-8956151 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Vienna |
record_format | MEDLINE/PubMed |
spelling | pubmed-8956151 2022-03-28 Ridge count thresholding to uncover coordinated networks during onset of the Covid-19 pandemic Kirn, Spencer Lee Hinders, Mark K. Soc Netw Anal Min Original Article In order to combat information operations (IO) and disinformation campaigns, one must look at the behaviors of the accounts pushing specific narratives and stories through social media, not at the content itself. In this work, we present a new process for extracting tweet storms and uncovering networks of accounts that are working in a coordinated fashion using ridge count thresholding (RCT). To do this, we started with a dataset of 60 million individual tweets from the early weeks of the Covid-19 pandemic. Coherent topics are extracted from this data by testing three different preprocessing pipelines and applying Orthogonal Nonnegative Matrix Factorization (ONMF). The most effective preprocessing pipeline used hashtag preclustering to downselect the total dataset to the 7 million tweets that included the top hashtags. Each topic identified by ONMF is described by a topic-tweet signal, crafted using the time stamp included in each tweet’s metadata. These signals were broken down into tweet storms using RCT, which is calculated from the Dynamic Wavelet Fingerprint transform of each topic-tweet signal. Each tweet storm described a time of increased activity around a topic. Tweet storms identified in this way each represent some behavior in the underlying network. In total, we identified 39,817 total tweet storms that included about 2 million unique tweets. These tweet storms were used to identify networks of accounts that commonly co-occur within tweet storms to isolate those communities most responsible for driving narratives and pushing stories through social media. Through this process, we were able to identify 22 unique networks of accounts that were densely connected based on RCT tweet storm identification. Many of the identified networks exhibit obvious inauthentic behaviors that are potentially a part of an IO campaign. Springer Vienna 2022-03-25 2022 /pmc/articles/PMC8956151/ /pubmed/35368620 http://dx.doi.org/10.1007/s13278-022-00873-0 Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Original Article Kirn, Spencer Lee Hinders, Mark K. Ridge count thresholding to uncover coordinated networks during onset of the Covid-19 pandemic |
title | Ridge count thresholding to uncover coordinated networks during onset of the Covid-19 pandemic |
title_full | Ridge count thresholding to uncover coordinated networks during onset of the Covid-19 pandemic |
title_fullStr | Ridge count thresholding to uncover coordinated networks during onset of the Covid-19 pandemic |
title_full_unstemmed | Ridge count thresholding to uncover coordinated networks during onset of the Covid-19 pandemic |
title_short | Ridge count thresholding to uncover coordinated networks during onset of the Covid-19 pandemic |
title_sort | ridge count thresholding to uncover coordinated networks during onset of the covid-19 pandemic |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8956151/ https://www.ncbi.nlm.nih.gov/pubmed/35368620 http://dx.doi.org/10.1007/s13278-022-00873-0 |
work_keys_str_mv | AT kirnspencerlee ridgecountthresholdingtouncovercoordinatednetworksduringonsetofthecovid19pandemic AT hindersmarkk ridgecountthresholdingtouncovercoordinatednetworksduringonsetofthecovid19pandemic |
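
The description above says that accounts which commonly co-occur within tweet storms were grouped into networks. As a rough illustration of that last step only, the sketch below counts how often pairs of accounts appear in the same storm and groups the pairs that pass a threshold. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `min_shared` threshold, the toy account IDs, and the use of plain connected components in place of whatever community detection the article applies are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(storms, min_shared=2):
    """Count account pairs that share a storm; keep pairs seen at least min_shared times.

    `storms` is an iterable of collections of account IDs (hypothetical input format).
    """
    counts = Counter()
    for accounts in storms:
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(set(accounts)), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_shared}


def connected_groups(edges):
    """Group accounts linked by the retained edges (simple connected components)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Toy data: each set is the accounts active in one tweet storm (made-up IDs).
storms = [
    {"acct_a", "acct_b", "acct_c"},
    {"acct_a", "acct_b"},
    {"acct_c", "acct_d"},
    {"acct_d", "acct_e"},
    {"acct_d", "acct_e", "acct_a"},
]
edges = cooccurrence_edges(storms, min_shared=2)
groups = connected_groups(edges)
```

With the toy data, only the pairs that share two storms survive the threshold, so the accounts fall into two small groups. On real data the threshold and the grouping method would need to be chosen to match the article's definition of densely connected networks.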