
Towards a standard sampling methodology on online social networks: collecting global trends on Twitter

One of the most significant current challenges in large-scale online social networks is to establish a concise and coherent method to collect and summarize data. Sampling the content of an Online Social Network (OSN) plays an important role as a knowledge discovery tool. It is becoming increa...


Bibliographic Details
Main authors: Piña-García, C. A., Gershenson, Carlos, Siqueiros-García, J. Mario
Format: Online Article Text
Language: English
Published: Springer International Publishing 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245126/
https://www.ncbi.nlm.nih.gov/pubmed/30533495
http://dx.doi.org/10.1007/s41109-016-0004-1
_version_ 1783372183349755904
author Piña-García, C. A.
Gershenson, Carlos
Siqueiros-García, J. Mario
author_facet Piña-García, C. A.
Gershenson, Carlos
Siqueiros-García, J. Mario
author_sort Piña-García, C. A.
collection PubMed
description One of the most significant current challenges in large-scale online social networks is to establish a concise and coherent method to collect and summarize data. Sampling the content of an Online Social Network (OSN) plays an important role as a knowledge discovery tool. It is becoming increasingly difficult to ignore the fact that current sampling methods must cope with the lack of a full sampling frame, i.e., there is an imposed condition determined by limited data access. In addition, another key aspect to take into account is the huge amount of data generated by users of social networking services such as Twitter, which is perhaps the most influential microblogging service, producing approximately 500 million tweets per day. In this context, due to the size of Twitter, which is difficult to measure, the analysis of the entire network is infeasible and sampling is unavoidable. In addition, we strongly believe that there is a clear need to develop a new methodology to collect information on social networks (social mining). In this regard, we think that this paper introduces a set of random strategies that could be considered a reliable alternative for gathering global trends on Twitter. It is important to note that this research aims to show some initial ideas on how convenient random walks are for extracting information or global trends. The main purpose of this study is to propose a suitable methodology to carry out an efficient collection process via three random strategies: Brownian, Illusion and Reservoir. These random strategies will be applied through a Metropolis-Hastings Random Walk (MHRW). We show that interesting insights can be obtained by sampling emerging global trends on Twitter. The study also offers some important insights, providing descriptive statistics and graphical descriptions from the preliminary experiments.
format Online
Article
Text
id pubmed-6245126
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-6245126 2018-12-06 Towards a standard sampling methodology on online social networks: collecting global trends on Twitter Piña-García, C. A. Gershenson, Carlos Siqueiros-García, J. Mario Appl Netw Sci Research One of the most significant current challenges in large-scale online social networks is to establish a concise and coherent method to collect and summarize data. Sampling the content of an Online Social Network (OSN) plays an important role as a knowledge discovery tool. It is becoming increasingly difficult to ignore the fact that current sampling methods must cope with the lack of a full sampling frame, i.e., there is an imposed condition determined by limited data access. In addition, another key aspect to take into account is the huge amount of data generated by users of social networking services such as Twitter, which is perhaps the most influential microblogging service, producing approximately 500 million tweets per day. In this context, due to the size of Twitter, which is difficult to measure, the analysis of the entire network is infeasible and sampling is unavoidable. In addition, we strongly believe that there is a clear need to develop a new methodology to collect information on social networks (social mining). In this regard, we think that this paper introduces a set of random strategies that could be considered a reliable alternative for gathering global trends on Twitter. It is important to note that this research aims to show some initial ideas on how convenient random walks are for extracting information or global trends. The main purpose of this study is to propose a suitable methodology to carry out an efficient collection process via three random strategies: Brownian, Illusion and Reservoir. These random strategies will be applied through a Metropolis-Hastings Random Walk (MHRW). We show that interesting insights can be obtained by sampling emerging global trends on Twitter. The study also offers some important insights, providing descriptive statistics and graphical descriptions from the preliminary experiments. Springer International Publishing 2016-06-01 2016 /pmc/articles/PMC6245126/ /pubmed/30533495 http://dx.doi.org/10.1007/s41109-016-0004-1 Text en © Piña-Garcia et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle Research
Piña-García, C. A.
Gershenson, Carlos
Siqueiros-García, J. Mario
Towards a standard sampling methodology on online social networks: collecting global trends on Twitter
title Towards a standard sampling methodology on online social networks: collecting global trends on Twitter
title_full Towards a standard sampling methodology on online social networks: collecting global trends on Twitter
title_fullStr Towards a standard sampling methodology on online social networks: collecting global trends on Twitter
title_full_unstemmed Towards a standard sampling methodology on online social networks: collecting global trends on Twitter
title_short Towards a standard sampling methodology on online social networks: collecting global trends on Twitter
title_sort towards a standard sampling methodology on online social networks: collecting global trends on twitter
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245126/
https://www.ncbi.nlm.nih.gov/pubmed/30533495
http://dx.doi.org/10.1007/s41109-016-0004-1
work_keys_str_mv AT pinagarciaca towardsastandardsamplingmethodologyononlinesocialnetworkscollectingglobaltrendsontwitter
AT gershensoncarlos towardsastandardsamplingmethodologyononlinesocialnetworkscollectingglobaltrendsontwitter
AT siqueirosgarciajmario towardsastandardsamplingmethodologyononlinesocialnetworkscollectingglobaltrendsontwitter