Inferring the Population Mean with Second-Order Information in Online Social Networks

With the increasing use of online social networking platforms, online surveys are widely used in many fields, e.g., public health, business and sociology, to collect samples and to infer the population characteristics through self-reported data of respondents. Although the online surveys can protect...

Bibliographic Details
Main Authors: Chen, Saran; Lu, Xin; Liu, Zhong; Jia, Zhongwei
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7512998/
https://www.ncbi.nlm.nih.gov/pubmed/33265570
http://dx.doi.org/10.3390/e20060480
_version_ 1783586286759575552
author Chen, Saran
Lu, Xin
Liu, Zhong
Jia, Zhongwei
author_facet Chen, Saran
Lu, Xin
Liu, Zhong
Jia, Zhongwei
author_sort Chen, Saran
collection PubMed
description With the increasing use of online social networking platforms, online surveys are widely used in many fields, e.g., public health, business and sociology, to collect samples and to infer the population characteristics through self-reported data of respondents. Although the online surveys can protect the privacy of respondents, self-reporting is challenged by a low response rate and unreliable answers when the survey contains sensitive questions, such as drug use, sexual behaviors, abortion or criminal activity. To overcome this limitation, this paper develops an approach that collects the second-order information of the respondents, i.e., asking them about the characteristics of their friends, instead of asking the respondents’ own characteristics directly. Then, we generate the inference about the population variable with the Hansen-Hurwitz estimator for the two classic sampling strategies (simple random sampling or random walk-based sampling). The method is evaluated by simulations on both artificial and real-world networks. Results show that the method is able to generate population estimates with high accuracy without knowing the respondents’ own characteristics, and the biases of estimates under various settings are relatively small and are within acceptable limits. The new method offers an alternative way for implementing surveys online and is expected to be able to collect more reliable data with improved population inference on sensitive variables.
format Online
Article
Text
id pubmed-7512998
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7512998 2020-11-09 Inferring the Population Mean with Second-Order Information in Online Social Networks Chen, Saran Lu, Xin Liu, Zhong Jia, Zhongwei Entropy (Basel) Article MDPI 2018-06-20 /pmc/articles/PMC7512998/ /pubmed/33265570 http://dx.doi.org/10.3390/e20060480 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Inferring the Population Mean with Second-Order Information in Online Social Networks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7512998/
https://www.ncbi.nlm.nih.gov/pubmed/33265570
http://dx.doi.org/10.3390/e20060480
work_keys_str_mv AT chensaran inferringthepopulationmeanwithsecondorderinformationinonlinesocialnetworks
AT luxin inferringthepopulationmeanwithsecondorderinformationinonlinesocialnetworks
AT liuzhong inferringthepopulationmeanwithsecondorderinformationinonlinesocialnetworks
AT jiazhongwei inferringthepopulationmeanwithsecondorderinformationinonlinesocialnetworks
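
The description above outlines estimating a population mean from what respondents report about their friends rather than about themselves. The abstract does not give the estimator's formula, so the following is only a minimal sketch under stated assumptions, not the authors' method: respondents are drawn by simple random sampling, each reports how many of their friends have the trait, and the estimate is the ratio of reported friends with the trait to all reported friends. The function names, the ring-lattice test network, and the 30% prevalence are all hypothetical choices made for illustration.

```python
import random


def friend_report_estimate(adj, trait, sample):
    """Ratio estimate of trait prevalence using only reports about friends.

    adj    -- dict mapping each node to a list of its friends
    trait  -- sequence of 0/1 values, one per node (never asked of the respondent)
    sample -- iterable of sampled respondent nodes
    """
    sample = list(sample)
    # Each respondent reports the number of friends with the trait.
    reported = sum(sum(trait[j] for j in adj[i]) for i in sample)
    # Total number of friends the respondents reported on.
    friends = sum(len(adj[i]) for i in sample)
    return reported / friends


def ring_lattice(n, k):
    """Undirected ring where each node links to its k nearest neighbours per side."""
    return {i: [(i + s) % n for s in range(-k, k + 1) if s != 0] for i in range(n)}


# Hypothetical setup: 1000 nodes, every node has 6 friends, about 30% have the trait.
rng = random.Random(0)
n = 1000
adj = ring_lattice(n, 3)
trait = [1 if rng.random() < 0.3 else 0 for _ in range(n)]

# Simple random sample of 200 respondents.
sample = rng.sample(range(n), 200)
estimate = friend_report_estimate(adj, trait, sample)
true_mean = sum(trait) / n
print(f"estimate={estimate:.3f} true={true_mean:.3f}")
```

On a network where every node has the same degree, as here, each person is reported on equally often, so this ratio targets the plain population mean. When degrees vary, well-connected people are reported on more often and the ratio instead targets a degree-weighted mean; correcting for that unequal inclusion probability is the role an estimator such as Hansen-Hurwitz plays, and this sketch does not attempt it.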