Cargando…

True Randomness from Big Data

Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in t...

Descripción completa

Detalles Bibliográficos
Autores principales:	Papakonstantinou, Periklis A., Woodruff, David P., Yang, Guang
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	Nature Publishing Group 2016
Materias:	Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5036032/ https://www.ncbi.nlm.nih.gov/pubmed/27666514 http://dx.doi.org/10.1038/srep33740

_version_	1782455480118935552
author	Papakonstantinou, Periklis A. Woodruff, David P. Yang, Guang
author_facet	Papakonstantinou, Periklis A. Woodruff, David P. Yang, Guang
author_sort	Papakonstantinou, Periklis A.
collection	PubMed
description	Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests.
format	Online Article Text
id	pubmed-5036032
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Nature Publishing Group
record_format	MEDLINE/PubMed
spelling	pubmed-50360322016-09-30 True Randomness from Big Data Papakonstantinou, Periklis A. Woodruff, David P. Yang, Guang Sci Rep Article Generating random bits is a difficult task, which is important for physical systems simulation, cryptography, and many applications that rely on high-quality random bits. Our contribution is to show how to generate provably random bits from uncertain events whose outcomes are routinely recorded in the form of massive data sets. These include scientific data sets, such as in astronomics, genomics, as well as data produced by individuals, such as internet search logs, sensor networks, and social network feeds. We view the generation of such data as the sampling process from a big source, which is a random variable of size at least a few gigabytes. Our view initiates the study of big sources in the randomness extraction literature. Previous approaches for big sources rely on statistical assumptions about the samples. We introduce a general method that provably extracts almost-uniform random bits from big sources and extensively validate it empirically on real data sets. The experimental findings indicate that our method is efficient enough to handle large enough sources, while previous extractor constructions are not efficient enough to be practical. Quality-wise, our method at least matches quantum randomness expanders and classical world empirical extractors as measured by standardized tests. Nature Publishing Group 2016-09-26 /pmc/articles/PMC5036032/ /pubmed/27666514 http://dx.doi.org/10.1038/srep33740 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle	Article Papakonstantinou, Periklis A. Woodruff, David P. Yang, Guang True Randomness from Big Data
title	True Randomness from Big Data
title_full	True Randomness from Big Data
title_fullStr	True Randomness from Big Data
title_full_unstemmed	True Randomness from Big Data
title_short	True Randomness from Big Data
title_sort	true randomness from big data
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5036032/ https://www.ncbi.nlm.nih.gov/pubmed/27666514 http://dx.doi.org/10.1038/srep33740
work_keys_str_mv	AT papakonstantinouperiklisa truerandomnessfrombigdata AT woodruffdavidp truerandomnessfrombigdata AT yangguang truerandomnessfrombigdata

True Randomness from Big Data

Ejemplares similares