Deeper Spatial Statistical Insights into Small Geographic Area Data Uncertainty

Small areas refer to small geographic areas, a more literal meaning of the phrase, as well as small domains (e.g., small sub-populations), a more figurative meaning of the phrase. With post-stratification, even with big data, either case can encounter the problem of small local sample sizes, which t...

Bibliographic Details
Main authors: Griffith, Daniel A.; Chun, Yongwan; Lee, Monghyeon
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7795520/
https://www.ncbi.nlm.nih.gov/pubmed/33396823
http://dx.doi.org/10.3390/ijerph18010231
author Griffith, Daniel A.
Chun, Yongwan
Lee, Monghyeon
collection PubMed
description Small areas refer to small geographic areas, a more literal meaning of the phrase, as well as small domains (e.g., small sub-populations), a more figurative meaning of the phrase. With post-stratification, even with big data, either case can encounter the problem of small local sample sizes, which tend to inflate local uncertainty and undermine otherwise sound statistical analyses. This condition is the opposite of that afflicting statistical significance in the context of big data. These two definitions can also occur jointly, such as during the standardization of data: small geographic units may contain small populations, which in turn have small counts in various age cohorts. Accordingly, big spatial data can become not-so-big spatial data after post-stratification by geography and, for example, by age cohorts. This situation can be ameliorated to some degree by the large volume and high velocity of big spatial data. However, the variety of any big spatial data may well exacerbate this situation, compromising veracity in terms of bias, noise, and abnormalities in these data. The purpose of this paper is to establish deeper insights into big spatial data with regard to their uncertainty through one of the hallmarks of georeferenced data, namely spatial autocorrelation, coupled with small geographic areas. Impacts of interest concern the nature, degree, and mixture of spatial autocorrelation. The cancer data employed (from Florida for 2001–2010) represent a data category that is beginning to enter the realm of big spatial data; its volume, velocity, and variety are increasing through the widespread use of digital medical records.
format Online
Article
Text
id pubmed-7795520
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7795520 2021-01-10 Deeper Spatial Statistical Insights into Small Geographic Area Data Uncertainty Griffith, Daniel A. Chun, Yongwan Lee, Monghyeon Int J Environ Res Public Health Article
MDPI 2020-12-30 2021-01 /pmc/articles/PMC7795520/ /pubmed/33396823 http://dx.doi.org/10.3390/ijerph18010231 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Deeper Spatial Statistical Insights into Small Geographic Area Data Uncertainty
topic Article