Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics

BACKGROUND: Spatial and space–time scan statistics are widely used in disease surveillance to identify geographical areas of elevated disease risk and for the early detection of disease outbreaks. With a scan statistic, a scanning window of variable location and size moves across the map to evaluate...

Bibliographic Details
Main authors: Han, Junhee; Zhu, Li; Kulldorff, Martin; Hostovich, Scott; Stinchcomb, David G.; Tatalovich, Zaria; Lewis, Denise Riedel; Feuer, Eric J.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4971627/
https://www.ncbi.nlm.nih.gov/pubmed/27488416
http://dx.doi.org/10.1186/s12942-016-0056-6
author Han, Junhee
Zhu, Li
Kulldorff, Martin
Hostovich, Scott
Stinchcomb, David G.
Tatalovich, Zaria
Lewis, Denise Riedel
Feuer, Eric J.
author_sort Han, Junhee
collection PubMed
description BACKGROUND: Spatial and space–time scan statistics are widely used in disease surveillance to identify geographical areas of elevated disease risk and for the early detection of disease outbreaks. With a scan statistic, a scanning window of variable location and size moves across the map to evaluate thousands of overlapping windows as potential clusters, adjusting for multiple testing. Almost always, the method will find many very similar overlapping clusters, and it is not useful to report all of them. This paper proposes using the Gini coefficient to help select which of the many overlapping clusters to report. METHODS: The Gini coefficient provides a quick and intuitive way to evaluate the degree of heterogeneity of a collection of clusters, which helps explain how well the cluster collection reveals the underlying true cluster patterns. Using simulation studies and real cancer mortality data, the approach is compared with the traditional approach for reporting non-overlapping clusters. RESULTS: The Gini coefficient can identify a more refined collection of non-overlapping clusters to report. For example, it is able to determine when it makes more sense to report a collection of smaller non-overlapping clusters versus a single large cluster containing all of them. It also fulfils a set of desirable theoretical properties, such as being invariant under a uniform multiplication of the population numbers by the same constant. CONCLUSIONS: The Gini coefficient can be used to determine which set of non-overlapping clusters to report. It has been implemented in the free SaTScan™ software version 9.3 (www.satscan.org).
format Online
Article
Text
id pubmed-4971627
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4971627 2016-08-04 Int J Health Geogr Methodology
BioMed Central 2016-08-03 /pmc/articles/PMC4971627/ /pubmed/27488416 http://dx.doi.org/10.1186/s12942-016-0056-6 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4971627/
https://www.ncbi.nlm.nih.gov/pubmed/27488416
http://dx.doi.org/10.1186/s12942-016-0056-6