Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics
BACKGROUND: Spatial and space–time scan statistics are widely used in disease surveillance to identify geographical areas of elevated disease risk and for the early detection of disease outbreaks. With a scan statistic, a scanning window of variable location and size moves across the map to evaluate...
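Main Authors: | Han, Junhee; Zhu, Li; Kulldorff, Martin; Hostovich, Scott; Stinchcomb, David G.; Tatalovich, Zaria; Lewis, Denise Riedel; Feuer, Eric J. |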
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4971627/ https://www.ncbi.nlm.nih.gov/pubmed/27488416 http://dx.doi.org/10.1186/s12942-016-0056-6 |
_version_ | 1782446136175362048 |
---|---|
author | Han, Junhee; Zhu, Li; Kulldorff, Martin; Hostovich, Scott; Stinchcomb, David G.; Tatalovich, Zaria; Lewis, Denise Riedel; Feuer, Eric J. |
author_sort | Han, Junhee |
collection | PubMed |
description | BACKGROUND: Spatial and space–time scan statistics are widely used in disease surveillance to identify geographical areas of elevated disease risk and for the early detection of disease outbreaks. With a scan statistic, a scanning window of variable location and size moves across the map to evaluate thousands of overlapping windows as potential clusters, adjusting for multiple testing. Almost always, the method will find many very similar overlapping clusters, and it is not useful to report all of them. This paper proposes using the Gini coefficient to help select which of the many overlapping clusters to report. METHODS: The Gini coefficient provides a quick and intuitive way to evaluate the degree of heterogeneity of a collection of clusters, which is useful for explaining how well the cluster collection reveals the underlying true cluster patterns. Using simulation studies and real cancer mortality data, it is compared with the traditional approach for reporting non-overlapping clusters. RESULTS: The Gini coefficient can identify a more refined collection of non-overlapping clusters to report. For example, it is able to determine when it makes more sense to report a collection of smaller non-overlapping clusters versus a single large cluster containing all of them. It also fulfils a set of desirable theoretical properties, such as being invariant under multiplication of all population numbers by the same constant. CONCLUSIONS: The Gini coefficient can be used to determine which set of non-overlapping clusters to report. It has been implemented in the free SaTScan™ software version 9.3 (www.satscan.org). |
format | Online Article Text |
id | pubmed-4971627 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4971627 2016-08-04 Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics Han, Junhee; Zhu, Li; Kulldorff, Martin; Hostovich, Scott; Stinchcomb, David G.; Tatalovich, Zaria; Lewis, Denise Riedel; Feuer, Eric J. Int J Health Geogr Methodology 
BioMed Central 2016-08-03 /pmc/articles/PMC4971627/ /pubmed/27488416 http://dx.doi.org/10.1186/s12942-016-0056-6 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4971627/ https://www.ncbi.nlm.nih.gov/pubmed/27488416 http://dx.doi.org/10.1186/s12942-016-0056-6 |
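
The description above notes that the Gini coefficient measures the heterogeneity of a collection of clusters and is invariant when all population numbers are multiplied by the same constant. The sketch below illustrates that property with a generic Lorenz-curve Gini coefficient of cases relative to population. It is an assumption-laden illustration, not the formulation used in the paper or in SaTScan: the function name `gini_coefficient`, the ascending sort by case rate, and the sample numbers are all hypothetical.

```python
def gini_coefficient(cases, population):
    """Generic Lorenz-curve Gini coefficient of cases relative to population.

    Illustrative sketch only; not the exact definition from the paper or
    from SaTScan. Each index is one region (for example, a reported
    cluster, or the remainder of the map).
    """
    if len(cases) != len(population) or len(cases) == 0:
        raise ValueError("cases and population must be non-empty and equal in length")
    if any(p <= 0 for p in population):
        raise ValueError("population values must be positive")

    total_cases = float(sum(cases))
    total_pop = float(sum(population))
    if total_cases <= 0:
        raise ValueError("total number of cases must be positive")

    # Sort regions by case rate (ascending) so the Lorenz curve lies on or
    # below the diagonal.
    regions = sorted(zip(cases, population), key=lambda cp: cp[0] / cp[1])

    # Integrate the Lorenz curve with the trapezoid rule:
    # x = cumulative population share, y = cumulative case share.
    area = 0.0
    x_prev = 0.0
    y_prev = 0.0
    for c, p in regions:
        x = x_prev + p / total_pop
        y = y_prev + c / total_cases
        area += (x - x_prev) * (y + y_prev) / 2.0
        x_prev, y_prev = x, y

    # Gini = twice the area between the diagonal and the Lorenz curve.
    return 1.0 - 2.0 * area


# Hypothetical data: two regions with equal population but different rates.
g = gini_coefficient([30, 10], [100, 100])

# Multiplying every population by the same constant leaves the value unchanged.
g_scaled = gini_coefficient([30, 10], [1000, 1000])
```

With equal case rates in every region the function returns 0, and the value grows as cases concentrate in a smaller share of the population. Because only population shares enter the calculation, rescaling all populations by one constant does not change the result.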