Using the maximum clustering heterogeneous set-proportion to select the maximum window size for the spatial scan statistic
The spatial scan statistic has been widely used to detect spatial clusters that are of common interest in many health-related problems. However, in most situations, different scan parameters, especially the maximum window size (MWS), result in obtaining different detected clusters. Although performa...
Main authors: | Wang, Wei; Zhang, Tao; Yin, Fei; Xiao, Xiong; Chen, Shiqi; Zhang, Xingyu; Li, Xiaosong; Ma, Yue |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7078301/ https://www.ncbi.nlm.nih.gov/pubmed/32184455 http://dx.doi.org/10.1038/s41598-020-61829-y |
_version_ | 1783507591336296448 |
---|---|
author | Wang, Wei; Zhang, Tao; Yin, Fei; Xiao, Xiong; Chen, Shiqi; Zhang, Xingyu; Li, Xiaosong; Ma, Yue |
author_sort | Wang, Wei |
collection | PubMed |
description | The spatial scan statistic has been widely used to detect spatial clusters that are of common interest in many health-related problems. However, in most situations, different scan parameters, especially the maximum window size (MWS), result in obtaining different detected clusters. Although performance measures can select an optimal scan parameter, most of them depend on historical prior or true cluster information, which is usually unavailable in practical datasets. Currently, the Gini coefficient and the maximum clustering set-proportion statistic (MCS-P) are used to select appropriate parameters without any prior information. However, the Gini coefficient may be unstable and select inappropriate parameters, especially in complex practical datasets, while the MCS-P may have unsatisfactory performance in spatial datasets with heterogeneous clusters. Based on the MCS-P, we proposed a new indicator, the maximum clustering heterogeneous set-proportion (MCHS-P). A simulation study of selecting the optimal MWS confirmed that in spatial datasets with heterogeneous clusters, the MWSs selected using the MCHS-P have much better performance than those selected using the MCS-P; moreover, higher heterogeneity led to a larger advantage of the MCHS-P, with up to 538% and 69.5% improvement in the Youden's index and misclassification in specific scenarios, respectively. Meanwhile, the MCHS-P maintains similar performance to that of the MCS-P in spatial datasets with homogeneous clusters. Furthermore, the MCHS-P has significant improvements over the Gini coefficient and the default 50% MWS, especially in datasets with clusters that are not far from each other. Two practical studies showed similar results to those obtained in the simulation study. In the case where there is no prior information about the true clusters or the heterogeneity between the clusters, the MCHS-P is recommended to select the MWS in order to accurately identify spatial clusters. |
format | Online Article Text |
id | pubmed-7078301 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7078301 2020-03-23 Sci Rep Article Nature Publishing Group UK 2020-03-17 /pmc/articles/PMC7078301/ /pubmed/32184455 http://dx.doi.org/10.1038/s41598-020-61829-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Using the maximum clustering heterogeneous set-proportion to select the maximum window size for the spatial scan statistic |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7078301/ https://www.ncbi.nlm.nih.gov/pubmed/32184455 http://dx.doi.org/10.1038/s41598-020-61829-y |