
Using the maximum clustering heterogeneous set-proportion to select the maximum window size for the spatial scan statistic

The spatial scan statistic has been widely used to detect spatial clusters that are of common interest in many health-related problems. However, in most situations, different scan parameters, especially the maximum window size (MWS), result in obtaining different detected clusters. Although performa...


Bibliographic Details
Main authors: Wang, Wei; Zhang, Tao; Yin, Fei; Xiao, Xiong; Chen, Shiqi; Zhang, Xingyu; Li, Xiaosong; Ma, Yue
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7078301/
https://www.ncbi.nlm.nih.gov/pubmed/32184455
http://dx.doi.org/10.1038/s41598-020-61829-y
_version_ 1783507591336296448
author Wang, Wei
Zhang, Tao
Yin, Fei
Xiao, Xiong
Chen, Shiqi
Zhang, Xingyu
Li, Xiaosong
Ma, Yue
author_sort Wang, Wei
collection PubMed
description The spatial scan statistic has been widely used to detect spatial clusters that are of common interest in many health-related problems. However, in most situations, different scan parameters, especially the maximum window size (MWS), result in obtaining different detected clusters. Although performance measures can select an optimal scan parameter, most of them depend on historical prior or true cluster information, which is usually unavailable in practical datasets. Currently, the Gini coefficient and the maximum clustering set-proportion statistic (MCS-P) are used to select appropriate parameters without any prior information. However, the Gini coefficient may be unstable and select inappropriate parameters, especially in complex practical datasets, while the MCS-P may have unsatisfactory performance in spatial datasets with heterogeneous clusters. Based on the MCS-P, we proposed a new indicator, the maximum clustering heterogeneous set-proportion (MCHS-P). A simulation study of selecting the optimal MWS confirmed that in spatial datasets with heterogeneous clusters, the MWSs selected using the MCHS-P have much better performance than those selected using the MCS-P; moreover, higher heterogeneity led to a larger advantage of the MCHS-P, with up to 538% and 69.5% improvement in the Youden's index and misclassification in specific scenarios, respectively. Meanwhile, the MCHS-P maintains similar performance to that of the MCS-P in spatial datasets with homogeneous clusters. Furthermore, the MCHS-P has significant improvements over the Gini coefficient and the default 50% MWS, especially in datasets with clusters that are not far from each other. Two practical studies showed similar results to those obtained in the simulation study. In the case where there is no prior information about the true clusters or the heterogeneity between the clusters, the MCHS-P is recommended to select the MWS in order to accurately identify spatial clusters.
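The description above reports performance in terms of Youden's index and misclassification. As a rough illustration only, the sketch below computes both for a single detected cluster set against a known true cluster set, as one would in a simulation where the truth is available. Youden's index is taken in its standard form (sensitivity + specificity - 1); treating misclassification as the fraction of regions assigned to the wrong side is an assumption here and may differ from the article's exact definition. The function name and the toy region identifiers are hypothetical.

```python
def youden_and_misclassification(true_cluster, detected, all_regions):
    """Compare detected cluster regions with the true cluster regions.

    Returns (youden_index, misclassification_rate).
    Assumption: misclassification is (false positives + false negatives)
    divided by the total number of regions.
    """
    true_cluster = set(true_cluster)
    detected = set(detected)
    all_regions = set(all_regions)

    tp = len(true_cluster & detected)                # correctly detected regions
    fn = len(true_cluster - detected)                # true cluster regions missed
    fp = len(detected - true_cluster)                # regions wrongly flagged
    tn = len(all_regions - true_cluster - detected)  # correctly left out

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    youden = sensitivity + specificity - 1.0
    misclassification = (fp + fn) / len(all_regions)
    return youden, misclassification


# Toy example: 10 regions, true cluster {0,1,2,3}, detected cluster {2,3,4}.
youden, misclass = youden_and_misclassification(
    true_cluster={0, 1, 2, 3},
    detected={2, 3, 4},
    all_regions=range(10),
)
print(f"Youden's index: {youden:.3f}, misclassification: {misclass:.3f}")
```

In a parameter-selection study of this kind, such measures would be computed for the clusters detected under each candidate MWS; indicators such as the MCHS-P are intended for the practical case where the true clusters are unknown and these measures cannot be computed.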
format Online
Article
Text
id pubmed-7078301
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7078301 2020-03-23 Sci Rep Article Nature Publishing Group UK 2020-03-17 /pmc/articles/PMC7078301/ /pubmed/32184455 http://dx.doi.org/10.1038/s41598-020-61829-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Using the maximum clustering heterogeneous set-proportion to select the maximum window size for the spatial scan statistic
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7078301/
https://www.ncbi.nlm.nih.gov/pubmed/32184455
http://dx.doi.org/10.1038/s41598-020-61829-y