Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States
Generalized k-means can be combined with any similarity or dissimilarity measure for clustering. Using the well-known likelihood ratio or [Formula: see text]-statistic as the dissimilarity measure, a generalized k-means method is proposed to group generalized line...
Main authors: | Zhang, Tonglin; Lin, Ge |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier B.V. 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7943386/ https://www.ncbi.nlm.nih.gov/pubmed/33723467 http://dx.doi.org/10.1016/j.csda.2021.107217 |
_version_ | 1783662480424173568 |
---|---|
author | Zhang, Tonglin Lin, Ge |
author_facet | Zhang, Tonglin Lin, Ge |
author_sort | Zhang, Tonglin |
collection | PubMed |
description | Generalized k-means can be combined with any similarity or dissimilarity measure for clustering. Using the well-known likelihood ratio or [Formula: see text]-statistic as the dissimilarity measure, a generalized k-means method is proposed to group generalized linear models (GLMs) for exponential family distributions. Given the number of clusters k, the proposed method is established by the uniform most powerful unbiased (UMPU) test statistic for the comparison between GLMs. If k is unknown, then the proposed method can be combined with the generalized information criterion (GIC) to automatically select the best k for clustering. Both AIC and BIC are investigated as special cases of GIC. Theoretical and simulation results show that the number of clusters can be correctly identified by BIC but not AIC. The proposed method is applied to the state-level daily COVID-19 data in the United States, and it identifies 6 clusters. A further study shows that the models between clusters are significantly different from each other, which confirms the result with 6 clusters. |
format | Online Article Text |
id | pubmed-7943386 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier B.V. |
record_format | MEDLINE/PubMed |
spelling | pubmed-79433862021-03-11 Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States Zhang, Tonglin Lin, Ge Comput Stat Data Anal Article Generalized k-means can be combined with any similarity or dissimilarity measure for clustering. Using the well-known likelihood ratio or [Formula: see text]-statistic as the dissimilarity measure, a generalized k-means method is proposed to group generalized linear models (GLMs) for exponential family distributions. Given the number of clusters k, the proposed method is established by the uniform most powerful unbiased (UMPU) test statistic for the comparison between GLMs. If k is unknown, then the proposed method can be combined with the generalized information criterion (GIC) to automatically select the best k for clustering. Both AIC and BIC are investigated as special cases of GIC. Theoretical and simulation results show that the number of clusters can be correctly identified by BIC but not AIC. The proposed method is applied to the state-level daily COVID-19 data in the United States, and it identifies 6 clusters. A further study shows that the models between clusters are significantly different from each other, which confirms the result with 6 clusters. Elsevier B.V. 2021-07 2021-03-10 /pmc/articles/PMC7943386/ /pubmed/33723467 http://dx.doi.org/10.1016/j.csda.2021.107217 Text en © 2021 Elsevier B.V. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. 
Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Zhang, Tonglin Lin, Ge Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States |
title | Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States |
title_full | Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States |
title_fullStr | Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States |
title_full_unstemmed | Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States |
title_short | Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States |
title_sort | generalized k-means in glms with applications to the outbreak of covid-19 in the united states |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7943386/ https://www.ncbi.nlm.nih.gov/pubmed/33723467 http://dx.doi.org/10.1016/j.csda.2021.107217 |
work_keys_str_mv | AT zhangtonglin generalizedformulaseetextmeansinglmswithapplicationstotheoutbreakofcovid19intheunitedstates AT linge generalizedformulaseetextmeansinglmswithapplicationstotheoutbreakofcovid19intheunitedstates |