Trend clustering from COVID-19 tweets using graphical lasso-guided iterative principal component analysis

This article presents a method for trend clustering from tweets about coronavirus disease (COVID-19) to help us objectively review the past and make decisions about future countermeasures. We aim to avoid detecting usual trends based on seasonal events while detecting essential trends caused by the...

Bibliographic Details
Main Authors: Harakawa, Ryosuke; Ito, Tsutomu; Iwahashi, Masahiro
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8982667/
https://www.ncbi.nlm.nih.gov/pubmed/35383245
http://dx.doi.org/10.1038/s41598-022-09651-6
_version_ 1784681852528754688
author Harakawa, Ryosuke
Ito, Tsutomu
Iwahashi, Masahiro
author_facet Harakawa, Ryosuke
Ito, Tsutomu
Iwahashi, Masahiro
author_sort Harakawa, Ryosuke
collection PubMed
description This article presents a method for trend clustering from tweets about coronavirus disease (COVID-19) to help us objectively review the past and make decisions about future countermeasures. We aim to avoid detecting usual trends based on seasonal events while detecting essential trends caused by the influence of COVID-19. To this aim, we regard daily changes in the frequencies of each word in tweets as time series signals and define time series signals with single peaks as target trends. To successfully cluster the target trends, we propose graphical lasso-guided iterative principal component analysis (GLIPCA). GLIPCA enables us to remove trends with indirect correlations generated by other essential trends. Moreover, GLIPCA overcomes the difficulty in the quantitative evaluation of the accuracy of trend clustering. Thus, GLIPCA’s parameters are easier to determine than those of other clustering methods. We conducted experiments using Japanese tweets about COVID-19 from March 8, 2020, to May 7, 2020. The results show that GLIPCA successfully distinguished trends before and after the declaration of a state of emergency on April 7, 2020. In addition, the results reveal the international argument about whether the Tokyo 2020 Summer Olympics should be held. The results suggest the tremendous social impact of the words and actions of Japanese celebrities. Furthermore, the results suggest that people’s attention moved from worry and fear of an unknown novel pneumonia to the need for medical care and a new lifestyle as well as the scientific characteristics of COVID-19.
format Online
Article
Text
id pubmed-8982667
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8982667 2022-04-06 Trend clustering from COVID-19 tweets using graphical lasso-guided iterative principal component analysis Harakawa, Ryosuke Ito, Tsutomu Iwahashi, Masahiro Sci Rep Article This article presents a method for trend clustering from tweets about coronavirus disease (COVID-19) to help us objectively review the past and make decisions about future countermeasures. We aim to avoid detecting usual trends based on seasonal events while detecting essential trends caused by the influence of COVID-19. To this aim, we regard daily changes in the frequencies of each word in tweets as time series signals and define time series signals with single peaks as target trends. To successfully cluster the target trends, we propose graphical lasso-guided iterative principal component analysis (GLIPCA). GLIPCA enables us to remove trends with indirect correlations generated by other essential trends. Moreover, GLIPCA overcomes the difficulty in the quantitative evaluation of the accuracy of trend clustering. Thus, GLIPCA’s parameters are easier to determine than those of other clustering methods. We conducted experiments using Japanese tweets about COVID-19 from March 8, 2020, to May 7, 2020. The results show that GLIPCA successfully distinguished trends before and after the declaration of a state of emergency on April 7, 2020. In addition, the results reveal the international argument about whether the Tokyo 2020 Summer Olympics should be held. The results suggest the tremendous social impact of the words and actions of Japanese celebrities. Furthermore, the results suggest that people’s attention moved from worry and fear of an unknown novel pneumonia to the need for medical care and a new lifestyle as well as the scientific characteristics of COVID-19.
Nature Publishing Group UK 2022-04-05 /pmc/articles/PMC8982667/ /pubmed/35383245 http://dx.doi.org/10.1038/s41598-022-09651-6 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Harakawa, Ryosuke
Ito, Tsutomu
Iwahashi, Masahiro
Trend clustering from COVID-19 tweets using graphical lasso-guided iterative principal component analysis
title Trend clustering from COVID-19 tweets using graphical lasso-guided iterative principal component analysis
title_full Trend clustering from COVID-19 tweets using graphical lasso-guided iterative principal component analysis
title_fullStr Trend clustering from COVID-19 tweets using graphical lasso-guided iterative principal component analysis
title_full_unstemmed Trend clustering from COVID-19 tweets using graphical lasso-guided iterative principal component analysis
title_short Trend clustering from COVID-19 tweets using graphical lasso-guided iterative principal component analysis
title_sort trend clustering from covid-19 tweets using graphical lasso-guided iterative principal component analysis
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8982667/
https://www.ncbi.nlm.nih.gov/pubmed/35383245
http://dx.doi.org/10.1038/s41598-022-09651-6
work_keys_str_mv AT harakawaryosuke trendclusteringfromcovid19tweetsusinggraphicallassoguidediterativeprincipalcomponentanalysis
AT itotsutomu trendclusteringfromcovid19tweetsusinggraphicallassoguidediterativeprincipalcomponentanalysis
AT iwahashimasahiro trendclusteringfromcovid19tweetsusinggraphicallassoguidediterativeprincipalcomponentanalysis