
Topic Modeling of Social Networking Service Data on Occupational Accidents in Korea: Latent Dirichlet Allocation Analysis

BACKGROUND: In most industrialized societies, regulations, inspections, insurance, and legal options are established to support workers who suffer injury, disease, or death in relation to their work; in practice, these resources are imperfect or even unavailable due to workplace or employer obstruct...


Bibliographic Details
Main Authors: Min, Kyoung-Bok; Song, Sung-Hee; Min, Jin-Young
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects: Original Paper
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7453332/
https://www.ncbi.nlm.nih.gov/pubmed/32663156
http://dx.doi.org/10.2196/19222
_version_ 1783575339612504064
author Min, Kyoung-Bok
Song, Sung-Hee
Min, Jin-Young
author_facet Min, Kyoung-Bok
Song, Sung-Hee
Min, Jin-Young
author_sort Min, Kyoung-Bok
collection PubMed
description BACKGROUND: In most industrialized societies, regulations, inspections, insurance, and legal options are established to support workers who suffer injury, disease, or death in relation to their work; in practice, these resources are imperfect or even unavailable due to workplace or employer obstruction. Thus, limitations exist to identify unmet needs in occupational safety and health information. OBJECTIVE: The aim of this study was to explore hidden issues related to occupational accidents by examining social network services (SNS) data using topic modeling. METHODS: Based on the results of a Google search for the phrases occupational accident, industrial accident and occupational diseases, a total of 145 websites were selected. From among these websites, we collected 15,244 documents on queries related to occupational accidents between 2002 and 2018. To transform unstructured text into structure data, natural language processing of the Korean language was conducted. We performed the latent Dirichlet allocation (LDA) as a topic model using a Python library. A time-series linear regression analysis was also conducted to identify yearly trends for the given documents. RESULTS: The results of the LDA model showed 14 topics with 3 themes: workers’ compensation benefits (Theme 1), illicit agreements with the employer (Theme 2), and fatal and non-fatal injuries and vulnerable workers (Theme 3). Theme 1 represented the largest cluster (52.2%) of the collected documents and included keywords related to workers’ compensation (ie, company, occupational injury, insurance, accident, approval, and compensation) and keywords describing specific compensation benefits such as medical expense benefits, temporary incapacity benefits, and disability benefits. In the yearly trend, Theme 1 gradually decreased; however, other themes showed an overall increasing pattern. 
Certain queries (ie, musculoskeletal system, critical care, and foreign workers) showed no significant variation in the number of queries. CONCLUSIONS: We conducted LDA analysis of SNS data of occupational accident–related queries and discovered that the primary concerns of workers posting about occupational injuries and diseases were workers’ compensation benefits, fatal and non-fatal injuries, vulnerable workers, and illicit agreements with employers. While traditional systems focus mainly on quantitative monitoring of occupational accidents, qualitative aspects formulated by topic modeling from unstructured SNS queries may be valuable to address inequalities and improve occupational health and safety.
format Online
Article
Text
id pubmed-7453332
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-74533322020-08-31 Topic Modeling of Social Networking Service Data on Occupational Accidents in Korea: Latent Dirichlet Allocation Analysis Min, Kyoung-Bok Song, Sung-Hee Min, Jin-Young J Med Internet Res Original Paper BACKGROUND: In most industrialized societies, regulations, inspections, insurance, and legal options are established to support workers who suffer injury, disease, or death in relation to their work; in practice, these resources are imperfect or even unavailable due to workplace or employer obstruction. Thus, limitations exist to identify unmet needs in occupational safety and health information. OBJECTIVE: The aim of this study was to explore hidden issues related to occupational accidents by examining social network services (SNS) data using topic modeling. METHODS: Based on the results of a Google search for the phrases occupational accident, industrial accident and occupational diseases, a total of 145 websites were selected. From among these websites, we collected 15,244 documents on queries related to occupational accidents between 2002 and 2018. To transform unstructured text into structure data, natural language processing of the Korean language was conducted. We performed the latent Dirichlet allocation (LDA) as a topic model using a Python library. A time-series linear regression analysis was also conducted to identify yearly trends for the given documents. RESULTS: The results of the LDA model showed 14 topics with 3 themes: workers’ compensation benefits (Theme 1), illicit agreements with the employer (Theme 2), and fatal and non-fatal injuries and vulnerable workers (Theme 3). 
Theme 1 represented the largest cluster (52.2%) of the collected documents and included keywords related to workers’ compensation (ie, company, occupational injury, insurance, accident, approval, and compensation) and keywords describing specific compensation benefits such as medical expense benefits, temporary incapacity benefits, and disability benefits. In the yearly trend, Theme 1 gradually decreased; however, other themes showed an overall increasing pattern. Certain queries (ie, musculoskeletal system, critical care, and foreign workers) showed no significant variation in the number of queries. CONCLUSIONS: We conducted LDA analysis of SNS data of occupational accident–related queries and discovered that the primary concerns of workers posting about occupational injuries and diseases were workers’ compensation benefits, fatal and non-fatal injuries, vulnerable workers, and illicit agreements with employers. While traditional systems focus mainly on quantitative monitoring of occupational accidents, qualitative aspects formulated by topic modeling from unstructured SNS queries may be valuable to address inequalities and improve occupational health and safety. JMIR Publications 2020-08-13 /pmc/articles/PMC7453332/ /pubmed/32663156 http://dx.doi.org/10.2196/19222 Text en ©Kyoung-Bok Min, Sung-Hee Song, Jin-Young Min. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 13.08.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
spellingShingle Original Paper
Min, Kyoung-Bok
Song, Sung-Hee
Min, Jin-Young
Topic Modeling of Social Networking Service Data on Occupational Accidents in Korea: Latent Dirichlet Allocation Analysis
title Topic Modeling of Social Networking Service Data on Occupational Accidents in Korea: Latent Dirichlet Allocation Analysis
title_full Topic Modeling of Social Networking Service Data on Occupational Accidents in Korea: Latent Dirichlet Allocation Analysis
title_fullStr Topic Modeling of Social Networking Service Data on Occupational Accidents in Korea: Latent Dirichlet Allocation Analysis
title_full_unstemmed Topic Modeling of Social Networking Service Data on Occupational Accidents in Korea: Latent Dirichlet Allocation Analysis
title_short Topic Modeling of Social Networking Service Data on Occupational Accidents in Korea: Latent Dirichlet Allocation Analysis
title_sort topic modeling of social networking service data on occupational accidents in korea: latent dirichlet allocation analysis
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7453332/
https://www.ncbi.nlm.nih.gov/pubmed/32663156
http://dx.doi.org/10.2196/19222
work_keys_str_mv AT minkyoungbok topicmodelingofsocialnetworkingservicedataonoccupationalaccidentsinkorealatentdirichletallocationanalysis
AT songsunghee topicmodelingofsocialnetworkingservicedataonoccupationalaccidentsinkorealatentdirichletallocationanalysis
AT minjinyoung topicmodelingofsocialnetworkingservicedataonoccupationalaccidentsinkorealatentdirichletallocationanalysis