
Investigating Topic Modeling Techniques to Extract Meaningful Insights in Italian Long COVID Narration


Bibliographic Details
Main authors: Scarpino, Ileana, Zucco, Chiara, Vallelunga, Rosarina, Luzza, Francesco, Cannataro, Mario
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9496775/
https://www.ncbi.nlm.nih.gov/pubmed/36134915
http://dx.doi.org/10.3390/biotech11030041
author Scarpino, Ileana
Zucco, Chiara
Vallelunga, Rosarina
Luzza, Francesco
Cannataro, Mario
collection PubMed
description Through an adequate survey of the history of the disease, Narrative Medicine (NM) aims to allow the definition and implementation of an effective, appropriate, and shared treatment path. The present study compares different topic modeling techniques, namely Latent Dirichlet Allocation (LDA) and topic modeling based on the BERT transformer, to extract meaningful insights from Italian narratives of the COVID-19 pandemic. In particular, the main focus was the characterization of writings on Post-acute Sequelae of COVID-19 (PASC writings) as opposed to writings by health professionals and general reflections on COVID-19 (non-PASC writings), modeled as a semi-supervised task. The results show that the BERTopic-based approach outperforms the LDA-based approach, grouping 97.26% of the analyzed documents in the same cluster and reaching an overall accuracy of 91.97%.
format Online
Article
Text
id pubmed-9496775
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9496775 2022-09-23 BioTech (Basel) Article MDPI 2022-09-03 /pmc/articles/PMC9496775/ /pubmed/36134915 http://dx.doi.org/10.3390/biotech11030041 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Investigating Topic Modeling Techniques to Extract Meaningful Insights in Italian Long COVID Narration
topic Article