Investigating Topic Modeling Techniques to Extract Meaningful Insights in Italian Long COVID Narration
Through an adequate survey of the history of the disease, Narrative Medicine (NM) aims to allow the definition and implementation of an effective, appropriate, and shared treatment path. In the present study, different topic modeling techniques are compared, such as Latent Dirichlet Allocation (LDA) and t...
Main authors: | Scarpino, Ileana; Zucco, Chiara; Vallelunga, Rosarina; Luzza, Francesco; Cannataro, Mario |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9496775/ https://www.ncbi.nlm.nih.gov/pubmed/36134915 http://dx.doi.org/10.3390/biotech11030041 |
_version_ | 1784794352343580672 |
---|---|
author | Scarpino, Ileana Zucco, Chiara Vallelunga, Rosarina Luzza, Francesco Cannataro, Mario |
author_sort | Scarpino, Ileana |
collection | PubMed |
description | Through an adequate survey of the history of the disease, Narrative Medicine (NM) aims to allow the definition and implementation of an effective, appropriate, and shared treatment path. In the present study, different topic modeling techniques, such as Latent Dirichlet Allocation (LDA) and topic modeling based on the BERT transformer, are compared to extract meaningful insights from Italian narratives of the COVID-19 pandemic. In particular, the main focus was the characterization of writings on Post-acute Sequelae of COVID-19 (PASC) as opposed to writings by health professionals and general reflections on COVID-19 (non-PASC), modeled as a semi-supervised task. The results show that the BERTopic-based approach outperforms the LDA-based approach, grouping 97.26% of the analyzed documents in the same cluster and reaching an overall accuracy of 91.97%. |
format | Online Article Text |
id | pubmed-9496775 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9496775 2022-09-23 BioTech (Basel) Article MDPI 2022-09-03 /pmc/articles/PMC9496775/ /pubmed/36134915 http://dx.doi.org/10.3390/biotech11030041 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Investigating Topic Modeling Techniques to Extract Meaningful Insights in Italian Long COVID Narration |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9496775/ https://www.ncbi.nlm.nih.gov/pubmed/36134915 http://dx.doi.org/10.3390/biotech11030041 |