Text-based predictions of COVID-19 diagnosis from self-reported chemosensory descriptions

BACKGROUND: There is a prevailing view that humans’ capacity to use language to characterize sensations like odors or tastes is poor, providing an unreliable source of information. METHODS: Here, we developed a machine learning method based on Natural Language Processing (NLP) using Large Language M...

Bibliographic Details
Main authors: Li, Hongyang; Gerkin, Richard C.; Bakke, Alyssa; Norel, Raquel; Cecchi, Guillermo; Laudamiel, Christophe; Niv, Masha Y.; Ohla, Kathrin; Hayes, John E.; Parma, Valentina; Meyer, Pablo
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10374642/
https://www.ncbi.nlm.nih.gov/pubmed/37500763
http://dx.doi.org/10.1038/s43856-023-00334-5
author Li, Hongyang
Gerkin, Richard C.
Bakke, Alyssa
Norel, Raquel
Cecchi, Guillermo
Laudamiel, Christophe
Niv, Masha Y.
Ohla, Kathrin
Hayes, John E.
Parma, Valentina
Meyer, Pablo
collection PubMed
description BACKGROUND: There is a prevailing view that humans’ capacity to use language to characterize sensations like odors or tastes is poor, providing an unreliable source of information. METHODS: Here, we developed a machine learning method based on Natural Language Processing (NLP) using Large Language Models (LLM) to predict COVID-19 diagnosis solely based on text descriptions of acute changes in chemosensation, i.e., smell, taste and chemesthesis, caused by the disease. The dataset of more than 1500 subjects was obtained from survey responses early in the COVID-19 pandemic, in Spring 2020. RESULTS: When predicting COVID-19 diagnosis, our NLP model performs comparably (AUC ROC ~ 0.65) to models based on self-reported changes in function collected via quantitative rating scales. Further, our NLP model could attribute importance of words when performing the prediction; sentiment and descriptive words such as “smell”, “taste”, “sense”, had strong contributions to the predictions. In addition, adjectives describing specific tastes or smells such as “salty”, “sweet”, “spicy”, and “sour” also contributed considerably to predictions. CONCLUSIONS: Our results show that the description of perceptual symptoms caused by a viral infection can be used to fine-tune an LLM model to correctly predict and interpret the diagnostic status of a subject. In the future, similar models may have utility for patient verbatims from online health portals or electronic health records.
format Online
Article
Text
id pubmed-10374642
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10374642 2023-07-29 Text-based predictions of COVID-19 diagnosis from self-reported chemosensory descriptions Li, Hongyang; Gerkin, Richard C.; Bakke, Alyssa; Norel, Raquel; Cecchi, Guillermo; Laudamiel, Christophe; Niv, Masha Y.; Ohla, Kathrin; Hayes, John E.; Parma, Valentina; Meyer, Pablo Commun Med (Lond) Article
Nature Publishing Group UK 2023-07-27 /pmc/articles/PMC10374642/ /pubmed/37500763 http://dx.doi.org/10.1038/s43856-023-00334-5 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Text-based predictions of COVID-19 diagnosis from self-reported chemosensory descriptions
topic Article