What Patients Can Tell Us: Topic Analysis for Social Media on Breast Cancer
BACKGROUND: Social media dedicated to health are increasingly used by patients and health professionals. They are rich textual resources with content generated through free exchange between patients. We are proposing a method to tackle the problem of retrieving clinically relevant information from s...
Main Authors: | Tapi Nzali, Mike Donald; Bringay, Sandra; Lavergne, Christian; Mollevi, Caroline; Opitz, Thomas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5556259/ https://www.ncbi.nlm.nih.gov/pubmed/28760725 http://dx.doi.org/10.2196/medinform.7779 |
_version_ | 1783257036779159552 |
---|---|
author | Tapi Nzali, Mike Donald; Bringay, Sandra; Lavergne, Christian; Mollevi, Caroline; Opitz, Thomas |
author_facet | Tapi Nzali, Mike Donald; Bringay, Sandra; Lavergne, Christian; Mollevi, Caroline; Opitz, Thomas |
author_sort | Tapi Nzali, Mike Donald |
collection | PubMed |
description | BACKGROUND: Social media dedicated to health are increasingly used by patients and health professionals. They are rich textual resources with content generated through free exchange between patients. We are proposing a method to tackle the problem of retrieving clinically relevant information from such social media in order to analyze the quality of life of patients with breast cancer. OBJECTIVE: Our aim was to detect the different topics discussed by patients on social media and to relate them to functional and symptomatic dimensions assessed in the internationally standardized self-administered questionnaires used in cancer clinical trials (European Organization for Research and Treatment of Cancer [EORTC] Quality of Life Questionnaire Core 30 [QLQ-C30] and breast cancer module [QLQ-BR23]). METHODS: First, we applied a classic text mining technique, latent Dirichlet allocation (LDA), to detect the different topics discussed on social media dealing with breast cancer. We applied the LDA model to 2 datasets composed of messages extracted from public Facebook groups and from a public health forum (cancerdusein.org, a French breast cancer forum) with relevant preprocessing. Second, we applied a customized Jaccard coefficient to automatically compute similarity distance between the topics detected with LDA and the questions in the self-administered questionnaires used to study quality of life. RESULTS: Among the 23 topics present in the self-administered questionnaires, 22 matched with the topics discussed by patients on social media. Interestingly, these topics corresponded to 95% (22/23) of the forum and 86% (20/23) of the Facebook group topics. These figures underline that topics related to quality of life are an important concern for patients. However, 5 social media topics had no corresponding topic in the questionnaires, which do not cover all of the patients’ concerns. Of these 5 topics, 2 could potentially be used in the questionnaires, and these 2 topics corresponded to a total of 3.10% (523/16,868) of topics in the cancerdusein.org corpus and 4.30% (3014/70,092) of the Facebook corpus. CONCLUSIONS: We found a good correspondence between detected topics on social media and topics covered by the self-administered questionnaires, which substantiates the sound construction of such questionnaires. We detected new emerging topics from social media that can be used to complete current self-administered questionnaires. Moreover, we confirmed that social media mining is an important source of information for complementary analysis of quality of life. |
format | Online Article Text |
id | pubmed-5556259 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-5556259 2017-08-29 What Patients Can Tell Us: Topic Analysis for Social Media on Breast Cancer Tapi Nzali, Mike Donald Bringay, Sandra Lavergne, Christian Mollevi, Caroline Opitz, Thomas JMIR Med Inform Original Paper JMIR Publications 2017-07-31 /pmc/articles/PMC5556259/ /pubmed/28760725 http://dx.doi.org/10.2196/medinform.7779 Text en ©Mike Donald Tapi Nzali, Sandra Bringay, Christian Lavergne, Caroline Mollevi, Thomas Opitz. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 31.07.2017. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Tapi Nzali, Mike Donald Bringay, Sandra Lavergne, Christian Mollevi, Caroline Opitz, Thomas What Patients Can Tell Us: Topic Analysis for Social Media on Breast Cancer |
title | What Patients Can Tell Us: Topic Analysis for Social Media on Breast Cancer |
title_sort | what patients can tell us: topic analysis for social media on breast cancer |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5556259/ https://www.ncbi.nlm.nih.gov/pubmed/28760725 http://dx.doi.org/10.2196/medinform.7779 |
work_keys_str_mv | AT tapinzalimikedonald whatpatientscantellustopicanalysisforsocialmediaonbreastcancer AT bringaysandra whatpatientscantellustopicanalysisforsocialmediaonbreastcancer AT lavergnechristian whatpatientscantellustopicanalysisforsocialmediaonbreastcancer AT mollevicaroline whatpatientscantellustopicanalysisforsocialmediaonbreastcancer AT opitzthomas whatpatientscantellustopicanalysisforsocialmediaonbreastcancer |
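
The METHODS part of the description field mentions matching LDA topics to questionnaire items with a "customized Jaccard coefficient". The record does not say what the customization is, so the sketch below uses only the standard Jaccard coefficient, |A ∩ B| / |A ∪ B|, on word sets. The function names, the example words, the item labels, and the 0.1 threshold are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
def jaccard(a: set, b: set) -> float:
    """Standard Jaccard coefficient |A ∩ B| / |A ∪ B|; 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(topic_words: set, questionnaire_items: dict, threshold: float = 0.1):
    """Return (item_name, score) for the most similar questionnaire item.

    item_name is None when even the best score falls below the threshold,
    which models a social media topic with no counterpart in the questionnaire.
    The threshold value is an assumption made for this sketch.
    """
    scores = {name: jaccard(topic_words, words)
              for name, words in questionnaire_items.items()}
    name = max(scores, key=scores.get)
    score = scores[name]
    return (name, score) if score >= threshold else (None, score)


# Hypothetical top words of one LDA topic and of two questionnaire items.
topic_words = {"fatigue", "tired", "sleep", "rest"}
questionnaire_items = {
    "fatigue": {"tired", "rest", "weak"},
    "pain": {"pain", "ache"},
}

print(best_match(topic_words, questionnaire_items))      # ('fatigue', 0.4)
print(best_match({"hair", "wig"}, questionnaire_items))  # (None, 0.0)
```

In this toy run the first topic shares two of five distinct words with the hypothetical "fatigue" item, giving 0.4, while the second topic shares no words with any item and is reported as unmatched, analogous to the social media topics that the description says had no corresponding questionnaire topic.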