Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study
BACKGROUND: It is difficult to synthesize the vast amount of textual data available from social media websites. Capturing real-world discussions via social media could provide insights into individuals’ opinions and the decision-making process. OBJECTIVE: We conducted a sequential mixed methods stud...
Main authors: | Lyles, Courtney Rees; Godbehere, Andrew; Le, Gem; El Ghaoui, Laurent; Sarkar, Urmimala |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Gunther Eysenbach 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4920957/ https://www.ncbi.nlm.nih.gov/pubmed/27288093 http://dx.doi.org/10.2196/publichealth.5308 |
_version_ | 1782439452458614784 |
---|---|
author | Lyles, Courtney Rees Godbehere, Andrew Le, Gem El Ghaoui, Laurent Sarkar, Urmimala |
author_facet | Lyles, Courtney Rees Godbehere, Andrew Le, Gem El Ghaoui, Laurent Sarkar, Urmimala |
author_sort | Lyles, Courtney Rees |
collection | PubMed |
description | BACKGROUND: It is difficult to synthesize the vast amount of textual data available from social media websites. Capturing real-world discussions via social media could provide insights into individuals’ opinions and the decision-making process. OBJECTIVE: We conducted a sequential mixed methods study to determine the utility of sparse machine learning techniques in summarizing Twitter dialogues. We chose a narrowly defined topic for this approach: cervical cancer discussions over a 6-month time period surrounding a change in Pap smear screening guidelines. METHODS: We applied statistical methodologies, known as sparse machine learning algorithms, to summarize Twitter messages about cervical cancer before and after the 2012 change in Pap smear screening guidelines by the US Preventive Services Task Force (USPSTF). All messages containing the search terms “cervical cancer,” “Pap smear,” and “Pap test” were analyzed during: (1) January 1–March 13, 2012, and (2) March 14–June 30, 2012. Topic modeling was used to discern the most common topics from each time period, and determine the singular value criterion for each topic. The results were then qualitatively coded from top 10 relevant topics to determine the efficiency of clustering method in grouping distinct ideas, and how the discussion differed before vs. after the change in guidelines. RESULTS: This machine learning method was effective in grouping the relevant discussion topics about cervical cancer during the respective time periods (~20% overall irrelevant content in both time periods). Qualitative analysis determined that a significant portion of the top discussion topics in the second time period directly reflected the USPSTF guideline change (eg, “New Screening Guidelines for Cervical Cancer”), and many topics in both time periods were addressing basic screening promotion and education (eg, “It is Cervical Cancer Awareness Month! Click the link to see where you can receive a free or low cost Pap test.”) CONCLUSIONS: It was demonstrated that machine learning tools can be useful in cervical cancer prevention and screening discussions on Twitter. This method allowed us to prove that there is publicly available significant information about cervical cancer screening on social media sites. Moreover, we observed a direct impact of the guideline change within the Twitter messages. |
format | Online Article Text |
id | pubmed-4920957 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Gunther Eysenbach |
record_format | MEDLINE/PubMed |
spelling | pubmed-4920957 2016-07-11 Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study Lyles, Courtney Rees Godbehere, Andrew Le, Gem El Ghaoui, Laurent Sarkar, Urmimala JMIR Public Health Surveill Original Paper BACKGROUND: It is difficult to synthesize the vast amount of textual data available from social media websites. Capturing real-world discussions via social media could provide insights into individuals’ opinions and the decision-making process. OBJECTIVE: We conducted a sequential mixed methods study to determine the utility of sparse machine learning techniques in summarizing Twitter dialogues. We chose a narrowly defined topic for this approach: cervical cancer discussions over a 6-month time period surrounding a change in Pap smear screening guidelines. METHODS: We applied statistical methodologies, known as sparse machine learning algorithms, to summarize Twitter messages about cervical cancer before and after the 2012 change in Pap smear screening guidelines by the US Preventive Services Task Force (USPSTF). All messages containing the search terms “cervical cancer,” “Pap smear,” and “Pap test” were analyzed during: (1) January 1–March 13, 2012, and (2) March 14–June 30, 2012. Topic modeling was used to discern the most common topics from each time period, and determine the singular value criterion for each topic. The results were then qualitatively coded from top 10 relevant topics to determine the efficiency of clustering method in grouping distinct ideas, and how the discussion differed before vs. after the change in guidelines. RESULTS: This machine learning method was effective in grouping the relevant discussion topics about cervical cancer during the respective time periods (~20% overall irrelevant content in both time periods). Qualitative analysis determined that a significant portion of the top discussion topics in the second time period directly reflected the USPSTF guideline change (eg, “New Screening Guidelines for Cervical Cancer”), and many topics in both time periods were addressing basic screening promotion and education (eg, “It is Cervical Cancer Awareness Month! Click the link to see where you can receive a free or low cost Pap test.”) CONCLUSIONS: It was demonstrated that machine learning tools can be useful in cervical cancer prevention and screening discussions on Twitter. This method allowed us to prove that there is publicly available significant information about cervical cancer screening on social media sites. Moreover, we observed a direct impact of the guideline change within the Twitter messages. Gunther Eysenbach 2016-06-10 /pmc/articles/PMC4920957/ /pubmed/27288093 http://dx.doi.org/10.2196/publichealth.5308 Text en ©Courtney Rees Lyles, Andrew Godbehere, Gem Le, Laurent El Ghaoui, Urmimala Sarkar. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 10.06.2016. https://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Lyles, Courtney Rees Godbehere, Andrew Le, Gem El Ghaoui, Laurent Sarkar, Urmimala Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study |
title | Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study |
title_full | Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study |
title_fullStr | Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study |
title_full_unstemmed | Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study |
title_short | Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study |
title_sort | applying sparse machine learning methods to twitter: analysis of the 2012 change in pap smear guidelines. a sequential mixed-methods study |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4920957/ https://www.ncbi.nlm.nih.gov/pubmed/27288093 http://dx.doi.org/10.2196/publichealth.5308 |
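
The METHODS summary in the description field says that messages containing the search terms were analyzed in two periods: January 1 to March 13, 2012, and March 14 to June 30, 2012. The sketch below illustrates only that preprocessing step (keyword filter plus period split), not the sparse machine learning or topic modeling itself. It is not the authors' code: the function names, the `(date, text)` message format, the case-insensitive matching, and the reading of the search terms as "any of the three" are all assumptions made for illustration.

```python
from datetime import date

# Search terms and period boundaries as stated in the abstract.
SEARCH_TERMS = ("cervical cancer", "pap smear", "pap test")
PERIOD_1 = (date(2012, 1, 1), date(2012, 3, 13))   # before the guideline change
PERIOD_2 = (date(2012, 3, 14), date(2012, 6, 30))  # after the guideline change


def matches_search_terms(text):
    """Return True if the text contains any search term.

    Assumption: matching is case-insensitive and any one term suffices.
    """
    lowered = text.lower()
    return any(term in lowered for term in SEARCH_TERMS)


def split_by_period(messages):
    """Split (posted_date, text) pairs into before/after lists.

    Messages without a search term, or dated outside both periods,
    are dropped. The input format is hypothetical.
    """
    before, after = [], []
    for posted, text in messages:
        if not matches_search_terms(text):
            continue
        if PERIOD_1[0] <= posted <= PERIOD_1[1]:
            before.append(text)
        elif PERIOD_2[0] <= posted <= PERIOD_2[1]:
            after.append(text)
    return before, after
```

Each returned list would then be the input corpus for whatever topic-modeling step is applied to that period; the abstract does not specify that step in enough detail to sketch it here.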