
Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study

BACKGROUND: It is difficult to synthesize the vast amount of textual data available from social media websites. Capturing real-world discussions via social media could provide insights into individuals’ opinions and the decision-making process. OBJECTIVE: We conducted a sequential mixed methods stud...


Bibliographic Details
Main authors: Lyles, Courtney Rees; Godbehere, Andrew; Le, Gem; El Ghaoui, Laurent; Sarkar, Urmimala
Format: Online Article Text
Language: English
Published: Gunther Eysenbach 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4920957/
https://www.ncbi.nlm.nih.gov/pubmed/27288093
http://dx.doi.org/10.2196/publichealth.5308
author Lyles, Courtney Rees
Godbehere, Andrew
Le, Gem
El Ghaoui, Laurent
Sarkar, Urmimala
collection PubMed
description BACKGROUND: It is difficult to synthesize the vast amount of textual data available from social media websites. Capturing real-world discussions via social media could provide insights into individuals’ opinions and the decision-making process. OBJECTIVE: We conducted a sequential mixed-methods study to determine the utility of sparse machine learning techniques in summarizing Twitter dialogues. We chose a narrowly defined topic for this approach: cervical cancer discussions over a 6-month time period surrounding a change in Pap smear screening guidelines. METHODS: We applied statistical methodologies, known as sparse machine learning algorithms, to summarize Twitter messages about cervical cancer before and after the 2012 change in Pap smear screening guidelines by the US Preventive Services Task Force (USPSTF). All messages containing the search terms “cervical cancer,” “Pap smear,” and “Pap test” were analyzed during two periods: (1) January 1–March 13, 2012, and (2) March 14–June 30, 2012. Topic modeling was used to discern the most common topics from each time period and to determine the singular value criterion for each topic. The results were then qualitatively coded from the top 10 relevant topics to determine the efficiency of the clustering method in grouping distinct ideas, and how the discussion differed before vs. after the change in guidelines. RESULTS: This machine learning method was effective in grouping the relevant discussion topics about cervical cancer during the respective time periods (~20% overall irrelevant content in both time periods). Qualitative analysis determined that a significant portion of the top discussion topics in the second time period directly reflected the USPSTF guideline change (eg, “New Screening Guidelines for Cervical Cancer”), and many topics in both time periods were addressing basic screening promotion and education (eg, “It is Cervical Cancer Awareness Month! Click the link to see where you can receive a free or low cost Pap test.”). CONCLUSIONS: It was demonstrated that machine learning tools can be useful in cervical cancer prevention and screening discussions on Twitter. This method allowed us to prove that there is publicly available significant information about cervical cancer screening on social media sites. Moreover, we observed a direct impact of the guideline change within the Twitter messages.
format Online
Article
Text
id pubmed-4920957
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Gunther Eysenbach
record_format MEDLINE/PubMed
spelling pubmed-4920957 2016-07-11 Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study Lyles, Courtney Rees Godbehere, Andrew Le, Gem El Ghaoui, Laurent Sarkar, Urmimala JMIR Public Health Surveill Original Paper Gunther Eysenbach 2016-06-10 /pmc/articles/PMC4920957/ /pubmed/27288093 http://dx.doi.org/10.2196/publichealth.5308 Text en ©Courtney Rees Lyles, Andrew Godbehere, Gem Le, Laurent El Ghaoui, Urmimala Sarkar. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 10.06.2016. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.
title Applying Sparse Machine Learning Methods to Twitter: Analysis of the 2012 Change in Pap Smear Guidelines. A Sequential Mixed-Methods Study
topic Original Paper