Sentiment Analysis of Health Care Tweets: Review of the Methods Used

Bibliographic Details
Main authors: Gohil, Sunir; Vuik, Sabine; Darzi, Ara
Format: Online Article Text
Language: English
Published: JMIR Publications 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5938573/
https://www.ncbi.nlm.nih.gov/pubmed/29685871
http://dx.doi.org/10.2196/publichealth.5789
author Gohil, Sunir
Vuik, Sabine
Darzi, Ara
collection PubMed
description BACKGROUND: Twitter is a microblogging service where users can send and read short 140-character messages called “tweets.” There are several unstructured, free-text tweets relating to health care being shared on Twitter, which is becoming a popular area for health care research. Sentiment is a metric commonly used to investigate the positive or negative opinion within these messages. Exploring the methods used for sentiment analysis in Twitter health care research may allow us to better understand the options available for future research in this growing field.
OBJECTIVE: The first objective of this study was to understand which tools would be available for sentiment analysis of Twitter health care research, by reviewing existing studies in this area and the methods they used. The second objective was to determine which method would work best in the health care settings, by analyzing how the methods were used to answer specific health care questions, their production, and how their accuracy was analyzed.
METHODS: A review of the literature was conducted pertaining to Twitter and health care research, which used a quantitative method of sentiment analysis for the free-text messages (tweets). The study compared the types of tools used in each case and examined methods for tool production, tool training, and analysis of accuracy.
RESULTS: A total of 12 papers studying the quantitative measurement of sentiment in the health care setting were found. More than half of these studies produced tools specifically for their research, 4 used open source tools available freely, and 2 used commercially available software. Moreover, 4 out of the 12 tools were trained using a smaller sample of the study’s final data. The sentiment method was trained against, on average, 0.45% (2816/627,024) of the total sample data. One of the 12 papers commented on the analysis of accuracy of the tool used.
CONCLUSIONS: Multiple methods are used for sentiment analysis of tweets in the health care setting. These range from self-produced basic categorizations to more complex and expensive commercial software. The open source and commercial methods are developed on product reviews and generic social media messages. None of these methods have been extensively tested against a corpus of health care messages to check their accuracy. This study suggests that there is a need for an accurate and tested tool for sentiment analysis of tweets trained using a health care setting–specific corpus of manually annotated tweets first.
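Note: the training proportion quoted in the results, and the kind of accuracy check against manually annotated tweets that the conclusions call for, can be illustrated with a short calculation. This is a minimal sketch and not the authors' method: the function names `training_fraction` and `accuracy` are hypothetical, and the only figures taken from the abstract are 2816 and 627,024.

```python
# Illustrative sketch only; function names are hypothetical and not from the paper.


def training_fraction(training_size: int, total_size: int) -> float:
    """Return the share of the full sample used for training, as a percentage."""
    return 100.0 * training_size / total_size


def accuracy(predicted: list[str], annotated: list[str]) -> float:
    """Return the fraction of tool predictions that match the manual annotations."""
    if len(predicted) != len(annotated):
        raise ValueError("label lists must have the same length")
    if not annotated:
        raise ValueError("need at least one annotated message")
    matches = sum(p == a for p, a in zip(predicted, annotated))
    return matches / len(annotated)


# Figures quoted in the abstract: 2816 training messages out of 627,024 in total.
print(round(training_fraction(2816, 627_024), 2))  # 0.45
```

Calling `accuracy` with a tool's output labels and a list of human-assigned labels of the same length gives the simplest possible agreement score; a real evaluation would likely also report per-class measures.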
format Online
Article
Text
id pubmed-5938573
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-5938573 2018-05-09 Sentiment Analysis of Health Care Tweets: Review of the Methods Used Gohil, Sunir Vuik, Sabine Darzi, Ara JMIR Public Health Surveill Review JMIR Publications 2018-04-23 /pmc/articles/PMC5938573/ /pubmed/29685871 http://dx.doi.org/10.2196/publichealth.5789 Text en
©Sunir Gohil, Sabine Vuik, Ara Darzi. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 23.04.2018. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.
title Sentiment Analysis of Health Care Tweets: Review of the Methods Used
topic Review