Social media based surveillance systems for healthcare using machine learning: A systematic review
BACKGROUND: Real-time surveillance in the field of health informatics has emerged as a growing domain of interest among worldwide researchers. Evolution in this field has helped in the introduction of various initiatives related to public health informatics. Surveillance systems in the area of healt...
Main authors: | Gupta, Aakansha; Katarya, Rahul |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Inc. 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7331523/ https://www.ncbi.nlm.nih.gov/pubmed/32622833 http://dx.doi.org/10.1016/j.jbi.2020.103500 |
_version_ | 1783553345342930944 |
---|---|
author | Gupta, Aakansha Katarya, Rahul |
author_facet | Gupta, Aakansha Katarya, Rahul |
author_sort | Gupta, Aakansha |
collection | PubMed |
description | BACKGROUND: Real-time surveillance in the field of health informatics has emerged as a growing domain of interest among researchers worldwide. Evolution in this field has helped in the introduction of various initiatives related to public health informatics. Surveillance systems in the area of health informatics utilizing social media information have been developed for early prediction of disease outbreaks and to monitor diseases. In the past few years, the availability of social media data, particularly Twitter data, has enabled real-time syndromic surveillance that provides immediate analysis and instant feedback to those who are charged with follow-ups and investigation of potential outbreaks. In this paper, we review the recent work, trends, and machine learning (ML) text classification approaches used by surveillance systems seeking social media data in the healthcare domain. We also highlight the limitations and challenges, followed by possible future directions that can be taken further in this domain. METHODS: To study the landscape of research in health informatics performing surveillance of the various health-related data posted on social media or web-based platforms, we present a bibliometric analysis of the 1240 publications indexed in multiple scientific databases (IEEE, ACM Digital Library, ScienceDirect, PubMed) from 2010 to 2018. The papers were further reviewed based on the various machine learning algorithms used for analyzing health-related text posted on social media platforms. FINDINGS: Based on the corpus of 148 selected articles, the study finds the types of social media or web-based platforms used for surveillance in the healthcare domain, along with the health topic(s) studied by them. In the corpus of selected articles, we found 26 articles that used machine learning techniques. These articles were studied to find commonly used ML techniques. The largest share of studies (24%) focused on the surveillance of flu or influenza-like illness (ILI). Twitter (64%) is the most popular data source to perform surveillance research using social media text data, and Support Vector Machine (SVM) (33%) is the most used ML algorithm for text classification. CONCLUSIONS: The inclusion of online data in surveillance systems has improved the disease prediction ability over traditional syndromic surveillance systems. However, social media based surveillance systems have many limitations and challenges, including noise, demographic bias, and privacy issues. Our paper mentions future directions, which can be useful for researchers working in the area. Researchers can use this paper as a library for social media based surveillance systems in the healthcare domain and can expand such systems by incorporating the future works discussed in our paper. |
format | Online Article Text |
id | pubmed-7331523 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Elsevier Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7331523 2020-07-06 Social media based surveillance systems for healthcare using machine learning: A systematic review Gupta, Aakansha Katarya, Rahul J Biomed Inform Article BACKGROUND: Real-time surveillance in the field of health informatics has emerged as a growing domain of interest among researchers worldwide. Evolution in this field has helped in the introduction of various initiatives related to public health informatics. Surveillance systems in the area of health informatics utilizing social media information have been developed for early prediction of disease outbreaks and to monitor diseases. In the past few years, the availability of social media data, particularly Twitter data, has enabled real-time syndromic surveillance that provides immediate analysis and instant feedback to those who are charged with follow-ups and investigation of potential outbreaks. In this paper, we review the recent work, trends, and machine learning (ML) text classification approaches used by surveillance systems seeking social media data in the healthcare domain. We also highlight the limitations and challenges, followed by possible future directions that can be taken further in this domain. METHODS: To study the landscape of research in health informatics performing surveillance of the various health-related data posted on social media or web-based platforms, we present a bibliometric analysis of the 1240 publications indexed in multiple scientific databases (IEEE, ACM Digital Library, ScienceDirect, PubMed) from 2010 to 2018. The papers were further reviewed based on the various machine learning algorithms used for analyzing health-related text posted on social media platforms. FINDINGS: Based on the corpus of 148 selected articles, the study finds the types of social media or web-based platforms used for surveillance in the healthcare domain, along with the health topic(s) studied by them. In the corpus of selected articles, we found 26 articles that used machine learning techniques. These articles were studied to find commonly used ML techniques. The largest share of studies (24%) focused on the surveillance of flu or influenza-like illness (ILI). Twitter (64%) is the most popular data source to perform surveillance research using social media text data, and Support Vector Machine (SVM) (33%) is the most used ML algorithm for text classification. CONCLUSIONS: The inclusion of online data in surveillance systems has improved the disease prediction ability over traditional syndromic surveillance systems. However, social media based surveillance systems have many limitations and challenges, including noise, demographic bias, and privacy issues. Our paper mentions future directions, which can be useful for researchers working in the area. Researchers can use this paper as a library for social media based surveillance systems in the healthcare domain and can expand such systems by incorporating the future works discussed in our paper. Elsevier Inc. 2020-08 2020-07-02 /pmc/articles/PMC7331523/ /pubmed/32622833 http://dx.doi.org/10.1016/j.jbi.2020.103500 Text en © 2020 Elsevier Inc. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Gupta, Aakansha Katarya, Rahul Social media based surveillance systems for healthcare using machine learning: A systematic review |
title | Social media based surveillance systems for healthcare using machine learning: A systematic review |
title_full | Social media based surveillance systems for healthcare using machine learning: A systematic review |
title_fullStr | Social media based surveillance systems for healthcare using machine learning: A systematic review |
title_full_unstemmed | Social media based surveillance systems for healthcare using machine learning: A systematic review |
title_short | Social media based surveillance systems for healthcare using machine learning: A systematic review |
title_sort | social media based surveillance systems for healthcare using machine learning: a systematic review |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7331523/ https://www.ncbi.nlm.nih.gov/pubmed/32622833 http://dx.doi.org/10.1016/j.jbi.2020.103500 |
work_keys_str_mv | AT guptaaakansha socialmediabasedsurveillancesystemsforhealthcareusingmachinelearningasystematicreview AT kataryarahul socialmediabasedsurveillancesystemsforhealthcareusingmachinelearningasystematicreview |