Identifying health related occupations of Twitter users through word embedding and deep neural networks

BACKGROUND: Twitter is a popular social networking site where short messages or “tweets” of users have been used extensively for research purposes. However, not much research has been done in mining the medical professions, such as detecting the occupations of users from their biographical contents....

Bibliographic Details
Main Authors:	Zainab, Kazi; Srivastava, Gautam; Mago, Vijay
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9520792/ https://www.ncbi.nlm.nih.gov/pubmed/36171569 http://dx.doi.org/10.1186/s12859-022-04933-2

_version_	1784799705295749120
author	Zainab, Kazi; Srivastava, Gautam; Mago, Vijay
author_facet	Zainab, Kazi; Srivastava, Gautam; Mago, Vijay
author_sort	Zainab, Kazi
collection	PubMed
description	BACKGROUND: Twitter is a popular social networking site whose short user messages, or “tweets”, have been used extensively for research purposes. However, little research has addressed mining medical professions, such as detecting users' occupations from their biographical content. Mining such professions can support efficient recommender systems for cost-effective targeted advertising. Moreover, effective methods for identifying users' occupations are important because conventional classification methods rely on features designed by humans. Although such features may yield favorable classification results, it remains extremely challenging for traditional classifiers to predict medical occupations accurately, since the task involves predicting multiple occupations. Hence, this study focuses on predicting the medical occupational class of users from their public biographical (“Bio”) content. We conducted our analysis by annotating the bio content of Twitter users. In this paper, we propose a method that combines word embedding with state-of-the-art neural network models: Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Unit, Bidirectional Encoder Representations from Transformers (BERT), and A Lite BERT (ALBERT). We also observed that combining word embedding with the neural network models removes the need to construct any particular attribute or feature. With word embedding, the bio content is represented as dense vectors, which are fed into the neural network models as a sequence of vectors. RESULT: Performance metrics, including accuracy, precision, recall, and F1-score, show a significant difference between our method of combining word embedding with neural network models and the traditional methods. The scores show that our proposed approach outperformed traditional machine learning techniques for detecting medical occupations among users. ALBERT performed best among the deep learning networks, with an F1 score of 0.90. CONCLUSION: In this study, we presented a novel method of detecting the occupations of Twitter users engaged in the medical domain by merging word embedding with state-of-the-art neural networks. The outcomes demonstrate that our method can further advance the analysis of social media corpora without the trouble of developing computationally expensive features.
format	Online Article Text
id	pubmed-9520792
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-9520792 2022-09-30 Identifying health related occupations of Twitter users through word embedding and deep neural networks Zainab, Kazi; Srivastava, Gautam; Mago, Vijay BMC Bioinformatics Research BioMed Central 2022-09-28 /pmc/articles/PMC9520792/ /pubmed/36171569 http://dx.doi.org/10.1186/s12859-022-04933-2 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle	Research Zainab, Kazi Srivastava, Gautam Mago, Vijay Identifying health related occupations of Twitter users through word embedding and deep neural networks
title	Identifying health related occupations of Twitter users through word embedding and deep neural networks
title_sort	identifying health related occupations of twitter users through word embedding and deep neural networks
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9520792/ https://www.ncbi.nlm.nih.gov/pubmed/36171569 http://dx.doi.org/10.1186/s12859-022-04933-2
work_keys_str_mv	AT zainabkazi identifyinghealthrelatedoccupationsoftwitterusersthroughwordembeddinganddeepneuralnetworks AT srivastavagautam identifyinghealthrelatedoccupationsoftwitterusersthroughwordembeddinganddeepneuralnetworks AT magovijay identifyinghealthrelatedoccupationsoftwitterusersthroughwordembeddinganddeepneuralnetworks
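
The description field above states that bio content is represented as dense vectors and passed to the neural network models as a sequence of vectors. The following is a minimal sketch of that input step only, not the authors' implementation: the example bios, the whitespace tokenizer, the randomly initialised embedding table, and the helper names (build_vocab, make_embedding_table, embed_bio) are all hypothetical and chosen for illustration. A real pipeline would learn or load the embedding vectors and hand the resulting sequence to a model such as an LSTM.

```python
import random

PAD = "<pad>"  # hypothetical padding token
UNK = "<unk>"  # hypothetical token for words not seen when building the vocabulary


def build_vocab(bios):
    """Map each lowercase whitespace token to an integer id; ids 0 and 1 are reserved."""
    vocab = {PAD: 0, UNK: 1}
    for bio in bios:
        for token in bio.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def make_embedding_table(vocab_size, dim, seed=0):
    """Randomly initialised table (assumption); real vectors would be learned or pretrained."""
    rng = random.Random(seed)
    table = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]
    table[0] = [0.0] * dim  # the padding token maps to the zero vector
    return table


def embed_bio(bio, vocab, table, max_len):
    """Turn one bio into a fixed-length sequence of dense vectors."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in bio.lower().split()][:max_len]
    ids += [vocab[PAD]] * (max_len - len(ids))  # pad short bios to max_len
    return [table[i] for i in ids]


# Made-up bios, used only to build a toy vocabulary.
bios = ["Registered nurse and mom", "Pediatric dentist in Toronto"]
vocab = build_vocab(bios)
table = make_embedding_table(len(vocab), dim=8)

# "ottawa" is not in the vocabulary, so it falls back to the <unk> vector.
sequence = embed_bio("Nurse in Ottawa", vocab, table, max_len=6)
print(len(sequence), len(sequence[0]))
```

Each bio becomes a list of max_len vectors of equal dimension, which is the shape a recurrent model consumes one step at a time. No hand-built features are involved, which matches the point made in the description.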
