
Detecting Personal Medication Intake in Twitter via Domain Attention-Based RNN with Multi-Level Features


Bibliographic Details
Main Authors: Xiong, Shufeng; Batra, Vishwash; Liu, Liangliang; Xi, Lei; Sun, Changxia
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9381240/
https://www.ncbi.nlm.nih.gov/pubmed/35983151
http://dx.doi.org/10.1155/2022/5467262
Description
Summary: Personal medication intake detection aims to automatically detect tweets that show clear evidence of personal medication consumption. It is a research topic that has attracted considerable attention in drug safety surveillance. The task inevitably depends on medical domain information, yet current mainstream models for it do not explicitly consider such information. To tackle this problem, we propose a domain attention mechanism for recurrent neural networks (LSTMs) with a multi-level feature representation of Twitter data. Specifically, we use a character-level CNN to capture morphological features at the word level. We then feed these features, together with word embeddings, into a BiLSTM to obtain the hidden representation of a tweet. An attention mechanism is introduced over the hidden states of the BiLSTM to attend to salient medical information. Finally, classification is performed on the weighted hidden representation of the tweet. Experiments on a publicly available benchmark dataset show that our model can exploit the domain attention mechanism to incorporate medical information and improve performance. For example, our approach achieves a precision of 0.708, a recall of 0.694, and an F1 score of 0.697, significantly outperforming multiple strong and relevant baselines.
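
To make the described pipeline concrete, below is a minimal PyTorch sketch of the multi-level architecture the abstract outlines: a character-level CNN for morphological features, concatenation with word embeddings, a BiLSTM encoder, soft attention over hidden states, and a classifier on the weighted tweet representation. All class/parameter names and dimensions here are illustrative assumptions, not the paper's implementation, and the attention layer is a generic one rather than the paper's domain-conditioned variant.

```python
import torch
import torch.nn as nn

class MedicationIntakeModel(nn.Module):
    """Sketch: char-CNN + word embeddings -> BiLSTM -> attention -> classifier.
    Dimensions and the 3-way output are assumptions for illustration."""
    def __init__(self, word_vocab, char_vocab, num_classes=3,
                 word_dim=100, char_dim=30, char_filters=30, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim, padding_idx=0)
        self.char_emb = nn.Embedding(char_vocab, char_dim, padding_idx=0)
        # Character-level CNN: convolve over each word's character sequence,
        # then max-pool to a fixed-size morphological feature vector per word.
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(word_dim + char_filters, hidden,
                              batch_first=True, bidirectional=True)
        # Scalar attention score per time step; the paper's domain attention
        # would additionally condition these scores on medical-domain cues.
        self.attn = nn.Linear(2 * hidden, 1)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, word_len)
        B, T, W = chars.shape
        c = self.char_emb(chars).view(B * T, W, -1).transpose(1, 2)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values.view(B, T, -1)
        # Multi-level features: word embedding + char-CNN morphology.
        x = torch.cat([self.word_emb(words), c], dim=-1)
        h, _ = self.bilstm(x)                                   # (B, T, 2*hidden)
        alpha = torch.softmax(self.attn(h).squeeze(-1), dim=1)  # (B, T) weights
        tweet = (alpha.unsqueeze(-1) * h).sum(dim=1)            # weighted tweet vector
        return self.classifier(tweet)
```

The key design point the abstract emphasizes is the attention-weighted sum: instead of classifying from the final BiLSTM state alone, the model pools all hidden states with learned weights, letting tokens that carry medical evidence dominate the tweet representation.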