
Deep neural networks ensemble for detecting medication mentions in tweets

OBJECTIVE: Twitter posts are now recognized as an important source of patient-generated data, providing unique insights into population health. A fundamental step toward incorporating Twitter data in pharmacoepidemiologic research is to automatically recognize medication mentions in tweets. Given th...


Bibliographic Details
Main Authors: Weissenbacher, Davy; Sarker, Abeed; Klein, Ari; O’Connor, Karen; Magge, Arjun; Gonzalez-Hernandez, Graciela
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects: Research and Applications
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857507/
https://www.ncbi.nlm.nih.gov/pubmed/31562510
http://dx.doi.org/10.1093/jamia/ocz156
author Weissenbacher, Davy
Sarker, Abeed
Klein, Ari
O’Connor, Karen
Magge, Arjun
Gonzalez-Hernandez, Graciela
author_facet Weissenbacher, Davy
Sarker, Abeed
Klein, Ari
O’Connor, Karen
Magge, Arjun
Gonzalez-Hernandez, Graciela
author_sort Weissenbacher, Davy
collection PubMed
description OBJECTIVE: Twitter posts are now recognized as an important source of patient-generated data, providing unique insights into population health. A fundamental step toward incorporating Twitter data in pharmacoepidemiologic research is to automatically recognize medication mentions in tweets. Given that lexical searches for medication names suffer from low recall due to misspellings or ambiguity with common words, we propose a more advanced method to recognize them. MATERIALS AND METHODS: We present Kusuri, an Ensemble Learning classifier able to identify tweets mentioning drug products and dietary supplements. Kusuri (薬, “medication” in Japanese) is composed of 2 modules: first, 4 different classifiers (lexicon based, spelling variant based, pattern based, and a weakly trained neural network) are applied in parallel to discover tweets potentially containing medication names; second, an ensemble of deep neural networks encoding morphological, semantic, and long-range dependencies of important words in the tweets makes the final decision. RESULTS: On a class-balanced (50-50) corpus of 15 005 tweets, Kusuri demonstrated performances close to human annotators with an F1 score of 93.7%, the best score achieved thus far on this corpus. On a corpus made of all tweets posted by 112 Twitter users (98 959 tweets, with only 0.26% mentioning medications), Kusuri obtained an F1 score of 78.8%. To the best of our knowledge, Kusuri is the first system to achieve this score on such an extremely imbalanced dataset. CONCLUSIONS: The system identifies tweets mentioning drug names with performance high enough to ensure its usefulness, and is ready to be integrated in pharmacovigilance, toxicovigilance, or more generally, public health pipelines that depend on medication name mentions.
format Online
Article
Text
id pubmed-6857507
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6857507 2019-11-20 Deep neural networks ensemble for detecting medication mentions in tweets Weissenbacher, Davy Sarker, Abeed Klein, Ari O’Connor, Karen Magge, Arjun Gonzalez-Hernandez, Graciela J Am Med Inform Assoc Research and Applications OBJECTIVE: Twitter posts are now recognized as an important source of patient-generated data, providing unique insights into population health. A fundamental step toward incorporating Twitter data in pharmacoepidemiologic research is to automatically recognize medication mentions in tweets. Given that lexical searches for medication names suffer from low recall due to misspellings or ambiguity with common words, we propose a more advanced method to recognize them. MATERIALS AND METHODS: We present Kusuri, an Ensemble Learning classifier able to identify tweets mentioning drug products and dietary supplements. Kusuri (薬, “medication” in Japanese) is composed of 2 modules: first, 4 different classifiers (lexicon based, spelling variant based, pattern based, and a weakly trained neural network) are applied in parallel to discover tweets potentially containing medication names; second, an ensemble of deep neural networks encoding morphological, semantic, and long-range dependencies of important words in the tweets makes the final decision. RESULTS: On a class-balanced (50-50) corpus of 15 005 tweets, Kusuri demonstrated performances close to human annotators with an F1 score of 93.7%, the best score achieved thus far on this corpus. On a corpus made of all tweets posted by 112 Twitter users (98 959 tweets, with only 0.26% mentioning medications), Kusuri obtained an F1 score of 78.8%. To the best of our knowledge, Kusuri is the first system to achieve this score on such an extremely imbalanced dataset. CONCLUSIONS: The system identifies tweets mentioning drug names with performance high enough to ensure its usefulness, and is ready to be integrated in pharmacovigilance, toxicovigilance, or more generally, public health pipelines that depend on medication name mentions. Oxford University Press 2019-09-27 /pmc/articles/PMC6857507/ /pubmed/31562510 http://dx.doi.org/10.1093/jamia/ocz156 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Research and Applications
Weissenbacher, Davy
Sarker, Abeed
Klein, Ari
O’Connor, Karen
Magge, Arjun
Gonzalez-Hernandez, Graciela
Deep neural networks ensemble for detecting medication mentions in tweets
title Deep neural networks ensemble for detecting medication mentions in tweets
title_full Deep neural networks ensemble for detecting medication mentions in tweets
title_fullStr Deep neural networks ensemble for detecting medication mentions in tweets
title_full_unstemmed Deep neural networks ensemble for detecting medication mentions in tweets
title_short Deep neural networks ensemble for detecting medication mentions in tweets
title_sort deep neural networks ensemble for detecting medication mentions in tweets
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857507/
https://www.ncbi.nlm.nih.gov/pubmed/31562510
http://dx.doi.org/10.1093/jamia/ocz156
work_keys_str_mv AT weissenbacherdavy deepneuralnetworksensemblefordetectingmedicationmentionsintweets
AT sarkerabeed deepneuralnetworksensemblefordetectingmedicationmentionsintweets
AT kleinari deepneuralnetworksensemblefordetectingmedicationmentionsintweets
AT oconnorkaren deepneuralnetworksensemblefordetectingmedicationmentionsintweets
AT maggearjun deepneuralnetworksensemblefordetectingmedicationmentionsintweets
AT gonzalezhernandezgraciela deepneuralnetworksensemblefordetectingmedicationmentionsintweets
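
Note on the reported metric: the abstract evaluates Kusuri with the F1 score on a corpus in which only 0.26% of 98 959 tweets mention a medication. The following is a minimal sketch, not taken from the paper, of why F1 rather than accuracy is informative at that class ratio. The positive count is an approximation derived from the rounded percentage in the abstract, and the confusion counts for the "hypothetical detector" are invented purely for illustration; they are not results from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


TOTAL_TWEETS = 98_959
POSITIVE_RATE = 0.0026  # 0.26%, as rounded in the abstract

# Approximate number of medication tweets (assumption: derived from the rounded rate).
positives = round(TOTAL_TWEETS * POSITIVE_RATE)
negatives = TOTAL_TWEETS - positives

# Trivial baseline: label every tweet "no medication mention".
baseline_accuracy = negatives / TOTAL_TWEETS          # very high, yet useless
baseline_f1 = f1_score(tp=0, fp=0, fn=positives)      # exposes the failure

# Hypothetical detector with made-up counts (NOT figures from the paper).
hypothetical_tp, hypothetical_fp = 180, 60
hypothetical_f1 = f1_score(
    tp=hypothetical_tp,
    fp=hypothetical_fp,
    fn=positives - hypothetical_tp,
)

if __name__ == "__main__":
    print(f"approx. positives:  {positives}")
    print(f"baseline accuracy:  {baseline_accuracy:.4f}")
    print(f"baseline F1:        {baseline_f1:.4f}")
    print(f"hypothetical F1:    {hypothetical_f1:.4f}")
```

The all-negative baseline is correct on more than 99% of tweets while finding no medication mention at all, which is why a score built from precision and recall on the positive class is the more meaningful figure for this corpus.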