A Data-Driven Method of Discovering Misspellings of Medication Names on Twitter
Twitter, as a microblogging social media platform, has seen increasing use of its data for pharmacovigilance, which aims to monitor and promote the safe use of pharmaceutical products. Medication names are typically used as keywords to query social media data. It is known that medication names ar...
Main Authors: | Jiang, Keyuan; Chen, Tingyu; Huang, Liyuan; Calix, Ricardo A.; Bernard, Gordon R. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6009827/ https://www.ncbi.nlm.nih.gov/pubmed/29677938 |
_version_ | 1783333472945831936 |
---|---|
author | Jiang, Keyuan Chen, Tingyu Huang, Liyuan Calix, Ricardo A. Bernard, Gordon R. |
author_facet | Jiang, Keyuan Chen, Tingyu Huang, Liyuan Calix, Ricardo A. Bernard, Gordon R. |
author_sort | Jiang, Keyuan |
collection | PubMed |
description | Twitter, as a microblogging social media platform, has seen increasing use of its data for pharmacovigilance, which aims to monitor and promote the safe use of pharmaceutical products. Medication names are typically used as keywords to query social media data. It is known that medication names are misspelled on social media, and finding the misspellings is challenging because there is no a priori knowledge of how people will misspell a medication name. We developed a data-driven, relational similarity-based approach to discover misspellings of medication names. Our approach assumes that a medicine has an identical (or similar) association with its effects whether its name is spelled correctly or misspelled. Using distributed representations of the words in tweets posted over a recent 24-month period, we discovered a total of 54 misspellings of 6 medicines whose indications include headache. Our search results also show that Twitter posts containing misspellings of codeine and ibuprofen can account for more than 10% of all the tweets associated with each of these medicines. Compared with a phonetics-based approach, our method discovered more actual misspellings used on Twitter. |
format | Online Article Text |
id | pubmed-6009827 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
record_format | MEDLINE/PubMed |
spelling | pubmed-6009827 2018-06-20 A Data-Driven Method of Discovering Misspellings of Medication Names on Twitter Jiang, Keyuan Chen, Tingyu Huang, Liyuan Calix, Ricardo A. Bernard, Gordon R. Stud Health Technol Inform Article Twitter, as a microblogging social media platform, has seen increasing use of its data for pharmacovigilance, which aims to monitor and promote the safe use of pharmaceutical products. Medication names are typically used as keywords to query social media data. It is known that medication names are misspelled on social media, and finding the misspellings is challenging because there is no a priori knowledge of how people will misspell a medication name. We developed a data-driven, relational similarity-based approach to discover misspellings of medication names. Our approach assumes that a medicine has an identical (or similar) association with its effects whether its name is spelled correctly or misspelled. Using distributed representations of the words in tweets posted over a recent 24-month period, we discovered a total of 54 misspellings of 6 medicines whose indications include headache. Our search results also show that Twitter posts containing misspellings of codeine and ibuprofen can account for more than 10% of all the tweets associated with each of these medicines. Compared with a phonetics-based approach, our method discovered more actual misspellings used on Twitter. 2018 /pmc/articles/PMC6009827/ /pubmed/29677938 Text en http://creativecommons.org/licenses/by-nc/4.0/ This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0). |
spellingShingle | Article Jiang, Keyuan Chen, Tingyu Huang, Liyuan Calix, Ricardo A. Bernard, Gordon R. A Data-Driven Method of Discovering Misspellings of Medication Names on Twitter |
title | A Data-Driven Method of Discovering Misspellings of Medication Names on Twitter |
title_sort | data-driven method of discovering misspellings of medication names on twitter |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6009827/ https://www.ncbi.nlm.nih.gov/pubmed/29677938 |
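
The relational-similarity idea summarized in the description field can be illustrated with a small sketch. This is not the authors' implementation: the toy vectors, the function names (`relation`, `candidate_misspellings`), both thresholds, and the added string-similarity filter are all assumptions made for illustration. In practice the vectors would come from distributed word representations trained on tweets.

```python
import math
from difflib import SequenceMatcher

# Toy 3-dimensional "embeddings". The values are invented for illustration
# and do not come from the article or from any trained model.
TOY_VECTORS = {
    "ibuprofen": [0.90, 0.10, 0.30],
    "ibuprofin": [0.88, 0.12, 0.31],
    "ibuprophen": [0.91, 0.09, 0.28],
    "aspirin": [0.85, 0.15, 0.35],
    "headache": [0.20, 0.80, 0.10],
    "pizza": [0.10, 0.20, 0.90],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relation(vectors, word, effect):
    """Offset vector from a word to an effect term (effect minus word)."""
    return [e - w for w, e in zip(vectors[word], vectors[effect])]


def candidate_misspellings(vectors, drug, effect,
                           min_relational=0.95, min_string=0.8):
    """Return words whose relation to `effect` resembles that of `drug`.

    The string-similarity filter is an assumption added here so that other
    medicines with the same indication are not reported as misspellings.
    """
    target = relation(vectors, drug, effect)
    hits = []
    for word in vectors:
        if word in (drug, effect):
            continue
        rel_sim = cosine(target, relation(vectors, word, effect))
        str_sim = SequenceMatcher(None, drug, word).ratio()
        if rel_sim >= min_relational and str_sim >= min_string:
            hits.append((word, rel_sim))
    # Highest relational similarity first.
    return [w for w, _ in sorted(hits, key=lambda h: h[1], reverse=True)]


if __name__ == "__main__":
    print(candidate_misspellings(TOY_VECTORS, "ibuprofen", "headache"))
```

With these toy vectors, the call returns the two invented misspellings and excludes "aspirin" (similar relation to "headache" but a dissimilar string) and "pizza" (dissimilar relation).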