
A Data-Driven Method of Discovering Misspellings of Medication Names on Twitter

Twitter, a microblogging social media platform, has seen increasing use of its data for pharmacovigilance, the monitoring and promotion of the safe use of pharmaceutical products. Medication names are typically used as keywords to query social media data. It is known that medication names ar...

Bibliographic Details
Main authors: Jiang, Keyuan, Chen, Tingyu, Huang, Liyuan, Calix, Ricardo A., Bernard, Gordon R.
Format: Online Article Text
Language: English
Published: 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6009827/
https://www.ncbi.nlm.nih.gov/pubmed/29677938
author Jiang, Keyuan
Chen, Tingyu
Huang, Liyuan
Calix, Ricardo A.
Bernard, Gordon R.
collection PubMed
description Twitter, a microblogging social media platform, has seen increasing use of its data for pharmacovigilance, the monitoring and promotion of the safe use of pharmaceutical products. Medication names are typically used as keywords to query social media data. Medication names are known to be misspelled on social media, and finding the misspellings is challenging because there is no a priori knowledge of how people will misspell a medication name. We developed a data-driven, relational similarity-based approach to discover misspellings of medication names. Our approach assumes that a medicine has an identical (or similar) association with its effects whether its name is spelled correctly or misspelled. Using distributed representations of the words in tweets posted over a recent 24-month period, we discovered a total of 54 misspellings of 6 medicines whose indications include headache. Our search results also show that Twitter posts containing misspellings of codeine and ibuprofen can account for more than 10% of all tweets associated with each of these medicines. Compared with the phonetics-based approach, our method discovered more actual misspellings used on Twitter.
format Online
Article
Text
id pubmed-6009827
institution National Center for Biotechnology Information
language English
publishDate 2018
record_format MEDLINE/PubMed
spelling pubmed-6009827 2018-06-20 Stud Health Technol Inform Article 2018 /pmc/articles/PMC6009827/ /pubmed/29677938 Text en http://creativecommons.org/licenses/by-nc/4.0/ This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
title A Data-Driven Method of Discovering Misspellings of Medication Names on Twitter
topic Article