Phonetic Spelling Filter for Keyword Selection in Drug Mention Mining from Social Media
Social media postings are rich in information that often remains hidden and inaccessible for automatic extraction due to inherent limitations of the site’s APIs, which mostly limit access via specific keyword-based searches (and limit both the number of keywords and the number of postings that are re...
Main authors: | Pimpalkhute, Pranoti; Patki, Apurv; Nikfarjam, Azadeh; Gonzalez, Graciela |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Medical Informatics Association, 2014 |
Subjects: | Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4333687/ https://www.ncbi.nlm.nih.gov/pubmed/25717407 |
_version_ | 1782358082907537408 |
---|---|
author | Pimpalkhute, Pranoti; Patki, Apurv; Nikfarjam, Azadeh; Gonzalez, Graciela |
author_facet | Pimpalkhute, Pranoti; Patki, Apurv; Nikfarjam, Azadeh; Gonzalez, Graciela |
author_sort | Pimpalkhute, Pranoti |
collection | PubMed |
description | Social media postings are rich in information that often remains hidden and inaccessible for automatic extraction due to inherent limitations of the site’s APIs, which mostly limit access via specific keyword-based searches (and limit both the number of keywords and the number of postings that are returned). When mining social media for drug mentions, one of the first problems to solve is how to derive a list of variants of the drug name (common misspellings) that can capture a sufficient number of postings. We present here an approach that filters the potential variants based on the intuition that, faced with the task of writing an unfamiliar, complex word (the drug name), users will tend to revert to phonetic spelling, and we thus give preference to variants that reflect the phonemes of the correct spelling. The algorithm allowed us to capture 50.4–56.0% of the user comments using only about 18% of the variants. |
format | Online Article Text |
id | pubmed-4333687 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | American Medical Informatics Association |
record_format | MEDLINE/PubMed |
spelling | pubmed-43336872015-02-25 Phonetic Spelling Filter for Keyword Selection in Drug Mention Mining from Social Media Pimpalkhute, Pranoti Patki, Apurv Nikfarjam, Azadeh Gonzalez, Graciela AMIA Jt Summits Transl Sci Proc Articles Social media postings are rich in information that often remain hidden and inaccessible for automatic extraction due to inherent limitations of the site’s APIs, which mostly limit access via specific keyword-based searches (and limit both the number of keywords and the number of postings that are returned). When mining social media for drug mentions, one of the first problems to solve is how to derive a list of variants of the drug name (common misspellings) that can capture a sufficient number of postings. We present here an approach that filters the potential variants based on the intuition that, faced with the task of writing an unfamiliar, complex word (the drug name), users will tend to revert to phonetic spelling, and we thus give preference to variants that reflect the phonemes of the correct spelling. The algorithm allowed us to capture 50.4 – 56.0 % of the user comments using only about 18% of the variants. American Medical Informatics Association 2014-04-07 /pmc/articles/PMC4333687/ /pubmed/25717407 Text en ©2014 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose |
spellingShingle | Articles Pimpalkhute, Pranoti Patki, Apurv Nikfarjam, Azadeh Gonzalez, Graciela Phonetic Spelling Filter for Keyword Selection in Drug Mention Mining from Social Media |
title | Phonetic Spelling Filter for Keyword Selection in Drug Mention Mining from Social Media |
title_full | Phonetic Spelling Filter for Keyword Selection in Drug Mention Mining from Social Media |
title_fullStr | Phonetic Spelling Filter for Keyword Selection in Drug Mention Mining from Social Media |
title_full_unstemmed | Phonetic Spelling Filter for Keyword Selection in Drug Mention Mining from Social Media |
title_short | Phonetic Spelling Filter for Keyword Selection in Drug Mention Mining from Social Media |
title_sort | phonetic spelling filter for keyword selection in drug mention mining from social media |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4333687/ https://www.ncbi.nlm.nih.gov/pubmed/25717407 |
work_keys_str_mv | AT pimpalkhutepranoti phoneticspellingfilterforkeywordselectionindrugmentionminingfromsocialmedia AT patkiapurv phoneticspellingfilterforkeywordselectionindrugmentionminingfromsocialmedia AT nikfarjamazadeh phoneticspellingfilterforkeywordselectionindrugmentionminingfromsocialmedia AT gonzalezgraciela phoneticspellingfilterforkeywordselectionindrugmentionminingfromsocialmedia |
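
The filtering idea in the description (generate candidate misspellings of a drug name, then prefer those that sound like the correct spelling) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not say how variants were generated or which phonetic encoding was used, so edit-distance-1 variants and classic Soundex are stand-in assumptions here. The function names `soundex`, `edit1_variants`, and `phonetic_filter`, and the example drug name, are hypothetical.

```python
import string

# Classic Soundex digit groups (an assumed stand-in for the paper's phonetic model).
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex key of a word ('' for empty input)."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    encoded = []
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    return (word[0].upper() + "".join(encoded) + "000")[:4]


def edit1_variants(word: str) -> set:
    """All strings one deletion, transposition, substitution, or insertion away."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in letters}
    inserts = {a + c + b for a, b in splits for c in letters}
    return (deletes | transposes | replaces | inserts) - {word}


def phonetic_filter(drug_name: str) -> list:
    """Keep only the variants whose phonetic key matches the correct spelling."""
    name = drug_name.lower()
    key = soundex(name)
    return sorted(v for v in edit1_variants(name) if soundex(v) == key)


if __name__ == "__main__":
    kept = phonetic_filter("seroquel")  # illustrative drug name
    print(len(kept), "of", len(edit1_variants("seroquel")), "variants kept")
```

Soundex is deliberately coarse and always preserves the first letter, so a sketch like this discards variants that change the initial character even when they sound alike; a finer-grained phoneme-based encoding would behave differently.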