Filtering Entities to Optimize Identification of Adverse Drug Reaction From Social Media: How Can the Number of Words Between Entities in the Messages Help?

BACKGROUND: With the increasing popularity of Web 2.0 applications, social media has made it possible for individuals to post messages on adverse drug reactions. In such online conversations, patients discuss their symptoms, medical history, and diseases. These disorders may correspond to adverse dr...

Bibliographic Details
Main authors: Abdellaoui, Redhouane, Schück, Stéphane, Texier, Nathalie, Burgun, Anita
Format: Online Article Text
Language: English
Published: JMIR Publications 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5500778/
https://www.ncbi.nlm.nih.gov/pubmed/28642212
http://dx.doi.org/10.2196/publichealth.6577
_version_ 1783248700199403520
author Abdellaoui, Redhouane
Schück, Stéphane
Texier, Nathalie
Burgun, Anita
collection PubMed
description BACKGROUND: With the increasing popularity of Web 2.0 applications, social media has made it possible for individuals to post messages on adverse drug reactions. In such online conversations, patients discuss their symptoms, medical history, and diseases. These disorders may correspond to adverse drug reactions (ADRs) or any other medical condition. Therefore, methods must be developed to distinguish between false positives and true ADR declarations. OBJECTIVE: The aim of this study was to investigate a method for filtering out disorder terms that did not correspond to adverse events by using the distance (as number of words) between the drug term and the disorder or symptom term in the post. We hypothesized that the shorter the distance between the disorder name and the drug, the higher the probability that the disorder is an ADR. METHODS: We analyzed a corpus of 648 messages corresponding to a total of 1654 (drug and disorder) pairs from 5 French forums using Gaussian mixture models and an expectation-maximization (EM) algorithm. RESULTS: The distribution of the distances between the drug term and the disorder term enabled the filtering of 50.03% (733/1465) of the disorders that were not ADRs. Our filtering strategy achieved a precision of 95.8% and a recall of 50.0%. CONCLUSIONS: This study suggests that such distance between terms can be used for identifying false positives, thereby improving ADR detection in social media.
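The approach summarized in the abstract (a word-count distance between a drug term and a disorder term, modeled with a Gaussian mixture fitted by EM) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `word_distance` and `fit_two_gaussians`, the restriction to single-word terms, the choice of two mixture components, and the toy distances are all assumptions made for illustration; the abstract does not specify how distance was tokenized or how many components were used.

```python
import math
import re


def word_distance(message, drug, disorder):
    """Count the words strictly between the first occurrence of each term.

    Assumption: both terms are single words; returns None if either is absent.
    """
    tokens = re.findall(r"\w+", message.lower())
    try:
        i = tokens.index(drug.lower())
        j = tokens.index(disorder.lower())
    except ValueError:
        return None
    return max(abs(i - j) - 1, 0)


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_gaussians(values, iterations=100):
    """Fit a two-component 1D Gaussian mixture with plain EM (illustrative only)."""
    n = len(values)
    overall_mean = sum(values) / n
    var0 = max(sum((v - overall_mean) ** 2 for v in values) / n, 1e-6)
    means = [min(values), max(values)]  # start the components at the extremes
    variances = [var0, var0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each observation
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk,
                1e-6,
            )
    return weights, means, variances


# Toy data (hypothetical, not from the study): short and long distances.
toy_distances = [1, 2, 2, 3, 3, 4, 20, 22, 25, 27, 30]
weights, means, variances = fit_two_gaussians(toy_distances)
example = word_distance(
    "I took aspirin yesterday and then got a headache", "aspirin", "headache"
)  # 5 words lie between the two terms
```

Under the study's hypothesis, pairs whose distance falls in the component with the larger mean would be candidates for filtering as non-ADR mentions; where to place the cutoff is a separate decision that this sketch does not make.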
format Online
Article
Text
id pubmed-5500778
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-5500778 2017-07-26 Filtering Entities to Optimize Identification of Adverse Drug Reaction From Social Media: How Can the Number of Words Between Entities in the Messages Help? Abdellaoui, Redhouane Schück, Stéphane Texier, Nathalie Burgun, Anita JMIR Public Health Surveill Original Paper BACKGROUND: With the increasing popularity of Web 2.0 applications, social media has made it possible for individuals to post messages on adverse drug reactions. In such online conversations, patients discuss their symptoms, medical history, and diseases. These disorders may correspond to adverse drug reactions (ADRs) or any other medical condition. Therefore, methods must be developed to distinguish between false positives and true ADR declarations. OBJECTIVE: The aim of this study was to investigate a method for filtering out disorder terms that did not correspond to adverse events by using the distance (as number of words) between the drug term and the disorder or symptom term in the post. We hypothesized that the shorter the distance between the disorder name and the drug, the higher the probability that the disorder is an ADR. METHODS: We analyzed a corpus of 648 messages corresponding to a total of 1654 (drug and disorder) pairs from 5 French forums using Gaussian mixture models and an expectation-maximization (EM) algorithm. RESULTS: The distribution of the distances between the drug term and the disorder term enabled the filtering of 50.03% (733/1465) of the disorders that were not ADRs. Our filtering strategy achieved a precision of 95.8% and a recall of 50.0%. CONCLUSIONS: This study suggests that such distance between terms can be used for identifying false positives, thereby improving ADR detection in social media. JMIR Publications 2017-06-22 /pmc/articles/PMC5500778/ /pubmed/28642212 http://dx.doi.org/10.2196/publichealth.6577 Text en ©Redhouane Abdellaoui, Stéphane Schück, Nathalie Texier, Anita Burgun.
Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 22.06.2017. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.
title Filtering Entities to Optimize Identification of Adverse Drug Reaction From Social Media: How Can the Number of Words Between Entities in the Messages Help?
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5500778/
https://www.ncbi.nlm.nih.gov/pubmed/28642212
http://dx.doi.org/10.2196/publichealth.6577