Filtering Entities to Optimize Identification of Adverse Drug Reaction From Social Media: How Can the Number of Words Between Entities in the Messages Help?
BACKGROUND: With the increasing popularity of Web 2.0 applications, social media has made it possible for individuals to post messages on adverse drug reactions. In such online conversations, patients discuss their symptoms, medical history, and diseases. These disorders may correspond to adverse dr...
Main authors: | Abdellaoui, Redhouane; Schück, Stéphane; Texier, Nathalie; Burgun, Anita |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5500778/ https://www.ncbi.nlm.nih.gov/pubmed/28642212 http://dx.doi.org/10.2196/publichealth.6577 |
_version_ | 1783248700199403520 |
---|---|
author | Abdellaoui, Redhouane Schück, Stéphane Texier, Nathalie Burgun, Anita |
author_sort | Abdellaoui, Redhouane |
collection | PubMed |
description | BACKGROUND: With the increasing popularity of Web 2.0 applications, social media has made it possible for individuals to post messages on adverse drug reactions. In such online conversations, patients discuss their symptoms, medical history, and diseases. These disorders may correspond to adverse drug reactions (ADRs) or any other medical condition. Therefore, methods must be developed to distinguish between false positives and true ADR declarations. OBJECTIVE: The aim of this study was to investigate a method for filtering out disorder terms that did not correspond to adverse events by using the distance (as number of words) between the drug term and the disorder or symptom term in the post. We hypothesized that the shorter the distance between the disorder name and the drug, the higher the probability to be an ADR. METHODS: We analyzed a corpus of 648 messages corresponding to a total of 1654 (drug and disorder) pairs from 5 French forums using Gaussian mixture models and an expectation-maximization (EM) algorithm. RESULTS: The distribution of the distances between the drug term and the disorder term enabled the filtering of 50.03% (733/1465) of the disorders that were not ADRs. Our filtering strategy achieved a precision of 95.8% and a recall of 50.0%. CONCLUSIONS: This study suggests that such distance between terms can be used for identifying false positives, thereby improving ADR detection in social media. |
format | Online Article Text |
id | pubmed-5500778 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-5500778 2017-07-26 Filtering Entities to Optimize Identification of Adverse Drug Reaction From Social Media: How Can the Number of Words Between Entities in the Messages Help? Abdellaoui, Redhouane Schück, Stéphane Texier, Nathalie Burgun, Anita JMIR Public Health Surveill Original Paper BACKGROUND: With the increasing popularity of Web 2.0 applications, social media has made it possible for individuals to post messages on adverse drug reactions. In such online conversations, patients discuss their symptoms, medical history, and diseases. These disorders may correspond to adverse drug reactions (ADRs) or any other medical condition. Therefore, methods must be developed to distinguish between false positives and true ADR declarations. OBJECTIVE: The aim of this study was to investigate a method for filtering out disorder terms that did not correspond to adverse events by using the distance (as number of words) between the drug term and the disorder or symptom term in the post. We hypothesized that the shorter the distance between the disorder name and the drug, the higher the probability to be an ADR. METHODS: We analyzed a corpus of 648 messages corresponding to a total of 1654 (drug and disorder) pairs from 5 French forums using Gaussian mixture models and an expectation-maximization (EM) algorithm. RESULTS: The distribution of the distances between the drug term and the disorder term enabled the filtering of 50.03% (733/1465) of the disorders that were not ADRs. Our filtering strategy achieved a precision of 95.8% and a recall of 50.0%. CONCLUSIONS: This study suggests that such distance between terms can be used for identifying false positives, thereby improving ADR detection in social media. JMIR Publications 2017-06-22 /pmc/articles/PMC5500778/ /pubmed/28642212 http://dx.doi.org/10.2196/publichealth.6577 Text en ©Redhouane Abdellaoui, Stéphane Schück, Nathalie Texier, Anita Burgun. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 22.06.2017. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Abdellaoui, Redhouane Schück, Stéphane Texier, Nathalie Burgun, Anita Filtering Entities to Optimize Identification of Adverse Drug Reaction From Social Media: How Can the Number of Words Between Entities in the Messages Help? |
title | Filtering Entities to Optimize Identification of Adverse Drug Reaction From Social Media: How Can the Number of Words Between Entities in the Messages Help? |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5500778/ https://www.ncbi.nlm.nih.gov/pubmed/28642212 http://dx.doi.org/10.2196/publichealth.6577 |
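
The method summarized in the description (word distance between a drug term and a disorder term, modeled with Gaussian mixtures fitted by EM) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state the tokenization, the number of mixture components, or the decision threshold, so the two-component model, the 0.5 cutoff, the single-word term matching, and every function name below are assumptions made only for illustration. The sample distances are invented toy values, not data from the study.

```python
import math
import re


def word_distance(text, drug, disorder):
    """Count the words between the first occurrence of each single-word term.

    Returns None if either term is absent. Tokenization by \\w+ is an assumption.
    """
    tokens = re.findall(r"\w+", text.lower())
    try:
        i, j = tokens.index(drug.lower()), tokens.index(disorder.lower())
    except ValueError:
        return None
    return max(abs(i - j) - 1, 0)


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_gaussians(xs, iterations=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM (assumed setup)."""
    xs = sorted(xs)
    n, half = len(xs), len(xs) // 2
    # Initialize means from the lower and upper halves of the sorted data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    overall_mean = sum(xs) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in xs) / n, 1e-6)
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)
    return weights, means, variances


def posterior_short(x, params):
    """Posterior probability that distance x belongs to the short-distance component."""
    weights, means, variances = params
    p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(p) or 1e-300
    short = 0 if means[0] <= means[1] else 1
    return p[short] / total


# Toy distances (invented): a short-distance group and a long-distance group.
toy_distances = [1, 2, 2, 3, 3, 4, 20, 22, 25, 27, 30, 31]
params = fit_two_gaussians(toy_distances)
# Hypothetical rule: keep a (drug, disorder) pair as a candidate ADR
# when the short-distance component is more likely than not.
keep_pair = posterior_short(2, params) > 0.5
```

Under these assumptions, a pair whose terms sit close together is kept as a candidate ADR, and a pair whose terms are far apart is filtered out as a probable false positive, which mirrors the hypothesis stated in the description.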