Automatic Identification of Messages Related to Adverse Drug Reactions from Online User Reviews using Feature-based Classification

BACKGROUND: User-generated medical messages on the Internet contain extensive information about adverse drug reactions (ADRs) and are a valuable resource for post-marketing drug surveillance. The aim of this study was to find an effective method to identify messages related to ADRs automati...

Bibliographic Details
Main Authors: LIU, Jingfang, ZHANG, Pengzhu, LU, Yingjie
Format: Online Article Text
Language: English
Published: Tehran University of Medical Sciences 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4449501/
https://www.ncbi.nlm.nih.gov/pubmed/26060719
_version_ 1782373862568099840
author LIU, Jingfang
ZHANG, Pengzhu
LU, Yingjie
author_facet LIU, Jingfang
ZHANG, Pengzhu
LU, Yingjie
author_sort LIU, Jingfang
collection PubMed
description BACKGROUND: User-generated medical messages on the Internet contain extensive information about adverse drug reactions (ADRs) and are a valuable resource for post-marketing drug surveillance. The aim of this study was to find an effective method for automatically identifying ADR-related messages in online user reviews. METHODS: We conducted experiments on online user reviews using different feature sets and different classification techniques. First, messages were collected from three communities (allergy, schizophrenia, and pain management), and 3000 messages were annotated. Second, an n-gram-based feature set and a medical domain-specific feature set were generated. Third, three classification techniques, SVM, C4.5 and Naïve Bayes, were each used to perform the classification task. Finally, we evaluated each combination of feature set and classification technique by comparing accuracy and F-measure. RESULTS: The accuracy of the SVM classifier was higher than 0.8, whereas the accuracy of the C4.5 and Naïve Bayes classifiers was lower than 0.8; the combined feature set (n-gram-based plus domain-specific) consistently outperformed either single feature set. The highest F-measure, 0.895, was achieved using the combined feature set and an SVM classifier. Overall, the combined feature set with an SVM classifier gave the best classification performance. CONCLUSION: Combining the n-gram-based and domain-specific feature sets with an SVM classifier provides an effective method for automatically identifying ADR-related messages in online user reviews.
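The feature construction and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer, the `ADR_LEXICON` terms, the `ng:`/`dom:` feature names, and all function names are assumptions made for illustration, and no classifier is trained here. In practice the resulting feature counts would be passed to an SVM, C4.5, or Naïve Bayes learner.

```python
from collections import Counter

# Hypothetical mini-lexicon standing in for a medical domain-specific
# resource; the record does not list the terms the study actually used.
ADR_LEXICON = {"rash", "nausea", "dizziness", "headache", "drowsiness"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    stripped = (t.strip(".,!?;:()\"'") for t in text.lower().split())
    return [t for t in stripped if t]


def ngram_features(text, n_max=2):
    """Count word n-grams from unigrams up to n_max-grams."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts["ng:" + " ".join(tokens[i:i + n])] += 1
    return counts


def domain_features(text, lexicon=ADR_LEXICON):
    """Count tokens that match the (assumed) ADR lexicon."""
    hits = sum(1 for t in tokenize(text) if t in lexicon)
    return Counter({"dom:lexicon_hits": hits})


def combined_features(text):
    """Merge the n-gram and domain-specific feature counts."""
    feats = ngram_features(text)
    feats.update(domain_features(text))
    return feats


def accuracy_and_f_measure(y_true, y_pred, positive=1):
    """Return (accuracy, F-measure) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f_measure
```

Here the F-measure is the balanced harmonic mean of precision and recall, which is the usual reading of the term; the record does not state which weighting the study used.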
format Online
Article
Text
id pubmed-4449501
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Tehran University of Medical Sciences
record_format MEDLINE/PubMed
spelling pubmed-4449501 2015-06-09 Automatic Identification of Messages Related to Adverse Drug Reactions from Online User Reviews using Feature-based Classification LIU, Jingfang ZHANG, Pengzhu LU, Yingjie Iran J Public Health Original Article Tehran University of Medical Sciences 2014-11 /pmc/articles/PMC4449501/ /pubmed/26060719 Text en Copyright © Iranian Public Health Association & Tehran University of Medical Sciences This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License, which allows users to read, copy, distribute and make derivative works for non-commercial purposes from the material, as long as the author of the original work is cited properly.
spellingShingle Original Article
LIU, Jingfang
ZHANG, Pengzhu
LU, Yingjie
Automatic Identification of Messages Related to Adverse Drug Reactions from Online User Reviews using Feature-based Classification
title Automatic Identification of Messages Related to Adverse Drug Reactions from Online User Reviews using Feature-based Classification
title_sort automatic identification of messages related to adverse drug reactions from online user reviews using feature-based classification
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4449501/
https://www.ncbi.nlm.nih.gov/pubmed/26060719
work_keys_str_mv AT liujingfang automaticidentificationofmessagesrelatedtoadversedrugreactionsfromonlineuserreviewsusingfeaturebasedclassification
AT zhangpengzhu automaticidentificationofmessagesrelatedtoadversedrugreactionsfromonlineuserreviewsusingfeaturebasedclassification
AT luyingjie automaticidentificationofmessagesrelatedtoadversedrugreactionsfromonlineuserreviewsusingfeaturebasedclassification