Distant Supervision with Transductive Learning for Adverse Drug Reaction Identification from Electronic Medical Records

Information extraction and knowledge discovery regarding adverse drug reactions (ADRs) from large-scale clinical texts are useful and much-needed processes. Two major difficulties of this task are the lack of domain experts for labeling examples and the intractable processing of unstructured clinical texts...


Bibliographic Details
Main Authors: Taewijit, Siriwon; Theeramunkong, Thanaruk; Ikeda, Mitsuru
Format: Online Article Text
Language: English
Published: Hindawi 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5635478/
https://www.ncbi.nlm.nih.gov/pubmed/29090077
http://dx.doi.org/10.1155/2017/7575280
author Taewijit, Siriwon
Theeramunkong, Thanaruk
Ikeda, Mitsuru
collection PubMed
description Information extraction and knowledge discovery regarding adverse drug reactions (ADRs) from large-scale clinical texts are useful and much-needed processes. Two major difficulties of this task are the lack of domain experts for labeling examples and the intractable processing of unstructured clinical texts. Most previous work has addressed these issues by applying semisupervised learning for the former and a word-based approach for the latter, but these approaches face complexity in acquiring initial labeled data and ignore the structured sequence of natural language. In this study, we propose automatic data labeling by distant supervision, where knowledge bases are exploited to assign an entity-level relation label to each drug-event pair in the texts, and we then use patterns to characterize the ADR relation. Multiple-instance learning with the expectation-maximization method is employed to estimate model parameters. The method applies transductive learning to iteratively reassign the probability of each unknown drug-event pair at training time. In experiments with 50,998 discharge summaries, we evaluate our method by varying a large number of parameters, that is, pattern types, pattern-weighting models, and initial and iterative weightings of relations for unlabeled data. Based on these evaluations, our proposed method outperforms the word-based feature for NB-EM (iEM), MILR, and TSVM, with F1 score improvements of 11.3%, 9.3%, and 6.5%, respectively.
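The labeling and reweighting loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the record gives no algorithmic detail, so the naive-Bayes-style pattern model, the smoothing constant `alpha`, the initial weight `init_unknown`, the function names, and the toy pattern strings are all assumptions made for illustration. Pairs found in the knowledge base keep a fixed weight of 1.0, and every other pair is treated as unknown and re-estimated in each E-step from the patterns in its bag of mentions.

```python
import math
from collections import defaultdict


def distant_labels(mentions, knowledge_base, init_unknown=0.5):
    """Entity-level labels: 1.0 if the (drug, event) pair is in the
    knowledge base, otherwise an initial weight for 'unknown' pairs."""
    weights = {}
    for drug, event, _pattern in mentions:
        pair = (drug, event)
        weights[pair] = 1.0 if pair in knowledge_base else init_unknown
    return weights


def em_reweight(mentions, knowledge_base, iterations=10,
                init_unknown=0.5, alpha=1.0):
    """Hypothetical EM loop: re-estimate P(ADR) for unknown pairs only."""
    weights = distant_labels(mentions, knowledge_base, init_unknown)

    # Multiple-instance view: each pair owns a bag of pattern mentions.
    bags = defaultdict(list)
    for drug, event, pattern in mentions:
        bags[(drug, event)].append(pattern)
    vocab_size = len({pattern for _, _, pattern in mentions})

    for _ in range(iterations):
        # M-step: soft pattern counts per class, weighted by pair weights.
        pos, neg = defaultdict(float), defaultdict(float)
        for pair, patterns in bags.items():
            for pattern in patterns:
                pos[pattern] += weights[pair]
                neg[pattern] += 1.0 - weights[pair]
        pos_total = sum(pos.values()) + alpha * vocab_size
        neg_total = sum(neg.values()) + alpha * vocab_size
        prior = sum(weights.values()) / len(weights)
        prior = min(max(prior, 1e-6), 1.0 - 1e-6)  # keep the logs finite

        # E-step (transductive): update only pairs outside the knowledge base.
        for pair, patterns in bags.items():
            if pair in knowledge_base:
                continue
            log_pos = math.log(prior)
            log_neg = math.log(1.0 - prior)
            for pattern in patterns:
                log_pos += math.log((pos[pattern] + alpha) / pos_total)
                log_neg += math.log((neg[pattern] + alpha) / neg_total)
            top = max(log_pos, log_neg)  # subtract the max for stability
            e_pos, e_neg = math.exp(log_pos - top), math.exp(log_neg - top)
            weights[pair] = e_pos / (e_pos + e_neg)
    return weights


# Toy data; drug names, events, and pattern strings are made up.
toy_mentions = [
    ("warfarin", "bleeding", "DRUG caused EVENT"),
    ("aspirin", "ulcer", "DRUG caused EVENT"),
    ("ibuprofen", "rash", "DRUG caused EVENT"),
    ("metformin", "diabetes", "DRUG for EVENT"),
    ("insulin", "diabetes", "DRUG for EVENT"),
]
toy_kb = {("warfarin", "bleeding"), ("aspirin", "ulcer")}
toy_weights = em_reweight(toy_mentions, toy_kb)
```

In this toy run, the unknown pair that shares a pattern with the knowledge-base pairs ends up with a higher weight than the pairs seen only with the treatment-style pattern. Because the sketch has no explicit negative examples, all unknown weights tend to drift upward, which is one reason the initial and iterative weighting of unlabeled relations matters as a parameter.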
format Online
Article
Text
id pubmed-5635478
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-5635478 2017-10-31 J Healthc Eng Research Article Hindawi 2017 2017-09-26 /pmc/articles/PMC5635478/ /pubmed/29090077 http://dx.doi.org/10.1155/2017/7575280 Text en Copyright © 2017 Siriwon Taewijit et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Distant Supervision with Transductive Learning for Adverse Drug Reaction Identification from Electronic Medical Records
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5635478/
https://www.ncbi.nlm.nih.gov/pubmed/29090077
http://dx.doi.org/10.1155/2017/7575280