Using distant supervision to augment manually annotated data for relation extraction

Significant progress has recently been made in applying deep learning to natural language processing tasks. However, deep learning models typically require a large amount of annotated training data, while only small labeled datasets are available for many natural language processing tasks in bi...

Bibliographic Details
Main authors: Su, Peng; Li, Gang; Wu, Cathy; Vijay-Shanker, K.
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6667146/
https://www.ncbi.nlm.nih.gov/pubmed/31361753
http://dx.doi.org/10.1371/journal.pone.0216913
author Su, Peng
Li, Gang
Wu, Cathy
Vijay-Shanker, K.
collection PubMed
description Significant progress has recently been made in applying deep learning to natural language processing tasks. However, deep learning models typically require a large amount of annotated training data, while only small labeled datasets are available for many natural language processing tasks in biomedical literature. Building large datasets for deep learning is expensive, since it involves considerable human effort and usually requires domain expertise in specialized fields. In this work, we consider augmenting manually annotated data with large amounts of data obtained using distant supervision. Because data obtained by distant supervision is often noisy, we first apply heuristics to remove some of the incorrect annotations. Then, using methods inspired by transfer learning, we show that the resulting models outperform models trained on the original manually annotated sets.
format Online
Article
Text
id pubmed-6667146
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6667146 2019-08-07 Using distant supervision to augment manually annotated data for relation extraction. Su, Peng; Li, Gang; Wu, Cathy; Vijay-Shanker, K. PLoS One. Research Article. Public Library of Science, 2019-07-30. /pmc/articles/PMC6667146/ /pubmed/31361753 http://dx.doi.org/10.1371/journal.pone.0216913 Text en. © 2019 Su et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Using distant supervision to augment manually annotated data for relation extraction
topic Research Article