Boosting Knowledge Base Automatically via Few-Shot Relation Classification

Relation classification (RC) aims to extract structured information, i.e., triplets of two entities and a relation, from free text, which is pivotal for automatic knowledge base construction. In this paper, we investigate a fully automatic method for training an RC model that helps boost t...

Bibliographic Details
Main Authors: Pang, Ning; Tan, Zhen; Xu, Hao; Xiao, Weidong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7652790/
https://www.ncbi.nlm.nih.gov/pubmed/33192439
http://dx.doi.org/10.3389/fnbot.2020.584192
_version_ 1783607764999733248
author Pang, Ning
Tan, Zhen
Xu, Hao
Xiao, Weidong
author_facet Pang, Ning
Tan, Zhen
Xu, Hao
Xiao, Weidong
author_sort Pang, Ning
collection PubMed
description Relation classification (RC) aims to extract structured information, i.e., triplets of two entities and a relation, from free text, which is pivotal for automatic knowledge base construction. In this paper, we investigate a fully automatic method for training an RC model that helps boost the knowledge base. Traditional RC models cannot extract relations unseen during training because they define RC as a multiclass classification problem. Recent developments in few-shot learning (FSL) provide a feasible way to accommodate new relation types with only a handful of examples. However, learning a promising few-shot RC model still requires a moderately large amount of training data, which demands expensive human labor. This issue motivates the use of a weak supervision method, dubbed distant supervision (DS), which can generate training data automatically. To this end, we propose to investigate the task of few-shot relation classification under distant supervision. Because DS naturally introduces mislabeled training instances, we alleviate their negative impact by incorporating various multiple instance learning methods into the classic prototypical networks, which achieves sentence-level noise reduction. In experiments, we evaluate our proposed model under the standard N-way K-shot setting of few-shot learning. The experimental results show that our proposal achieves better performance.
format Online
Article
Text
id pubmed-7652790
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-76527902020-11-13 Boosting Knowledge Base Automatically via Few-Shot Relation Classification Pang, Ning Tan, Zhen Xu, Hao Xiao, Weidong Front Neurorobot Neuroscience Relation classification (RC) aims at extracting structural information, i.e., triplets of two entities with a relation, from free texts, which is pivotal for automatic knowledge base construction. In this paper, we investigate a fully automatic method to train a RC model which facilitates to boost the knowledge base. Traditional RC models cannot extract new relations unseen during training since they define RC as a multiclass classification problem. The recent development of few-shot learning (FSL) provides a feasible way to accommodate to fresh relation types with a handful of examples. However, it requires a moderately large amount of training data to learn a promising few-shot RC model, which consumes expensive human labor. This issue recalls a kind of weak supervision methods, dubbed distant supervision (DS), which can generate the training data automatically. To this end, we propose to investigate the task of few-shot relation classification under distant supervision. As DS naturally brings in mislabeled training instances, to alleviate the negative impact, we incorporate various multiple instance learning methods into the classic prototypical networks, which can achieve sentence-level noise reduction. In experiments, we evaluate our proposed model under the standard N-way K-shot setting of few-shot learning. The experiment results show that our proposal achieves better performance. Frontiers Media S.A. 2020-10-27 /pmc/articles/PMC7652790/ /pubmed/33192439 http://dx.doi.org/10.3389/fnbot.2020.584192 Text en Copyright © 2020 Pang, Tan, Xu and Xiao. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). 
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Neuroscience
Pang, Ning
Tan, Zhen
Xu, Hao
Xiao, Weidong
Boosting Knowledge Base Automatically via Few-Shot Relation Classification
title Boosting Knowledge Base Automatically via Few-Shot Relation Classification
title_full Boosting Knowledge Base Automatically via Few-Shot Relation Classification
title_fullStr Boosting Knowledge Base Automatically via Few-Shot Relation Classification
title_full_unstemmed Boosting Knowledge Base Automatically via Few-Shot Relation Classification
title_short Boosting Knowledge Base Automatically via Few-Shot Relation Classification
title_sort boosting knowledge base automatically via few-shot relation classification
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7652790/
https://www.ncbi.nlm.nih.gov/pubmed/33192439
http://dx.doi.org/10.3389/fnbot.2020.584192
work_keys_str_mv AT pangning boostingknowledgebaseautomaticallyviafewshotrelationclassification
AT tanzhen boostingknowledgebaseautomaticallyviafewshotrelationclassification
AT xuhao boostingknowledgebaseautomaticallyviafewshotrelationclassification
AT xiaoweidong boostingknowledgebaseautomaticallyviafewshotrelationclassification