Rule-Enhanced Active Learning for Semi-Automated Weak Supervision

A major bottleneck preventing the extension of deep learning systems to new domains is the prohibitive cost of acquiring sufficient training labels. Alternatives such as weak supervision, active learning, and fine-tuning of pretrained models reduce this burden but require substantial human input to...

Bibliographic Details
Main Authors:	Kartchner, David; Nakajima An, Davi; Ren, Wendi; Zhang, Chao; Mitchell, Cassie S.
Format:	Online Article Text
Language:	English
Published:	2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9281613/ https://www.ncbi.nlm.nih.gov/pubmed/35845102 http://dx.doi.org/10.3390/ai3010013

_version_	1784746918946013184
author	Kartchner, David; Nakajima An, Davi; Ren, Wendi; Zhang, Chao; Mitchell, Cassie S.
author_sort	Kartchner, David
collection	PubMed
description	A major bottleneck preventing the extension of deep learning systems to new domains is the prohibitive cost of acquiring sufficient training labels. Alternatives such as weak supervision, active learning, and fine-tuning of pretrained models reduce this burden but require substantial human input to select a highly informative subset of instances or to curate labeling functions. REGAL (Rule-Enhanced Generative Active Learning) is an improved framework for weakly supervised text classification that performs active learning over labeling functions rather than individual instances. REGAL interactively creates high-quality labeling patterns from raw text, enabling a single annotator to accurately label an entire dataset after initialization with three keywords for each class. Experiments demonstrate that REGAL extracts up to 3 times as many high-accuracy labeling functions from text as current state-of-the-art methods for interactive weak supervision, enabling REGAL to dramatically reduce the annotation burden of writing labeling functions for weak supervision. Statistical analysis reveals REGAL performs equal or significantly better than interactive weak supervision for five of six commonly used natural language processing (NLP) baseline datasets.
format	Online Article Text
id	pubmed-9281613
institution	National Center for Biotechnology Information
language	English
publishDate	2022
record_format	MEDLINE/PubMed
spelling	pubmed-9281613 2022-07-14 Artif Intell Article 2022-03 2022-03-16 Text en https://creativecommons.org/licenses/by/4.0/ This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Rule-Enhanced Active Learning for Semi-Automated Weak Supervision
topic	Article