Active Learning for Node Classification: An Evaluation

Current breakthroughs in the field of machine learning are fueled by the deployment of deep neural network models. Deep neural network models are notorious for their dependence on large amounts of labeled training data. Active learning is being used as a solution to train classification mo...

Bibliographic Details
Main authors: Madhawa, Kaushalya; Murata, Tsuyoshi
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7597335/
https://www.ncbi.nlm.nih.gov/pubmed/33286933
http://dx.doi.org/10.3390/e22101164
author Madhawa, Kaushalya
Murata, Tsuyoshi
collection PubMed
description Current breakthroughs in the field of machine learning are fueled by the deployment of deep neural network models. Deep neural network models are notorious for their dependence on large amounts of labeled training data. Active learning is used to train classification models with fewer labeled instances by selecting only the most informative instances for labeling. This is especially important when labeled data are scarce or the labeling process is expensive. In this paper, we study the application of active learning to attributed graphs. In this setting, the data instances are represented as nodes of an attributed graph. Graph neural networks achieve the current state-of-the-art classification performance on attributed graphs. The performance of graph neural networks relies on the careful tuning of their hyperparameters, usually performed using a validation set, an additional set of labeled instances. In label-scarce problems, it is realistic to use all labeled instances for training the model. In this setting, we perform a fair comparison of the existing active learning algorithms proposed for graph neural networks as well as for other data types such as images and text. With empirical results, we demonstrate that state-of-the-art active learning algorithms designed for other data types do not perform well on graph-structured data. We study the problem within the framework of the exploration-vs.-exploitation trade-off and propose a new count-based exploration term. With empirical evidence on multiple benchmark graphs, we highlight the importance of complementing uncertainty-based active learning models with an exploration term.
format Online
Article
Text
id pubmed-7597335
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7597335 2020-11-09 Active Learning for Node Classification: An Evaluation Madhawa, Kaushalya Murata, Tsuyoshi Entropy (Basel) Article MDPI 2020-10-16 /pmc/articles/PMC7597335/ /pubmed/33286933 http://dx.doi.org/10.3390/e22101164 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Active Learning for Node Classification: An Evaluation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7597335/
https://www.ncbi.nlm.nih.gov/pubmed/33286933
http://dx.doi.org/10.3390/e22101164