
G2Basy: A framework to improve the RNN language model and ease overfitting problem

Recurrent neural networks are efficient ways of training language models, and various RNN networks have been proposed to improve performance. However, with the increase of network scales, the overfitting problem becomes more urgent. In this paper, we propose a framework—G2Basy—to speed up the traini...


Bibliographic Details
Main Authors:	Yuwen, Lu; Chen, Shuyu; Yuan, Xiaohan
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2021
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8046238/ https://www.ncbi.nlm.nih.gov/pubmed/33852595 http://dx.doi.org/10.1371/journal.pone.0249820

author	Yuwen, Lu; Chen, Shuyu; Yuan, Xiaohan
collection	PubMed
description	Recurrent neural networks are efficient ways of training language models, and various RNN networks have been proposed to improve performance. However, with the increase of network scales, the overfitting problem becomes more urgent. In this paper, we propose a framework—G2Basy—to speed up the training process and ease the overfitting problem. Instead of using predefined hyperparameters, we devise a gradient increasing and decreasing technique that changes the parameters training batch size and input dropout simultaneously by a user-defined step size. Together with a pretrained word embedding initialization procedure and the introduction of different optimizers at different learning rates, our framework speeds up the training process dramatically and improves performance compared with a benchmark model of the same scale. For the word embedding initialization, we propose the concept of “artificial features” to describe the characteristics of the obtained word embeddings. We experiment on two of the most often used corpora—the Penn Treebank and WikiText-2 datasets—and both outperform the benchmark results and show potential towards further improvement. Furthermore, our framework shows better results with the larger and more complicated WikiText-2 corpus than with the Penn Treebank. Compared with other state-of-the-art results, we achieve comparable results with network scales hundreds of times smaller and within fewer training epochs.
format	Online Article Text
id	pubmed-8046238
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-8046238 2021-04-21 PLoS One Research Article Public Library of Science 2021-04-14 /pmc/articles/PMC8046238/ /pubmed/33852595 http://dx.doi.org/10.1371/journal.pone.0249820 Text en © 2021 Yuwen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	G2Basy: A framework to improve the RNN language model and ease overfitting problem
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8046238/ https://www.ncbi.nlm.nih.gov/pubmed/33852595 http://dx.doi.org/10.1371/journal.pone.0249820


Similar Items