
An Improved Math Word Problem (MWP) Model Using Unified Pretrained Language Model (UniLM) for Pretraining

Bibliographic Details
Main Authors: Zhang, Dongqiu; Li, Wenkui
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9303081/
https://www.ncbi.nlm.nih.gov/pubmed/35875782
http://dx.doi.org/10.1155/2022/7468286
Description
Summary: Natural Language Understanding (NLU) and Natural Language Generation (NLG) are the general methods that support machine understanding of text content. They play a very important role in text information processing systems, including recommendation and question answering systems. Many approaches have been studied in the field of NLU, such as bag-of-words, N-gram, and neural network language models, and they have achieved good performance on NLU and NLG tasks. However, they require large amounts of training data, which is often difficult to obtain in practical applications. Thus, pretraining becomes important. This paper proposes a semisupervised approach to math word problem (MWP) tasks that uses unsupervised pretraining and supervised fine-tuning, based on the Unified pretrained Language Model (UniLM). The proposed model requires less training data than traditional models, since it initializes the parameters for a new task with parameters learned on earlier tasks. In this way, prior knowledge helps the new model perform new tasks rather than learning from scratch. Moreover, to help the decoder make accurate predictions, we combine the advantages of autoregressive (AR) and autoencoding (AE) language models to support unidirectional, sequence-to-sequence, and bidirectional predictions. Experiments, carried out on MWP tasks with 20,000+ mathematical questions, show that the improved model outperforms traditional models, with a maximum accuracy of 79.57%. The impact of different experimental parameters is also studied, and we find that an incorrect arithmetic order leads to incorrect solution expression generation.
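
The combination of AR and AE language models described in the summary corresponds to UniLM's central mechanism: one shared Transformer whose prediction mode (unidirectional, sequence-to-sequence, or bidirectional) is selected entirely by the self-attention mask. Below is a minimal sketch of those three masks, assuming a convention where entry (i, j) is True when query position i may attend to key position j; the function name and shapes are illustrative, not taken from the paper.

import numpy as np

def unilm_attention_mask(seq_len, mode, src_len=None):
    # Returns a (seq_len, seq_len) boolean matrix: entry (i, j) is True
    # when query position i may attend to key position j.
    if mode == "bidirectional":
        # AE-style: every token attends to every token.
        return np.ones((seq_len, seq_len), dtype=bool)
    if mode == "unidirectional":
        # AR-style: token i attends only to tokens 0..i.
        return np.tril(np.ones((seq_len, seq_len), dtype=bool))
    if mode == "seq2seq":
        # Source tokens attend bidirectionally within the source;
        # target tokens attend to the full source and their own past.
        assert src_len is not None and 0 < src_len <= seq_len
        mask = np.zeros((seq_len, seq_len), dtype=bool)
        mask[:, :src_len] = True  # every position sees the source
        tgt = seq_len - src_len
        mask[src_len:, src_len:] = np.tril(np.ones((tgt, tgt), dtype=bool))
        return mask
    raise ValueError(f"unknown mode: {mode}")

# Example: a 3-token problem statement followed by a 2-token expression prefix.
print(unilm_attention_mask(5, "seq2seq", src_len=3).astype(int))

For an MWP task, the problem text would play the source role and the generated solution expression the target role, so the seq2seq mask is the one that governs decoding.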