Part-of-Speech tagging enhancement to natural language processing for Thai wh-question classification with deep learning

Question classification is a crucial task for answer selection. It can help define the structure of question sentences through features extracted from a sentence, such as who, when, where, and how. In this paper, we propose a methodology to improve question classificati...

Bibliographic Details
Main Authors:	Chotirat, Saranlita; Meesad, Phayung
Format:	Online Article Text
Language:	English
Published:	Elsevier 2021
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8554172/ https://www.ncbi.nlm.nih.gov/pubmed/34746470 http://dx.doi.org/10.1016/j.heliyon.2021.e08216

_version_	1784591737259294720
author	Chotirat, Saranlita Meesad, Phayung
collection	PubMed
description	Question classification is a crucial task for answer selection. It can help define the structure of question sentences through features extracted from a sentence, such as who, when, where, and how. In this paper, we propose a methodology to improve question classification from texts by using feature selection and word embedding techniques. We conducted several experiments to evaluate the proposed methodology on two datasets (the TREC-6 dataset and a Thai sentence dataset), using term frequency and combined term frequency-inverse document frequency features over Unigram, Unigram+Bigram, and Unigram+Trigram. Both traditional machine learning and deep learning classifiers were used. The traditional classification models were Multinomial Naïve Bayes, Logistic Regression, and Support Vector Machine (SVM). The deep learning techniques were Bidirectional Long Short-Term Memory (BiLSTM), Convolutional Neural Networks (CNN), and a hybrid model that combined the CNN and BiLSTM models. The experimental results showed that the methodology based on Part-of-Speech (POS) tagging improved question classification accuracy the most. Question categories were classified with an average micro F1-score of 0.98 when the SVM model was applied with all POS tags added on the TREC-6 dataset. The highest average micro F1-score was 0.8 when GloVe was applied with the CNN model and focusing tags were added on the Thai sentence dataset.
format	Online Article Text
id	pubmed-8554172
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Elsevier
record_format	MEDLINE/PubMed
spelling	pubmed-8554172 2021-11-05 Part-of-Speech tagging enhancement to natural language processing for Thai wh-question classification with deep learning Chotirat, Saranlita Meesad, Phayung Heliyon Research Article Elsevier 2021-10-19 /pmc/articles/PMC8554172/ /pubmed/34746470 http://dx.doi.org/10.1016/j.heliyon.2021.e08216 Text en © 2021 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title	Part-of-Speech tagging enhancement to natural language processing for Thai wh-question classification with deep learning
topic	Research Article
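
The feature setup named in the description (term frequency and TF-IDF over unigrams and bigrams, with POS tags added as extra features) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `POS=` prefix, the hand-assigned tags, and the particular IDF variant (`log(N/df) + 1`) are all assumptions made for illustration, and the classifier step is omitted.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(tokens, pos_tags, max_n=2):
    """Term-frequency features: 1..max_n-grams plus POS tags as extra terms.

    The "POS=" prefix is a hypothetical convention that keeps tag features
    from colliding with word features.
    """
    terms = []
    for n in range(1, max_n + 1):
        terms.extend(ngrams(tokens, n))
    terms.extend(f"POS={tag}" for tag in pos_tags)
    return Counter(terms)


def tfidf(docs):
    """Reweight term-frequency dicts by idf = log(N / df) + 1 (one common variant)."""
    n_docs = len(docs)
    # Iterating a Counter yields each distinct term once, so this counts documents.
    df = Counter(term for doc in docs for term in doc)
    return [
        {term: tf * (math.log(n_docs / df[term]) + 1.0) for term, tf in doc.items()}
        for doc in docs
    ]


# Toy questions with hand-assigned, Penn Treebank-style tags (illustrative only).
questions = [
    (["who", "wrote", "hamlet"], ["WP", "VBD", "NNP"]),
    (["where", "is", "paris"], ["WRB", "VBZ", "NNP"]),
]
tf_docs = [features(tokens, tags) for tokens, tags in questions]
weights = tfidf(tf_docs)
```

In this sketch, a tag shared by every question (here `POS=NNP`) receives the minimum weight, while wh-specific terms such as `who` or `POS=WP` are weighted higher. The resulting dictionaries could then be passed to any classifier that accepts sparse features.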
