Construction of English and American Literature Corpus Based on Machine Learning Algorithm

In China, the use of corpora in language teaching, especially in the teaching of English and American literature, is still at a preliminary research stage; it has various shortcomings and has not received due attention from front-line educators. Constructing English and American literatu...

Bibliographic Details
Main author:	Dai, Qian
Format:	Online Article Text
Language:	English
Published:	Hindawi 2022
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9184167/ https://www.ncbi.nlm.nih.gov/pubmed/35694598 http://dx.doi.org/10.1155/2022/9773452

author	Dai, Qian
collection	PubMed
description	In China, the use of corpora in language teaching, especially in the teaching of English and American literature, is still at a preliminary research stage; it has various shortcomings and has not received due attention from front-line educators. Constructing a corpus of English and American literature according to certain principles can effectively support the teaching of English and American literature. This paper investigates how to build such a corpus automatically. In keyword extraction, key phrases and keywords are combined. The similarity between atomic events is calculated with the TextRank algorithm, and the N sentences with the highest similarity are then selected and sorted. Building on ML (machine learning) text classification methods, a combined classifier based on SVM (support vector machine) and NB (Naive Bayes) is proposed. The experimental results show that, in terms of accuracy and recall, the proposed combined algorithm classifies best among the three methods compared. The best values of accuracy, recall, and F value are 0.87, 0.9, and 0.89, respectively. The experimental results also show that this method can quickly, accurately, and persistently obtain high-quality bilingual mixed web pages.
format	Online Article Text
id	pubmed-9184167
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Hindawi
record_format	MEDLINE/PubMed
spelling	pubmed-9184167 2022-06-10 Construction of English and American Literature Corpus Based on Machine Learning Algorithm Dai, Qian Comput Intell Neurosci Research Article Hindawi 2022-06-02 /pmc/articles/PMC9184167/ /pubmed/35694598 http://dx.doi.org/10.1155/2022/9773452 Text en Copyright © 2022 Qian Dai. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Construction of English and American Literature Corpus Based on Machine Learning Algorithm
topic	Research Article
