
Assemble the shallow or integrate a deep? Toward a lightweight solution for glyph-aware Chinese text classification

As hieroglyphic languages, such as Chinese, differ from alphabetic languages, researchers have always been interested in using internal glyph features to enhance semantic representation. However, the models used in such studies are becoming increasingly computationally expensive, even for simple tas...


Bibliographic Details
Main Authors:	Hou, Jingrui, Wang, Ping
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2023
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10381045/ https://www.ncbi.nlm.nih.gov/pubmed/37506054 http://dx.doi.org/10.1371/journal.pone.0289204

_version_	1785080345752764416
author	Hou, Jingrui Wang, Ping
author_facet	Hou, Jingrui Wang, Ping
author_sort	Hou, Jingrui
collection	PubMed
description	As hieroglyphic languages, such as Chinese, differ from alphabetic languages, researchers have always been interested in using internal glyph features to enhance semantic representation. However, the models used in such studies are becoming increasingly computationally expensive, even for simple tasks like text classification. In this paper, we aim to balance model performance and computation cost in glyph-aware Chinese text classification tasks. To address this issue, we propose a lightweight ensemble learning method for glyph-aware Chinese text classification (LEGACT) that consists of typical shallow networks as base learners and machine learning classifiers as meta-learners. Through model design and a series of experiments, we demonstrate that an ensemble approach integrating shallow neural networks can achieve comparable results even when compared to large-scale transformer models. The contribution of this paper includes a lightweight yet powerful solution for glyph-aware Chinese text classification and empirical evidence of the significance of glyph features for hieroglyphic language representation. Moreover, this paper emphasizes the importance of assembling shallow neural networks with proper ensemble strategies to reduce computational workload in predictive tasks.
format	Online Article Text
id	pubmed-10381045
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-103810452023-07-29 Assemble the shallow or integrate a deep? Toward a lightweight solution for glyph-aware Chinese text classification Hou, Jingrui Wang, Ping PLoS One Research Article As hieroglyphic languages, such as Chinese, differ from alphabetic languages, researchers have always been interested in using internal glyph features to enhance semantic representation. However, the models used in such studies are becoming increasingly computationally expensive, even for simple tasks like text classification. In this paper, we aim to balance model performance and computation cost in glyph-aware Chinese text classification tasks. To address this issue, we propose a lightweight ensemble learning method for glyph-aware Chinese text classification (LEGACT) that consists of typical shallow networks as base learners and machine learning classifiers as meta-learners. Through model design and a series of experiments, we demonstrate that an ensemble approach integrating shallow neural networks can achieve comparable results even when compared to large-scale transformer models. The contribution of this paper includes a lightweight yet powerful solution for glyph-aware Chinese text classification and empirical evidence of the significance of glyph features for hieroglyphic language representation. Moreover, this paper emphasizes the importance of assembling shallow neural networks with proper ensemble strategies to reduce computational workload in predictive tasks. Public Library of Science 2023-07-28 /pmc/articles/PMC10381045/ /pubmed/37506054 http://dx.doi.org/10.1371/journal.pone.0289204 Text en © 2023 Hou, Wang https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Assemble the shallow or integrate a deep? Toward a lightweight solution for glyph-aware Chinese text classification
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10381045/ https://www.ncbi.nlm.nih.gov/pubmed/37506054 http://dx.doi.org/10.1371/journal.pone.0289204
