
Classifying multi-level product categories using dynamic masking and transformer models

In an online shopping platform, a detailed categorization of the products greatly enhances user navigation. Online retailers also benefit from well-defined product categories as various sales and marketing operations such as special discounts and promotions can be easily done over a set of product c...


Bibliographic Details
Main authors: Ozyegen, Ozan; Jahanshahi, Hadi; Cevik, Mucahit; Bulut, Beste; Yigit, Deniz; Gonen, Fahrettin F.; Başar, Ayşe
Format: Online Article Text
Language: English
Published: Springer International Publishing 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9019541/
http://dx.doi.org/10.1007/s42488-022-00066-6
author Ozyegen, Ozan
Jahanshahi, Hadi
Cevik, Mucahit
Bulut, Beste
Yigit, Deniz
Gonen, Fahrettin F.
Başar, Ayşe
author_sort Ozyegen, Ozan
collection PubMed
description In an online shopping platform, a detailed categorization of the products greatly enhances user navigation. Online retailers also benefit from well-defined product categories as various sales and marketing operations such as special discounts and promotions can be easily done over a set of product categories. Furthermore, incorrect and subjective product categories suggested by an operator can be more easily identified thanks to an automated classification system. In this study, we investigate the task of classifying grocery product categories using product titles. We employ a wide variety of text classification models for this task, including traditional machine learning and deep learning models as well as state-of-the-art transformer models. In our analysis, we specifically focus on the generalizability of the trained classification models to the products of other online retailers, the dynamic masking of infeasible subcategories for pretrained language models, and the impact of incorporating different word embeddings. We observe that the deep learning models and the transformers significantly outperform traditional text classification methods such as XGBoost and SVM, and achieve excellent prediction performance exceeding 90% accuracy and F1-score values. We lastly explore the failure cases where a product is misclassified, and make recommendations for future studies to improve the prediction performance.
format Online
Article
Text
id pubmed-9019541
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-9019541 2022-04-20 Classifying multi-level product categories using dynamic masking and transformer models Ozyegen, Ozan Jahanshahi, Hadi Cevik, Mucahit Bulut, Beste Yigit, Deniz Gonen, Fahrettin F. Başar, Ayşe J. of Data, Inf. and Manag. Original Article
Springer International Publishing 2022-04-20 2022 /pmc/articles/PMC9019541/ http://dx.doi.org/10.1007/s42488-022-00066-6 Text en © The Author(s), under exclusive licence to Springer Nature Switzerland AG 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Classifying multi-level product categories using dynamic masking and transformer models
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9019541/
http://dx.doi.org/10.1007/s42488-022-00066-6