Classifying multi-level product categories using dynamic masking and transformer models
In an online shopping platform, a detailed categorization of the products greatly enhances user navigation. Online retailers also benefit from well-defined product categories as various sales and marketing operations such as special discounts and promotions can be easily done over a set of product c...
Main authors: | Ozyegen, Ozan; Jahanshahi, Hadi; Cevik, Mucahit; Bulut, Beste; Yigit, Deniz; Gonen, Fahrettin F.; Başar, Ayşe |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing, 2022 |
Subjects: | Original Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9019541/ http://dx.doi.org/10.1007/s42488-022-00066-6 |
_version_ | 1784689307580104704 |
---|---|
author | Ozyegen, Ozan; Jahanshahi, Hadi; Cevik, Mucahit; Bulut, Beste; Yigit, Deniz; Gonen, Fahrettin F.; Başar, Ayşe |
author_sort | Ozyegen, Ozan |
collection | PubMed |
description | In an online shopping platform, a detailed categorization of the products greatly enhances user navigation. Online retailers also benefit from well-defined product categories as various sales and marketing operations such as special discounts and promotions can be easily done over a set of product categories. Furthermore, incorrect and subjective product categories suggested by an operator can be more easily identified thanks to an automated classification system. In this study, we investigate the task of classifying grocery product categories using product titles. We employ a wide variety of text classification models for this task, including traditional machine learning and deep learning models as well as state-of-the-art transformer models. In our analysis, we specifically focus on the generalizability of the trained classification models to the products of other online retailers, the dynamic masking of infeasible subcategories for pretrained language models, and the impact of incorporating different word embeddings. We observe that the deep learning models and the transformers significantly outperform traditional text classification methods such as XGBoost and SVM, and achieve excellent prediction performance exceeding 90% accuracy and F1-score values. We lastly explore the failure cases where a product is misclassified, and make recommendations for future studies to improve the prediction performance. |
format | Online Article Text |
id | pubmed-9019541 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-9019541; 2022-04-20; Classifying multi-level product categories using dynamic masking and transformer models; Ozyegen, Ozan; Jahanshahi, Hadi; Cevik, Mucahit; Bulut, Beste; Yigit, Deniz; Gonen, Fahrettin F.; Başar, Ayşe; J. of Data, Inf. and Manag.; Original Article; Springer International Publishing; 2022-04-20; 2022; /pmc/articles/PMC9019541/; http://dx.doi.org/10.1007/s42488-022-00066-6; Text; en; © The Author(s), under exclusive licence to Springer Nature Switzerland AG 2022. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
title | Classifying multi-level product categories using dynamic masking and transformer models |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9019541/ http://dx.doi.org/10.1007/s42488-022-00066-6 |