
PME: pruning-based multi-size embedding for recommender systems

Embedding is widely used in recommendation models to learn feature representations. However, the traditional embedding technique that assigns a fixed size to all categorical features may be suboptimal for the following reason. In the recommendation domain, the majority of categorical features'...

Bibliographic Details
Main Authors: Liu, Zirui, Song, Qingquan, Li, Li, Choi, Soo-Hyun, Chen, Rui, Hu, Xia
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311001/
https://www.ncbi.nlm.nih.gov/pubmed/37397622
http://dx.doi.org/10.3389/fdata.2023.1195742
_version_ 1785066652000321536
author Liu, Zirui
Song, Qingquan
Li, Li
Choi, Soo-Hyun
Chen, Rui
Hu, Xia
author_facet Liu, Zirui
Song, Qingquan
Li, Li
Choi, Soo-Hyun
Chen, Rui
Hu, Xia
author_sort Liu, Zirui
collection PubMed
description Embedding is widely used in recommendation models to learn feature representations. However, the traditional embedding technique that assigns a fixed size to all categorical features may be suboptimal for the following reason. In the recommendation domain, the embeddings of most categorical features can be trained with less capacity without hurting model performance, so storing embeddings of equal length may incur unnecessary memory usage. Existing work that tries to allocate a customized size to each feature usually either scales the embedding size with the feature's popularity or formulates size allocation as an architecture selection problem. Unfortunately, most of these methods either suffer a large performance drop or incur significant extra time cost when searching for proper embedding sizes. In this article, instead of formulating size allocation as an architecture selection problem, we approach it from a pruning perspective and propose the Pruning-based Multi-size Embedding (PME) framework. During the search phase, we prune the embedding dimensions that have the least impact on model performance to reduce the embedding's capacity. Then, we show that the customized size of each token can be obtained by transferring the capacity of its pruned embedding, with significantly less search cost. Experimental results validate that PME can efficiently find proper sizes and hence achieve strong performance while significantly reducing the number of parameters in the embedding layer.
format Online
Article
Text
id pubmed-10311001
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10311001 2023-07-01 PME: pruning-based multi-size embedding for recommender systems Liu, Zirui Song, Qingquan Li, Li Choi, Soo-Hyun Chen, Rui Hu, Xia Front Big Data Big Data Embedding is widely used in recommendation models to learn feature representations. However, the traditional embedding technique that assigns a fixed size to all categorical features may be suboptimal for the following reason. In the recommendation domain, the embeddings of most categorical features can be trained with less capacity without hurting model performance, so storing embeddings of equal length may incur unnecessary memory usage. Existing work that tries to allocate a customized size to each feature usually either scales the embedding size with the feature's popularity or formulates size allocation as an architecture selection problem. Unfortunately, most of these methods either suffer a large performance drop or incur significant extra time cost when searching for proper embedding sizes. In this article, instead of formulating size allocation as an architecture selection problem, we approach it from a pruning perspective and propose the Pruning-based Multi-size Embedding (PME) framework. During the search phase, we prune the embedding dimensions that have the least impact on model performance to reduce the embedding's capacity. Then, we show that the customized size of each token can be obtained by transferring the capacity of its pruned embedding, with significantly less search cost. Experimental results validate that PME can efficiently find proper sizes and hence achieve strong performance while significantly reducing the number of parameters in the embedding layer. Frontiers Media S.A. 2023-06-15 /pmc/articles/PMC10311001/ /pubmed/37397622 http://dx.doi.org/10.3389/fdata.2023.1195742 Text en Copyright © 2023 Liu, Song, Li, Choi, Chen and Hu.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Big Data
Liu, Zirui
Song, Qingquan
Li, Li
Choi, Soo-Hyun
Chen, Rui
Hu, Xia
PME: pruning-based multi-size embedding for recommender systems
title PME: pruning-based multi-size embedding for recommender systems
title_full PME: pruning-based multi-size embedding for recommender systems
title_fullStr PME: pruning-based multi-size embedding for recommender systems
title_full_unstemmed PME: pruning-based multi-size embedding for recommender systems
title_short PME: pruning-based multi-size embedding for recommender systems
title_sort pme: pruning-based multi-size embedding for recommender systems
topic Big Data
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311001/
https://www.ncbi.nlm.nih.gov/pubmed/37397622
http://dx.doi.org/10.3389/fdata.2023.1195742
work_keys_str_mv AT liuzirui pmepruningbasedmultisizeembeddingforrecommendersystems
AT songqingquan pmepruningbasedmultisizeembeddingforrecommendersystems
AT lili pmepruningbasedmultisizeembeddingforrecommendersystems
AT choisoohyun pmepruningbasedmultisizeembeddingforrecommendersystems
AT chenrui pmepruningbasedmultisizeembeddingforrecommendersystems
AT huxia pmepruningbasedmultisizeembeddingforrecommendersystems
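
The description field of this record outlines a two-step idea: prune the embedding dimensions that matter least, then derive a customized size for each token from what survives. The sketch below illustrates that idea only loosely, in Python with NumPy. The record does not state the pruning criterion or how capacity is transferred, so everything specific here is an assumption: global magnitude pruning stands in for "least impact on model performance", and each token's size is taken as the smallest entry in a hypothetical list of candidate sizes that fits its surviving entries. The names `prune_by_magnitude`, `sizes_from_mask`, `keep_fraction`, and `candidate_sizes` are illustrative and do not come from the article.

```python
import numpy as np


def prune_by_magnitude(table, keep_fraction):
    """Keep only the largest-magnitude entries of the table (assumed criterion)."""
    flat = np.abs(table).ravel()
    k = max(1, int(round(keep_fraction * flat.size)))
    # The k-th largest magnitude serves as a single global threshold.
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    mask = np.abs(table) >= threshold
    return table * mask, mask


def sizes_from_mask(mask, candidate_sizes):
    """Give each token the smallest candidate size that fits its kept entries."""
    kept = mask.sum(axis=1)
    candidates = np.sort(np.asarray(candidate_sizes))
    idx = np.searchsorted(candidates, kept, side="left")
    # Tokens with more kept entries than the largest candidate get the largest.
    idx = np.minimum(idx, len(candidates) - 1)
    return candidates[idx]


# Toy embedding table: 3 tokens, 4 dimensions each (made-up values).
table = np.array([
    [0.9, -0.8, 0.7, 0.6],
    [0.5, 0.01, -0.4, 0.02],
    [0.3, 0.03, 0.04, 0.05],
])

pruned, mask = prune_by_magnitude(table, keep_fraction=0.5)
sizes = sizes_from_mask(mask, candidate_sizes=[1, 2, 4])
print(sizes)  # [4 2 1]
```

In this toy table the first token keeps all four entries, the second keeps two, and the third keeps none, so they are assigned sizes 4, 2, and 1. A real system would retrain embeddings at the assigned sizes; that step is omitted here.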