PME: pruning-based multi-size embedding for recommender systems
Embedding is widely used in recommendation models to learn feature representations. However, the traditional embedding technique, which assigns a fixed size to all categorical features, may be suboptimal for the following reasons. In the recommendation domain, the majority of categorical features'...
Main authors: | Liu, Zirui; Song, Qingquan; Li, Li; Choi, Soo-Hyun; Chen, Rui; Hu, Xia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Big Data |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311001/ https://www.ncbi.nlm.nih.gov/pubmed/37397622 http://dx.doi.org/10.3389/fdata.2023.1195742 |
_version_ | 1785066652000321536 |
---|---|
author | Liu, Zirui Song, Qingquan Li, Li Choi, Soo-Hyun Chen, Rui Hu, Xia |
author_sort | Liu, Zirui |
collection | PubMed |
description | Embedding is widely used in recommendation models to learn feature representations. However, the traditional embedding technique, which assigns a fixed size to all categorical features, may be suboptimal. In the recommendation domain, the embeddings of most categorical features can be trained with less capacity without affecting model performance, so storing embeddings of equal length may incur unnecessary memory usage. Existing work that allocates customized sizes for each feature usually either scales the embedding size with the feature's popularity or formulates size allocation as an architecture selection problem. Unfortunately, most of these methods either suffer a large performance drop or incur significant extra time cost when searching for proper embedding sizes. In this article, instead of formulating size allocation as an architecture selection problem, we approach it from a pruning perspective and propose the Pruning-based Multi-size Embedding (PME) framework. During the search phase, we prune the embedding dimensions that have the least impact on model performance to reduce the embedding's capacity. We then show that the customized size of each token can be obtained by transferring the capacity of its pruned embedding, at significantly lower search cost. Experimental results validate that PME can efficiently find proper sizes and hence achieve strong performance while significantly reducing the number of parameters in the embedding layer. |
format | Online Article Text |
id | pubmed-10311001 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10311001 2023-07-01 PME: pruning-based multi-size embedding for recommender systems Liu, Zirui Song, Qingquan Li, Li Choi, Soo-Hyun Chen, Rui Hu, Xia Front Big Data Big Data Frontiers Media S.A. 2023-06-15 /pmc/articles/PMC10311001/ /pubmed/37397622 http://dx.doi.org/10.3389/fdata.2023.1195742 Text en Copyright © 2023 Liu, Song, Li, Choi, Chen and Hu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | PME: pruning-based multi-size embedding for recommender systems |
topic | Big Data |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311001/ https://www.ncbi.nlm.nih.gov/pubmed/37397622 http://dx.doi.org/10.3389/fdata.2023.1195742 |
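
The description above outlines two steps: prune low-impact entries of the embedding table, then read each token's customized size off the capacity left in its pruned embedding. The following is a minimal sketch of that idea, not the authors' implementation. It assumes absolute magnitude as a stand-in for "impact on model performance" (the record does not state the actual criterion), and the function names, the `min_size` floor, and the toy table are all hypothetical.

```python
def prune_by_magnitude(table, sparsity):
    """Zero out roughly the `sparsity` fraction of entries with the smallest
    absolute value across the whole table (assumed pruning criterion).
    Entries tied with the threshold are pruned too."""
    flat = sorted(abs(v) for row in table for v in row)
    n_prune = int(len(flat) * sparsity)
    if n_prune == 0:
        return [list(row) for row in table]
    threshold = flat[n_prune - 1]
    return [[0.0 if abs(v) <= threshold else v for v in row] for row in table]


def sizes_from_pruned(pruned, min_size=1):
    """Per-token embedding size = number of surviving entries in its row,
    floored at `min_size` so every token keeps at least one dimension."""
    return [max(min_size, sum(1 for v in row if v != 0.0)) for row in pruned]


# Toy table: 3 tokens, 4 dimensions each (made-up values for illustration).
table = [
    [0.9, -0.8, 0.7, 0.6],        # token with large-magnitude weights
    [0.5, 0.01, -0.02, 0.03],     # token with mostly small weights
    [0.04, 0.001, 0.002, -0.003], # token with near-zero weights
]

pruned = prune_by_magnitude(table, sparsity=0.5)
sizes = sizes_from_pruned(pruned)

fixed_params = sum(len(row) for row in table)  # parameters with one fixed size
multi_params = sum(sizes)                      # parameters with per-token sizes
print(sizes, fixed_params, multi_params)       # [4, 1, 1] 12 6
```

In this toy case the first token keeps all four dimensions while the other two shrink to one, halving the parameter count. A real system would still need to retrain embeddings at the chosen sizes and project them to a common width before they are combined, which this sketch leaves out.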