EmbedFormer: Embedded Depth-Wise Convolution Layer for Token Mixing

Visual Transformers (ViTs) have shown impressive performance due to their powerful coding ability to catch spatial and channel information. MetaFormer gives us a general architecture of transformers consisting of a token mixer and a channel mixer through which we can generally understand how transfo...

Bibliographic Details
Main Authors:	Wang, Zeji; He, Xiaowei; Li, Yi; Chuai, Qinliang
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9782848/ https://www.ncbi.nlm.nih.gov/pubmed/36560222 http://dx.doi.org/10.3390/s22249854
