Are we ready for a new paradigm shift? A survey on visual deep MLP

Recently, the proposed deep multilayer perceptron (MLP) models have stirred up a lot of interest in the vision community. Historically, the availability of larger datasets combined with increased computing capacity led to paradigm shifts. This review provides detailed discussions on whether MLPs can...

Bibliographic Details
Main Authors: Liu, Ruiyang; Li, Yinghui; Tao, Linmi; Liang, Dun; Zheng, Hai-Tao
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278509/
https://www.ncbi.nlm.nih.gov/pubmed/35845841
http://dx.doi.org/10.1016/j.patter.2022.100520
_version_ 1784746202247462912
author Liu, Ruiyang
Li, Yinghui
Tao, Linmi
Liang, Dun
Zheng, Hai-Tao
author_facet Liu, Ruiyang
Li, Yinghui
Tao, Linmi
Liang, Dun
Zheng, Hai-Tao
author_sort Liu, Ruiyang
collection PubMed
description Recently, the proposed deep multilayer perceptron (MLP) models have stirred up a lot of interest in the vision community. Historically, the availability of larger datasets combined with increased computing capacity led to paradigm shifts. This review provides detailed discussions on whether MLPs can be a new paradigm for computer vision. We compare the intrinsic connections and differences between convolution, self-attention mechanism, and token-mixing MLP in detail. Advantages and limitations of token-mixing MLP are provided, followed by careful analysis of recent MLP-like variants, from module design to network architecture, and their applications. In the graphics processing unit era, the locally and globally weighted summations are the current mainstreams, represented by the convolution and self-attention mechanism, as well as MLPs. We suggest the further development of the paradigm to be considered alongside the next-generation computing devices.
format Online
Article
Text
id pubmed-9278509
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9278509 2022-07-14 Are we ready for a new paradigm shift? A survey on visual deep MLP Liu, Ruiyang Li, Yinghui Tao, Linmi Liang, Dun Zheng, Hai-Tao Patterns (N Y) Review Recently, the proposed deep multilayer perceptron (MLP) models have stirred up a lot of interest in the vision community. Historically, the availability of larger datasets combined with increased computing capacity led to paradigm shifts. This review provides detailed discussions on whether MLPs can be a new paradigm for computer vision. We compare the intrinsic connections and differences between convolution, self-attention mechanism, and token-mixing MLP in detail. Advantages and limitations of token-mixing MLP are provided, followed by careful analysis of recent MLP-like variants, from module design to network architecture, and their applications. In the graphics processing unit era, the locally and globally weighted summations are the current mainstreams, represented by the convolution and self-attention mechanism, as well as MLPs. We suggest the further development of the paradigm to be considered alongside the next-generation computing devices. Elsevier 2022-07-08 /pmc/articles/PMC9278509/ /pubmed/35845841 http://dx.doi.org/10.1016/j.patter.2022.100520 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Review
Liu, Ruiyang
Li, Yinghui
Tao, Linmi
Liang, Dun
Zheng, Hai-Tao
Are we ready for a new paradigm shift? A survey on visual deep MLP
title Are we ready for a new paradigm shift? A survey on visual deep MLP
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278509/
https://www.ncbi.nlm.nih.gov/pubmed/35845841
http://dx.doi.org/10.1016/j.patter.2022.100520