
Metaformer: A Transformer That Tends to Mine Metaphorical-Level Information


Bibliographic Details
Main Authors: Peng, Bo, Ding, Yuanming, Kang, Wei
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255808/
https://www.ncbi.nlm.nih.gov/pubmed/37299819
http://dx.doi.org/10.3390/s23115093
_version_ 1785056961791787008
author Peng, Bo
Ding, Yuanming
Kang, Wei
author_facet Peng, Bo
Ding, Yuanming
Kang, Wei
author_sort Peng, Bo
collection PubMed
description Since its introduction, the Transformer model has dramatically influenced many fields of machine learning. Time series prediction has been particularly affected: Transformer-family models have flourished there and many variants have emerged. These models mainly use attention mechanisms for feature extraction and multi-head attention to strengthen it. However, multi-head attention is essentially a superposition of copies of the same attention operation, so it does not guarantee that the model captures different features; on the contrary, it can introduce considerable information redundancy and waste computational resources. To ensure that the Transformer captures information from multiple perspectives and to increase the diversity of the captured features, this paper proposes, for the first time, a hierarchical attention mechanism that addresses the limited information diversity of the traditional multi-head attention mechanism and the lack of information interaction among its heads. Additionally, global feature aggregation with graph networks is used to mitigate inductive bias. Finally, experiments on four benchmark datasets show that the proposed model outperforms the baseline model on several metrics.
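For context on the abstract's critique, below is a minimal sketch (Python/PyTorch, not the authors' code) of standard multi-head self-attention: every head runs the identical scaled dot-product attention and differs only in its learned linear projections, which is why simply adding heads does not by itself guarantee more diverse features. All class, variable, and shape choices here are illustrative assumptions; the paper's hierarchical attention and graph-network aggregation are not reproduced because the record does not specify their formulation.

```python
# Illustrative sketch only (not the paper's model): standard multi-head
# self-attention, in which each head repeats the same attention computation
# on a different linear projection of the input.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # joint Q, K, V projection
        self.out = nn.Linear(d_model, d_model)      # recombine the heads

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape

        def split_heads(z: torch.Tensor) -> torch.Tensor:
            # (batch, time, d_model) -> (batch, heads, time, d_head)
            return z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = (split_heads(z) for z in self.qkv(x).chunk(3, dim=-1))
        # The same scaled dot-product attention, repeated per head.
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        context = F.softmax(scores, dim=-1) @ v
        context = context.transpose(1, 2).reshape(b, t, -1)
        return self.out(context)


if __name__ == "__main__":
    x = torch.randn(2, 16, 64)  # (batch, sequence length, model dimension)
    attn = MultiHeadSelfAttention(d_model=64, n_heads=8)
    print(attn(x).shape)        # expected: torch.Size([2, 16, 64])
```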
format Online
Article
Text
id pubmed-10255808
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10255808 2023-06-10 Metaformer: A Transformer That Tends to Mine Metaphorical-Level Information Peng, Bo Ding, Yuanming Kang, Wei Sensors (Basel) Article Since its introduction, the Transformer model has dramatically influenced many fields of machine learning. Time series prediction has been particularly affected: Transformer-family models have flourished there and many variants have emerged. These models mainly use attention mechanisms for feature extraction and multi-head attention to strengthen it. However, multi-head attention is essentially a superposition of copies of the same attention operation, so it does not guarantee that the model captures different features; on the contrary, it can introduce considerable information redundancy and waste computational resources. To ensure that the Transformer captures information from multiple perspectives and to increase the diversity of the captured features, this paper proposes, for the first time, a hierarchical attention mechanism that addresses the limited information diversity of the traditional multi-head attention mechanism and the lack of information interaction among its heads. Additionally, global feature aggregation with graph networks is used to mitigate inductive bias. Finally, experiments on four benchmark datasets show that the proposed model outperforms the baseline model on several metrics. MDPI 2023-05-26 /pmc/articles/PMC10255808/ /pubmed/37299819 http://dx.doi.org/10.3390/s23115093 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Peng, Bo
Ding, Yuanming
Kang, Wei
Metaformer: A Transformer That Tends to Mine Metaphorical-Level Information
title Metaformer: A Transformer That Tends to Mine Metaphorical-Level Information
title_full Metaformer: A Transformer That Tends to Mine Metaphorical-Level Information
title_fullStr Metaformer: A Transformer That Tends to Mine Metaphorical-Level Information
title_full_unstemmed Metaformer: A Transformer That Tends to Mine Metaphorical-Level Information
title_short Metaformer: A Transformer That Tends to Mine Metaphorical-Level Information
title_sort metaformer: a transformer that tends to mine metaphorical-level information
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255808/
https://www.ncbi.nlm.nih.gov/pubmed/37299819
http://dx.doi.org/10.3390/s23115093
work_keys_str_mv AT pengbo metaformeratransformerthattendstominemetaphoricallevelinformation
AT dingyuanming metaformeratransformerthattendstominemetaphoricallevelinformation
AT kangwei metaformeratransformerthattendstominemetaphoricallevelinformation