Transformer and group parallel axial attention co-encoder for medical image segmentation
Main Authors: | Li, Chaoqun; Wang, Liejun; Li, Yongming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9515122/ https://www.ncbi.nlm.nih.gov/pubmed/36167743 http://dx.doi.org/10.1038/s41598-022-20440-z |
_version_ | 1784798422286467072 |
---|---|
author | Li, Chaoqun; Wang, Liejun; Li, Yongming
author_facet | Li, Chaoqun; Wang, Liejun; Li, Yongming
author_sort | Li, Chaoqun |
collection | PubMed |
description | U-Net has become the baseline standard for medical image segmentation tasks, but it is limited in explicitly modeling long-range dependencies. The Transformer can capture long-range relevance through its internal self-attention. However, the Transformer models the correlations among all elements, so its awareness of local foreground information is weak. Since medical images often present as regional blocks, local information is equally important. In this paper, we propose GPA-TUNet, which considers local and global information jointly. Specifically, we propose a new attention mechanism, group parallel axial attention (GPA), to highlight local foreground information. Furthermore, we effectively combine GPA with the Transformer in the encoder part of the model. This not only highlights the foreground information of samples but also reduces the negative influence of background information on the segmentation results. Meanwhile, we introduce the sMLP block to improve the global modeling capability of the network; applying it achieves sparse connectivity and weight sharing. Extensive experiments on public datasets confirm the excellent performance of our proposed GPA-TUNet. In particular, on the Synapse and ACDC datasets, mean DSC (%) reached 80.37% and 90.37%, respectively, and mean HD95 (mm) reached 20.55 and 1.23, respectively. |
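The abstract describes GPA only at a high level. As an illustration, the sketch below shows plain axial attention, the row/column-restricted self-attention that group parallel axial attention builds on. The grouping and parallel-branch details of GPA-TUNet are not given in this record, so the class and parameter names here (`AxialAttention`, `num_heads`, the sequential height/width composition) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of axial (row/column) self-attention, the building block
# that group parallel axial attention (GPA) extends. Names and structure
# are assumptions for illustration; the paper's exact design may differ.
import torch
import torch.nn as nn

class AxialAttention(nn.Module):
    """Multi-head self-attention applied along a single spatial axis.

    Restricting attention to rows (or columns) reduces the cost of full
    2-D self-attention from O((HW)^2) to O(HW * H) per axis, which is why
    axial attention is attractive for high-resolution medical images.
    """
    def __init__(self, dim, num_heads=4, axis="height"):
        super().__init__()
        self.axis = axis
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):
        # x: (B, C, H, W) feature map
        b, c, h, w = x.shape
        if self.axis == "height":
            # Fold width into the batch so attention runs along H.
            seq = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        else:
            # Fold height into the batch so attention runs along W.
            seq = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        out, _ = self.attn(seq, seq, seq)
        if self.axis == "height":
            out = out.reshape(b, w, h, c).permute(0, 3, 2, 1)
        else:
            out = out.reshape(b, h, w, c).permute(0, 3, 1, 2)
        return out

# Height-axis and width-axis attention applied in sequence (or, as the GPA
# name suggests, in parallel groups) together cover the full spatial plane.
x = torch.randn(2, 64, 32, 32)
y = AxialAttention(64, axis="width")(AxialAttention(64, axis="height")(x))
print(y.shape)  # torch.Size([2, 64, 32, 32])
```

Because attention is restricted to one axis at a time, the cost stays linear in each spatial dimension, which is what makes a local-attention branch cheap enough to run alongside a Transformer encoder.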
format | Online Article Text |
id | pubmed-9515122 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9515122 2022-09-29 Transformer and group parallel axial attention co-encoder for medical image segmentation Li, Chaoqun; Wang, Liejun; Li, Yongming Sci Rep Article (abstract as in the description field above) Nature Publishing Group UK 2022-09-27 /pmc/articles/PMC9515122/ /pubmed/36167743 http://dx.doi.org/10.1038/s41598-022-20440-z Text en © The Author(s) 2022. Open Access under the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Li, Chaoqun; Wang, Liejun; Li, Yongming Transformer and group parallel axial attention co-encoder for medical image segmentation |
title | Transformer and group parallel axial attention co-encoder for medical image segmentation |
title_full | Transformer and group parallel axial attention co-encoder for medical image segmentation |
title_fullStr | Transformer and group parallel axial attention co-encoder for medical image segmentation |
title_full_unstemmed | Transformer and group parallel axial attention co-encoder for medical image segmentation |
title_short | Transformer and group parallel axial attention co-encoder for medical image segmentation |
title_sort | transformer and group parallel axial attention co-encoder for medical image segmentation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9515122/ https://www.ncbi.nlm.nih.gov/pubmed/36167743 http://dx.doi.org/10.1038/s41598-022-20440-z |
work_keys_str_mv | AT lichaoqun transformerandgroupparallelaxialattentioncoencoderformedicalimagesegmentation AT wangliejun transformerandgroupparallelaxialattentioncoencoderformedicalimagesegmentation AT liyongming transformerandgroupparallelaxialattentioncoencoderformedicalimagesegmentation |