DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation
U-Net has attained immense popularity owing to its performance in medical image segmentation. However, it cannot explicitly model long-range dependencies. By contrast, the transformer can effectively capture long-range dependencies by leveraging the self-attention (SA) of the encoder. Although SA,...
Main authors: | Shen, Longfeng; Wang, Qiong; Zhang, Yingjie; Qin, Fenglan; Jin, Hengjun; Zhao, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Lippincott Williams & Wilkins 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10545043/ https://www.ncbi.nlm.nih.gov/pubmed/37773842 http://dx.doi.org/10.1097/MD.0000000000035328 |
_version_ | 1785114595073982464 |
---|---|
author | Shen, Longfeng Wang, Qiong Zhang, Yingjie Qin, Fenglan Jin, Hengjun Zhao, Wei |
author_facet | Shen, Longfeng Wang, Qiong Zhang, Yingjie Qin, Fenglan Jin, Hengjun Zhao, Wei |
author_sort | Shen, Longfeng |
collection | PubMed |
description | U-Net has attained immense popularity owing to its performance in medical image segmentation. However, it cannot explicitly model long-range dependencies. By contrast, the transformer can effectively capture long-range dependencies by leveraging the self-attention (SA) of the encoder. Although SA, an important characteristic of the transformer, can find correlations within the original data, its quadratic computational complexity can slow the processing of high-dimensional data (such as medical images). Furthermore, SA is limited because the correlation between samples is overlooked; thus, there is considerable scope for improvement. To this end, based on Swin-UNet, we introduce a dynamic selective attention mechanism for the convolution kernels. The weight of each convolution kernel is calculated to fuse the results dynamically. This attention mechanism permits each neuron to adaptively modify its receptive field size in response to multiscale input information. A local cross-channel interaction strategy without dimensionality reduction is introduced, which effectively eliminates the influence of downscaling on learning channel attention. Through suitable cross-channel interactions, model complexity can be significantly reduced while maintaining its performance. Subsequently, the global interaction between the encoder features is used to extract more fine-grained features. Simultaneously, a mixed loss function combining the weighted cross-entropy loss and Dice loss is used to alleviate category imbalances and achieve better results when the sample number is unbalanced. We evaluated our proposed method on abdominal multiorgan segmentation and cardiac segmentation datasets, achieving Dice similarity coefficient and 95% Hausdorff distance metrics of 80.30 and 14.55%, respectively, on the Synapse dataset and a Dice similarity coefficient of 90.80 on the ACDC dataset. The experimental results show that our proposed method has good generalization ability and robustness, and it is a powerful tool for medical image segmentation. |
format | Online Article Text |
id | pubmed-10545043 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Lippincott Williams & Wilkins |
record_format | MEDLINE/PubMed |
spelling | pubmed-10545043 2023-10-03 DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation Shen, Longfeng Wang, Qiong Zhang, Yingjie Qin, Fenglan Jin, Hengjun Zhao, Wei Medicine (Baltimore) 4100 U-Net has attained immense popularity owing to its performance in medical image segmentation. However, it cannot explicitly model long-range dependencies. By contrast, the transformer can effectively capture long-range dependencies by leveraging the self-attention (SA) of the encoder. Although SA, an important characteristic of the transformer, can find correlations within the original data, its quadratic computational complexity can slow the processing of high-dimensional data (such as medical images). Furthermore, SA is limited because the correlation between samples is overlooked; thus, there is considerable scope for improvement. To this end, based on Swin-UNet, we introduce a dynamic selective attention mechanism for the convolution kernels. The weight of each convolution kernel is calculated to fuse the results dynamically. This attention mechanism permits each neuron to adaptively modify its receptive field size in response to multiscale input information. A local cross-channel interaction strategy without dimensionality reduction is introduced, which effectively eliminates the influence of downscaling on learning channel attention. Through suitable cross-channel interactions, model complexity can be significantly reduced while maintaining its performance. Subsequently, the global interaction between the encoder features is used to extract more fine-grained features. Simultaneously, a mixed loss function combining the weighted cross-entropy loss and Dice loss is used to alleviate category imbalances and achieve better results when the sample number is unbalanced. We evaluated our proposed method on abdominal multiorgan segmentation and cardiac segmentation datasets, achieving Dice similarity coefficient and 95% Hausdorff distance metrics of 80.30 and 14.55%, respectively, on the Synapse dataset and a Dice similarity coefficient of 90.80 on the ACDC dataset. The experimental results show that our proposed method has good generalization ability and robustness, and it is a powerful tool for medical image segmentation. Lippincott Williams & Wilkins 2023-09-29 /pmc/articles/PMC10545043/ /pubmed/37773842 http://dx.doi.org/10.1097/MD.0000000000035328 Text en Copyright © 2023 the Author(s). Published by Wolters Kluwer Health, Inc. https://creativecommons.org/licenses/by-nc/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial License 4.0 (CCBY-NC) (https://creativecommons.org/licenses/by-nc/4.0/), where it is permissible to download, share, remix, transform, and buildup the work provided it is properly cited. The work cannot be used commercially without permission from the journal. |
spellingShingle | 4100 Shen, Longfeng Wang, Qiong Zhang, Yingjie Qin, Fenglan Jin, Hengjun Zhao, Wei DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation |
title | DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation |
title_full | DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation |
title_fullStr | DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation |
title_full_unstemmed | DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation |
title_short | DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation |
title_sort | dskca-unet: dynamic selective kernel channel attention for medical image segmentation |
topic | 4100 |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10545043/ https://www.ncbi.nlm.nih.gov/pubmed/37773842 http://dx.doi.org/10.1097/MD.0000000000035328 |
work_keys_str_mv | AT shenlongfeng dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation AT wangqiong dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation AT zhangyingjie dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation AT qinfenglan dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation AT jinhengjun dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation AT zhaowei dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation |
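
The description in this record mentions a mixed loss that combines weighted cross-entropy with Dice loss to counter class imbalance. The sketch below illustrates that general idea only; it is not the authors' implementation. The function names, the equal 0.5/0.5 mixing coefficients, the smoothing constant, and the weight-normalized averaging are all assumptions made for illustration, since the record does not state them. Plain Python lists are used so that the example runs without any dependencies; a practical version would use a tensor library.

```python
import math
from typing import Sequence

Probs = Sequence[Sequence[float]]  # per-pixel class probabilities
Labels = Sequence[int]             # per-pixel ground-truth class index


def weighted_cross_entropy(probs: Probs, labels: Labels,
                           class_weights: Sequence[float]) -> float:
    """Cross-entropy where each pixel is scaled by the weight of its true class.

    The sum is divided by the total weight (an assumed normalization).
    """
    eps = 1e-7  # guards against log(0)
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total += -w * math.log(max(p[y], eps))
        weight_sum += w
    return total / weight_sum


def dice_loss(probs: Probs, labels: Labels, num_classes: int,
              smooth: float = 1e-5) -> float:
    """Soft Dice loss (1 - Dice), averaged over classes."""
    loss = 0.0
    for c in range(num_classes):
        intersection = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_sum = sum(p[c] for p in probs)
        target_sum = sum(1 for y in labels if y == c)
        dice = (2.0 * intersection + smooth) / (pred_sum + target_sum + smooth)
        loss += 1.0 - dice
    return loss / num_classes


def mixed_loss(probs: Probs, labels: Labels, class_weights: Sequence[float],
               ce_coeff: float = 0.5, dice_coeff: float = 0.5) -> float:
    """Weighted sum of the two losses; the coefficients are hypothetical."""
    num_classes = len(class_weights)
    return (ce_coeff * weighted_cross_entropy(probs, labels, class_weights)
            + dice_coeff * dice_loss(probs, labels, num_classes))


if __name__ == "__main__":
    # Two pixels, two classes; the rare class (index 1) gets a larger weight.
    labels = [0, 1]
    weights = [1.0, 3.0]
    print(mixed_loss([[1.0, 0.0], [0.0, 1.0]], labels, weights))  # perfect prediction
    print(mixed_loss([[0.5, 0.5], [0.5, 0.5]], labels, weights))  # uninformative prediction
```

The cross-entropy term penalizes each pixel individually, while the Dice term scores region overlap per class, which is why the two are commonly combined when foreground classes are small relative to the background.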