
DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation

U-Net has attained immense popularity owing to its performance in medical image segmentation. However, it cannot explicitly model long-range dependencies. By contrast, the transformer can effectively capture long-range dependencies by leveraging the self-attention (SA) of the encoder. Although SA,...


Bibliographic Details
Main Authors: Shen, Longfeng; Wang, Qiong; Zhang, Yingjie; Qin, Fenglan; Jin, Hengjun; Zhao, Wei
Format: Online Article Text
Language: English
Published: Lippincott Williams & Wilkins 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10545043/
https://www.ncbi.nlm.nih.gov/pubmed/37773842
http://dx.doi.org/10.1097/MD.0000000000035328
_version_ 1785114595073982464
author Shen, Longfeng
Wang, Qiong
Zhang, Yingjie
Qin, Fenglan
Jin, Hengjun
Zhao, Wei
author_facet Shen, Longfeng
Wang, Qiong
Zhang, Yingjie
Qin, Fenglan
Jin, Hengjun
Zhao, Wei
author_sort Shen, Longfeng
collection PubMed
description U-Net has attained immense popularity owing to its performance in medical image segmentation. However, it cannot explicitly model long-range dependencies. By contrast, the transformer can effectively capture long-range dependencies by leveraging the self-attention (SA) of the encoder. Although SA, an important characteristic of the transformer, can find correlations within the original data, its quadratic computational complexity can slow the processing of high-dimensional data (such as medical images). Furthermore, SA is limited because the correlation between samples is overlooked; thus, there is considerable scope for improvement. To this end, based on Swin-UNet, we introduce a dynamic selective attention mechanism for the convolution kernels. The weight of each convolution kernel is calculated to fuse the results dynamically. This attention mechanism permits each neuron to adaptively modify its receptive field size in response to multiscale input information. A local cross-channel interaction strategy without dimensionality reduction is introduced, which effectively eliminates the influence of that reduction on learning channel attention. Through suitable cross-channel interactions, model complexity can be significantly reduced while maintaining performance. Subsequently, the global interaction between the encoder features is used to extract more fine-grained features. Simultaneously, a mixed loss function combining weighted cross-entropy loss and Dice loss is used to alleviate category imbalance and achieve better results when the number of samples is unbalanced across categories. We evaluated our proposed method on abdominal multiorgan segmentation and cardiac segmentation datasets, achieving a Dice similarity coefficient of 80.30% and a 95% Hausdorff distance of 14.55 on the Synapse dataset and a Dice similarity coefficient of 90.80% on the ACDC dataset. The experimental results show that our proposed method has good generalization ability and robustness and is a powerful tool for medical image segmentation.
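The mixed loss mentioned in the description (weighted cross-entropy plus Dice) can be sketched as follows. This is only an illustrative NumPy sketch, not the authors' implementation: the function names, the mixing factor ce_weight, the class weights, and the choice to normalise the cross-entropy by the total weight of the true classes are all assumptions, since the record does not state them.

```python
import numpy as np

def softmax(logits, axis=1):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def weighted_cross_entropy(probs, onehot, class_weights, eps=1e-7):
    # probs, onehot: (N, C, H, W); class_weights: (C,)
    w = np.asarray(class_weights, dtype=float).reshape(1, -1, 1, 1)
    log_p = np.log(np.clip(probs, eps, 1.0))
    per_pixel = -(w * onehot * log_p).sum(axis=1)
    # Assumption: weighted mean, i.e. divide by the summed weights of the true classes.
    return per_pixel.sum() / (w * onehot).sum()

def dice_loss(probs, onehot, eps=1e-7):
    # Soft Dice per class, averaged over classes; loss is 1 - Dice.
    inter = (probs * onehot).sum(axis=(0, 2, 3))
    denom = probs.sum(axis=(0, 2, 3)) + onehot.sum(axis=(0, 2, 3))
    dice = (2.0 * inter + eps) / (denom + eps)
    return 1.0 - dice.mean()

def mixed_loss(logits, labels, class_weights, ce_weight=0.5):
    # logits: (N, C, H, W) raw scores; labels: (N, H, W) integer class ids.
    # ce_weight is a hypothetical mixing factor, not a value from the paper.
    n_classes = logits.shape[1]
    probs = softmax(logits, axis=1)
    onehot = np.moveaxis(np.eye(n_classes)[labels], -1, 1)
    ce = weighted_cross_entropy(probs, onehot, class_weights)
    dl = dice_loss(probs, onehot)
    return ce_weight * ce + (1.0 - ce_weight) * dl

if __name__ == "__main__":
    labels = np.array([[[0, 1], [1, 0]]])   # one 2x2 image, two classes
    logits = np.zeros((1, 2, 2, 2))         # uninformative prediction
    print(mixed_loss(logits, labels, class_weights=[1.0, 1.0]))
```

The cross-entropy term scores each pixel, with rarer classes given larger weights, while the Dice term scores region overlap per class. Combining the two is a common way to handle class imbalance.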
format Online
Article
Text
id pubmed-10545043
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Lippincott Williams & Wilkins
record_format MEDLINE/PubMed
spelling pubmed-10545043 2023-10-03 DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation Shen, Longfeng Wang, Qiong Zhang, Yingjie Qin, Fenglan Jin, Hengjun Zhao, Wei Medicine (Baltimore) 4100 U-Net has attained immense popularity owing to its performance in medical image segmentation. However, it cannot explicitly model long-range dependencies. By contrast, the transformer can effectively capture long-range dependencies by leveraging the self-attention (SA) of the encoder. Although SA, an important characteristic of the transformer, can find correlations within the original data, its quadratic computational complexity can slow the processing of high-dimensional data (such as medical images). Furthermore, SA is limited because the correlation between samples is overlooked; thus, there is considerable scope for improvement. To this end, based on Swin-UNet, we introduce a dynamic selective attention mechanism for the convolution kernels. The weight of each convolution kernel is calculated to fuse the results dynamically. This attention mechanism permits each neuron to adaptively modify its receptive field size in response to multiscale input information. A local cross-channel interaction strategy without dimensionality reduction is introduced, which effectively eliminates the influence of that reduction on learning channel attention. Through suitable cross-channel interactions, model complexity can be significantly reduced while maintaining performance. Subsequently, the global interaction between the encoder features is used to extract more fine-grained features. Simultaneously, a mixed loss function combining weighted cross-entropy loss and Dice loss is used to alleviate category imbalance and achieve better results when the number of samples is unbalanced across categories. We evaluated our proposed method on abdominal multiorgan segmentation and cardiac segmentation datasets, achieving a Dice similarity coefficient of 80.30% and a 95% Hausdorff distance of 14.55 on the Synapse dataset and a Dice similarity coefficient of 90.80% on the ACDC dataset. The experimental results show that our proposed method has good generalization ability and robustness and is a powerful tool for medical image segmentation. Lippincott Williams & Wilkins 2023-09-29 /pmc/articles/PMC10545043/ /pubmed/37773842 http://dx.doi.org/10.1097/MD.0000000000035328 Text en Copyright © 2023 the Author(s). Published by Wolters Kluwer Health, Inc. https://creativecommons.org/licenses/by-nc/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial License 4.0 (CC BY-NC) (https://creativecommons.org/licenses/by-nc/4.0/), where it is permissible to download, share, remix, transform, and build upon the work provided it is properly cited. The work cannot be used commercially without permission from the journal.
spellingShingle 4100
Shen, Longfeng
Wang, Qiong
Zhang, Yingjie
Qin, Fenglan
Jin, Hengjun
Zhao, Wei
DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation
title DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation
title_full DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation
title_fullStr DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation
title_full_unstemmed DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation
title_short DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation
title_sort dskca-unet: dynamic selective kernel channel attention for medical image segmentation
topic 4100
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10545043/
https://www.ncbi.nlm.nih.gov/pubmed/37773842
http://dx.doi.org/10.1097/MD.0000000000035328
work_keys_str_mv AT shenlongfeng dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation
AT wangqiong dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation
AT zhangyingjie dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation
AT qinfenglan dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation
AT jinhengjun dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation
AT zhaowei dskcaunetdynamicselectivekernelchannelattentionformedicalimagesegmentation