Cargando…

Robust fusion for skin lesion segmentation of dermoscopic images

Robust skin lesion segmentation of dermoscopic images is still very difficult. Recent methods often take the combinations of CNN and Transformer for feature abstraction and multi-scale features for further classification. Both types of combination in general rely on some forms of feature fusion. Thi...

Descripción completa

Detalles Bibliográficos
Autores principales: Guo, Qingqing, Fang, Xianyong, Wang, Linbo, Zhang, Enming, Liu, Zhengyi
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Frontiers Media S.A. 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10069440/
https://www.ncbi.nlm.nih.gov/pubmed/37020509
http://dx.doi.org/10.3389/fbioe.2023.1057866
Descripción
Sumario:Robust skin lesion segmentation of dermoscopic images is still very difficult. Recent methods often take the combinations of CNN and Transformer for feature abstraction and multi-scale features for further classification. Both types of combination in general rely on some forms of feature fusion. This paper considers these fusions from two novel points of view. For abstraction, Transformer is viewed as the affinity exploration of different patch tokens and can be applied to attend CNN features in multiple scales. Consequently, a new fusion module, the Attention-based Transformer-And-CNN fusion module (ATAC), is proposed. ATAC augments the CNN features with more global contexts. For further classification, adaptively combining the information from multiple scales according to their contributions to object recognition is expected. Accordingly, a new fusion module, the GAting-based Multi-Scale fusion module (GAMS), is also introduced, which adaptively weights the information from multiple scales by the light-weighted gating mechanism. Combining ATAC and GAMS leads to a new encoder-decoder-based framework. In this method, ATAC acts as an encoder block to progressively abstract strong CNN features with rich global contexts attended by long-range relations, while GAMS works as an enhancement of the decoder to generate the discriminative features through adaptive fusion of multi-scale ones. This framework is especially good at lesions of varying sizes and shapes and of low contrasts and its performances are demonstrated with extensive experiments on public skin lesion segmentation datasets.