
TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography

Optical coherence tomography (OCT) offers unique advantages in ophthalmic examinations owing to its noncontact, high-resolution, and noninvasive nature, and it has become one of the most important modalities for identifying and evaluating retinal abnormalities. Segmentation of laminar struct...


Bibliographic Details
Main authors: Zhang, Yiheng; Li, Zhongliang; Nan, Nan; Wang, Xiangzhao
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10146870/
https://www.ncbi.nlm.nih.gov/pubmed/37109505
http://dx.doi.org/10.3390/life13040976
_version_ 1785034682174275584
author Zhang, Yiheng
Li, Zhongliang
Nan, Nan
Wang, Xiangzhao
collection PubMed
description Optical coherence tomography (OCT) offers unique advantages in ophthalmic examinations owing to its noncontact, high-resolution, and noninvasive nature, and it has become one of the most important modalities for identifying and evaluating retinal abnormalities. Segmentation of laminar structures and lesion tissues in retinal OCT images can provide quantitative information on retinal morphology and reliable guidance for clinical diagnosis and treatment. Convolutional neural networks (CNNs) have achieved success in various medical image segmentation tasks. However, the receptive field of convolution is inherently local, which limits mainstream CNN-based frameworks; this limitation remains evident when recognizing morphological changes in retinal OCT images. In this study, we propose an end-to-end network, TranSegNet, which incorporates a hybrid encoder that combines the advantages of a lightweight vision transformer (ViT) and the U-shaped network. CNN features at multiple scales are extracted by an improved U-net backbone, and a ViT with multi-head convolutional attention is introduced to capture feature information globally, enabling accurate localization and segmentation of retinal layers and lesion tissues. The experimental results show that the hybrid CNN-ViT is a strong encoder for retinal OCT image segmentation tasks and that the lightweight design reduces its parameter size and computational complexity while maintaining its performance. Applied separately to healthy and diseased retinal OCT datasets, TranSegNet demonstrated greater efficiency, accuracy, and robustness in segmenting retinal layers and accumulated fluid than four advanced segmentation methods: FCN, SegNet, Unet, and TransUnet.
format Online
Article
Text
id pubmed-10146870
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10146870 2023-04-29 TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography Zhang, Yiheng Li, Zhongliang Nan, Nan Wang, Xiangzhao Life (Basel) Article MDPI 2023-04-10 /pmc/articles/PMC10146870/ /pubmed/37109505 http://dx.doi.org/10.3390/life13040976 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography
topic Article