Cargando…
TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography
Optical coherence tomography (OCT) provides unique advantages in ophthalmic examinations owing to its noncontact, high-resolution, and noninvasive features, which have evolved into one of the most crucial modalities for identifying and evaluating retinal abnormalities. Segmentation of laminar struct...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10146870/ https://www.ncbi.nlm.nih.gov/pubmed/37109505 http://dx.doi.org/10.3390/life13040976 |
_version_ | 1785034682174275584 |
---|---|
author | Zhang, Yiheng Li, Zhongliang Nan, Nan Wang, Xiangzhao |
author_facet | Zhang, Yiheng Li, Zhongliang Nan, Nan Wang, Xiangzhao |
author_sort | Zhang, Yiheng |
collection | PubMed |
description | Optical coherence tomography (OCT) provides unique advantages in ophthalmic examinations owing to its noncontact, high-resolution, and noninvasive features, which have evolved into one of the most crucial modalities for identifying and evaluating retinal abnormalities. Segmentation of laminar structures and lesion tissues in retinal OCT images can provide quantitative information on retinal morphology and reliable guidance for clinical diagnosis and treatment. Convolutional neural networks (CNNs) have achieved success in various medical image segmentation tasks. However, the receptive field of convolution has inherent locality constraints, resulting in limitations of mainstream frameworks based on CNNs, which is still evident in recognizing the morphological changes of retina OCT. In this study, we proposed an end-to-end network, TranSegNet, which incorporates a hybrid encoder that combines the advantages of a lightweight vision transformer (ViT) and the U-shaped network. The CNN features under multiscale resolution are extracted based on the improved U-net backbone, and a ViT with the multi-head convolutional attention is introduced to capture the feature information in a global view, realizing accurate localization and segmentation of retinal layers and lesion tissues. The experimental results illustrate that hybrid CNN-ViT is a strong encoder for retinal OCT image segmentation tasks and the lightweight design reduces its parameter size and computational complexity while maintaining its outstanding performance. By applying TranSegNet to healthy and diseased retinal OCT datasets separately, TranSegNet demonstrated superior efficiency, accuracy, and robustness in the segmentation results of retinal layers and accumulated fluid than the four advanced segmentation methods, such as FCN, SegNet, Unet and TransUnet. |
format | Online Article Text |
id | pubmed-10146870 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-101468702023-04-29 TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography Zhang, Yiheng Li, Zhongliang Nan, Nan Wang, Xiangzhao Life (Basel) Article Optical coherence tomography (OCT) provides unique advantages in ophthalmic examinations owing to its noncontact, high-resolution, and noninvasive features, which have evolved into one of the most crucial modalities for identifying and evaluating retinal abnormalities. Segmentation of laminar structures and lesion tissues in retinal OCT images can provide quantitative information on retinal morphology and reliable guidance for clinical diagnosis and treatment. Convolutional neural networks (CNNs) have achieved success in various medical image segmentation tasks. However, the receptive field of convolution has inherent locality constraints, resulting in limitations of mainstream frameworks based on CNNs, which is still evident in recognizing the morphological changes of retina OCT. In this study, we proposed an end-to-end network, TranSegNet, which incorporates a hybrid encoder that combines the advantages of a lightweight vision transformer (ViT) and the U-shaped network. The CNN features under multiscale resolution are extracted based on the improved U-net backbone, and a ViT with the multi-head convolutional attention is introduced to capture the feature information in a global view, realizing accurate localization and segmentation of retinal layers and lesion tissues. The experimental results illustrate that hybrid CNN-ViT is a strong encoder for retinal OCT image segmentation tasks and the lightweight design reduces its parameter size and computational complexity while maintaining its outstanding performance. By applying TranSegNet to healthy and diseased retinal OCT datasets separately, TranSegNet demonstrated superior efficiency, accuracy, and robustness in the segmentation results of retinal layers and accumulated fluid than the four advanced segmentation methods, such as FCN, SegNet, Unet and TransUnet. MDPI 2023-04-10 /pmc/articles/PMC10146870/ /pubmed/37109505 http://dx.doi.org/10.3390/life13040976 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zhang, Yiheng Li, Zhongliang Nan, Nan Wang, Xiangzhao TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography |
title | TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography |
title_full | TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography |
title_fullStr | TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography |
title_full_unstemmed | TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography |
title_short | TranSegNet: Hybrid CNN-Vision Transformers Encoder for Retina Segmentation of Optical Coherence Tomography |
title_sort | transegnet: hybrid cnn-vision transformers encoder for retina segmentation of optical coherence tomography |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10146870/ https://www.ncbi.nlm.nih.gov/pubmed/37109505 http://dx.doi.org/10.3390/life13040976 |
work_keys_str_mv | AT zhangyiheng transegnethybridcnnvisiontransformersencoderforretinasegmentationofopticalcoherencetomography AT lizhongliang transegnethybridcnnvisiontransformersencoderforretinasegmentationofopticalcoherencetomography AT nannan transegnethybridcnnvisiontransformersencoderforretinasegmentationofopticalcoherencetomography AT wangxiangzhao transegnethybridcnnvisiontransformersencoderforretinasegmentationofopticalcoherencetomography |