Translation-invariant optical neural network for image classification
The classification performance of all-optical Convolutional Neural Networks (CNNs) is greatly influenced by component misalignment and by translation of input images in practical applications. In this paper, we propose a free-space all-optical CNN (named Trans-ONN) which accurately classifies images translated in the horizontal, vertical, or diagonal directions. Trans-ONN takes advantage of an optical motion pooling layer which provides the translation invariance property by implementing different optical masks in the Fourier plane for classifying translated test images. Moreover, to enhance the translation invariance property, global average pooling (GAP) is utilized in the Trans-ONN structure rather than fully connected layers. Comparative studies confirm that combining the vertical and horizontal masks with the GAP operation provides the best translation invariance, compared to the alternative network models, for classifying horizontally and vertically shifted test images with up to 50-pixel shifts on the Kaggle Cats and Dogs, CIFAR-10, and MNIST datasets, respectively. Also, adopting the diagonal mask along with the GAP operation achieves the best classification accuracy for classifying test images translated in the diagonal direction for large numbers of pixel shifts (i.e., more than 30 pixels). It is worth mentioning that the proposed translation-invariant networks are capable of classifying translated test images not included in the training procedure.

Main authors: Sadeghzadeh, Hoda; Koohi, Somayyeh
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK, 2022
Subjects: Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9568607/ https://www.ncbi.nlm.nih.gov/pubmed/36241863 http://dx.doi.org/10.1038/s41598-022-22291-0
_version_ | 1784809677624705024 |
author | Sadeghzadeh, Hoda; Koohi, Somayyeh |
author_facet | Sadeghzadeh, Hoda; Koohi, Somayyeh |
author_sort | Sadeghzadeh, Hoda |
collection | PubMed |
description | The classification performance of all-optical Convolutional Neural Networks (CNNs) is greatly influenced by component misalignment and by translation of input images in practical applications. In this paper, we propose a free-space all-optical CNN (named Trans-ONN) which accurately classifies images translated in the horizontal, vertical, or diagonal directions. Trans-ONN takes advantage of an optical motion pooling layer which provides the translation invariance property by implementing different optical masks in the Fourier plane for classifying translated test images. Moreover, to enhance the translation invariance property, global average pooling (GAP) is utilized in the Trans-ONN structure rather than fully connected layers. Comparative studies confirm that combining the vertical and horizontal masks with the GAP operation provides the best translation invariance, compared to the alternative network models, for classifying horizontally and vertically shifted test images with up to 50-pixel shifts on the Kaggle Cats and Dogs, CIFAR-10, and MNIST datasets, respectively. Also, adopting the diagonal mask along with the GAP operation achieves the best classification accuracy for classifying test images translated in the diagonal direction for large numbers of pixel shifts (i.e., more than 30 pixels). It is worth mentioning that the proposed translation-invariant networks are capable of classifying translated test images not included in the training procedure. |
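The abstract rests on two invariance mechanisms: masking in the Fourier plane (where a translation of the input changes only the phase, not the magnitude, of the spectrum) and replacing fully connected layers with global average pooling (which discards spatial position entirely). The following is a minimal digital NumPy sketch of these two properties for circular shifts; it is an illustration of the underlying principles, not the paper's optical implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((32, 32))
# Circularly translate the image 5 pixels down and 7 pixels right.
shifted = np.roll(img, shift=(5, 7), axis=(0, 1))

# Fourier shift theorem: translation multiplies the spectrum by a pure
# phase factor, so the magnitude seen in the Fourier plane is unchanged.
mag = np.abs(np.fft.fft2(img))
mag_shifted = np.abs(np.fft.fft2(shifted))
print(np.allclose(mag, mag_shifted))  # True

# GAP: averaging a feature map over all positions is trivially
# translation invariant, unlike a position-sensitive fully connected readout.
print(np.isclose(img.mean(), shifted.mean()))  # True
```

For non-circular shifts the magnitude is only approximately preserved near the boundaries, which is one reason the paper evaluates accuracy as a function of shift size rather than claiming exact invariance.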
format | Online Article Text |
id | pubmed-9568607 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-95686072022-10-16 Translation-invariant optical neural network for image classification Sadeghzadeh, Hoda Koohi, Somayyeh Sci Rep Article The classification performance of all-optical Convolutional Neural Networks (CNNs) is greatly influenced by component misalignment and by translation of input images in practical applications. In this paper, we propose a free-space all-optical CNN (named Trans-ONN) which accurately classifies images translated in the horizontal, vertical, or diagonal directions. Trans-ONN takes advantage of an optical motion pooling layer which provides the translation invariance property by implementing different optical masks in the Fourier plane for classifying translated test images. Moreover, to enhance the translation invariance property, global average pooling (GAP) is utilized in the Trans-ONN structure rather than fully connected layers. Comparative studies confirm that combining the vertical and horizontal masks with the GAP operation provides the best translation invariance, compared to the alternative network models, for classifying horizontally and vertically shifted test images with up to 50-pixel shifts on the Kaggle Cats and Dogs, CIFAR-10, and MNIST datasets, respectively. Also, adopting the diagonal mask along with the GAP operation achieves the best classification accuracy for classifying test images translated in the diagonal direction for large numbers of pixel shifts (i.e., more than 30 pixels). It is worth mentioning that the proposed translation-invariant networks are capable of classifying translated test images not included in the training procedure.
Nature Publishing Group UK 2022-10-14 /pmc/articles/PMC9568607/ /pubmed/36241863 http://dx.doi.org/10.1038/s41598-022-22291-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Article Sadeghzadeh, Hoda Koohi, Somayyeh Translation-invariant optical neural network for image classification |
title | Translation-invariant optical neural network for image classification |
title_full | Translation-invariant optical neural network for image classification |
title_fullStr | Translation-invariant optical neural network for image classification |
title_full_unstemmed | Translation-invariant optical neural network for image classification |
title_short | Translation-invariant optical neural network for image classification |
title_sort | translation-invariant optical neural network for image classification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9568607/ https://www.ncbi.nlm.nih.gov/pubmed/36241863 http://dx.doi.org/10.1038/s41598-022-22291-0 |
work_keys_str_mv | AT sadeghzadehhoda translationinvariantopticalneuralnetworkforimageclassification AT koohisomayyeh translationinvariantopticalneuralnetworkforimageclassification |