Cargando…

Toward Exploiting Second-Order Feature Statistics for Arbitrary Image Style Transfer

Generating images of artistic style from input images, also known as image style transfer, has been improved in the quality of output style and the speed of image generation since deep neural networks have been applied in the field of computer vision research. However, the previous approaches used f...

Descripción completa

Detalles Bibliográficos
Autor principal: Choi, Hyun-Chul
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2022
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9003536/
https://www.ncbi.nlm.nih.gov/pubmed/35408228
http://dx.doi.org/10.3390/s22072611
Descripción
Sumario:Generating images of artistic style from input images, also known as image style transfer, has been improved in the quality of output style and the speed of image generation since deep neural networks have been applied in the field of computer vision research. However, the previous approaches used feature alignment techniques that were too simple in their transform layer to cover the characteristics of style features of images. In addition, they used an inconsistent combination of transform layers and loss functions in the training phase to embed arbitrary styles in a decoder network. To overcome these shortcomings, the second-order statistics of the encoded features are exploited to build an optimal arbitrary image style transfer technique. First, a new correlation-aware loss and a correlation-aware feature alignment technique are proposed. Using this consistent combination of loss and feature alignment methods strongly matches the second-order statistics of content features to those of the target-style features and, accordingly, the style capacity of the decoder network is increased. Secondly, a new component-wise style controlling method is proposed. This method can generate various styles from one or several style images by using style-specific components from second-order feature statistics. We experimentally prove that the proposed method achieves improvements in both the style capacity of the decoder network and the style variety without losing the ability of real-time processing (less than 200 ms) on Graphics Processing Unit (GPU) devices.