Multimodal image translation via deep learning inference model trained in video domain

BACKGROUND: Current medical image translation is implemented in the image domain. Considering that medical image acquisition is essentially a temporally continuous process, we attempt to develop a novel image translation framework via deep learning trained in the video domain for generating synthesized computed tomography (CT) images from cone-beam computed tomography (CBCT) images.

Full description

Bibliographic Details
Main Authors: Fan, Jiawei, Liu, Zhiqiang, Yang, Dong, Qiao, Jian, Zhao, Jun, Wang, Jiazhou, Hu, Weigang
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9281162/
https://www.ncbi.nlm.nih.gov/pubmed/35836126
http://dx.doi.org/10.1186/s12880-022-00854-x
_version_ 1784746818863628288
author Fan, Jiawei
Liu, Zhiqiang
Yang, Dong
Qiao, Jian
Zhao, Jun
Wang, Jiazhou
Hu, Weigang
author_facet Fan, Jiawei
Liu, Zhiqiang
Yang, Dong
Qiao, Jian
Zhao, Jun
Wang, Jiazhou
Hu, Weigang
author_sort Fan, Jiawei
collection PubMed
description BACKGROUND: Current medical image translation is implemented in the image domain. Considering that medical image acquisition is essentially a temporally continuous process, we attempt to develop a novel image translation framework via deep learning trained in the video domain for generating synthesized computed tomography (CT) images from cone-beam computed tomography (CBCT) images. METHODS: For a proof-of-concept demonstration, CBCT and CT images from 100 patients were collected to demonstrate the feasibility and reliability of the proposed framework. The CBCT and CT images were further registered as paired samples and used as the input data for supervised model training. A vid2vid framework based on the conditional GAN network, with carefully designed generators, discriminators, and a new spatio-temporal learning objective, was applied to realize the CBCT–CT image translation in the video domain. Four evaluation metrics, namely mean absolute error (MAE), peak signal-to-noise ratio (PSNR), normalized cross-correlation (NCC), and structural similarity (SSIM), were calculated on all the real and synthetic CT images from 10 new testing patients to assess the model performance. RESULTS: The average values of the four evaluation metrics MAE, PSNR, NCC, and SSIM are 23.27 ± 5.53 Hounsfield units (HU), 32.67 ± 1.98 dB, 0.99 ± 0.0059, and 0.97 ± 0.028, respectively. Most of the pixel-wise HU differences between real and synthetic CT images are within 50 HU. The synthetic CT images show close agreement with the real CT images, and the image quality is improved, with lower noise and fewer artifacts than the CBCT images. CONCLUSIONS: We developed a deep-learning-based approach to address the medical image translation problem in the video domain. Although the feasibility and reliability of the proposed framework were demonstrated on CBCT–CT image translation, the framework can be easily extended to other types of medical images. The current results illustrate that it is a very promising method that may pave a new path for medical image translation research.
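The four evaluation metrics named in the abstract are standard image-similarity measures. As a minimal sketch (not code from the paper), they could be computed per patient as below, assuming the real and synthetic CT volumes are same-shape NumPy arrays in Hounsfield units; the function name evaluate_pair and the choice of scikit-image for SSIM are illustrative assumptions.

    import numpy as np
    from skimage.metrics import structural_similarity

    def evaluate_pair(real_ct: np.ndarray, synth_ct: np.ndarray) -> dict:
        """MAE, PSNR, NCC, and SSIM between two same-shape HU volumes.

        Illustrative sketch; not code from the paper.
        """
        real = real_ct.astype(np.float64)
        synth = synth_ct.astype(np.float64)
        diff = real - synth

        # Mean absolute error, in HU.
        mae = np.mean(np.abs(diff))

        # Peak signal-to-noise ratio (dB) over the real volume's dynamic range.
        data_range = float(real.max() - real.min())
        mse = np.mean(diff ** 2)
        psnr = 10.0 * np.log10(data_range ** 2 / mse)

        # Normalized cross-correlation of the zero-mean volumes.
        rz = real - real.mean()
        sz = synth - synth.mean()
        ncc = np.sum(rz * sz) / np.sqrt(np.sum(rz ** 2) * np.sum(sz ** 2))

        # Structural similarity (scikit-image's N-dimensional implementation).
        ssim = structural_similarity(real, synth, data_range=data_range)

        return {"MAE": mae, "PSNR": psnr, "NCC": ncc, "SSIM": ssim}

Averaging these per-patient values over the 10 testing patients would yield summary statistics of the form reported under RESULTS.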
format Online
Article
Text
id pubmed-9281162
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9281162 2022-07-15 Multimodal image translation via deep learning inference model trained in video domain Fan, Jiawei Liu, Zhiqiang Yang, Dong Qiao, Jian Zhao, Jun Wang, Jiazhou Hu, Weigang BMC Med Imaging Research BACKGROUND: Current medical image translation is implemented in the image domain. Considering that medical image acquisition is essentially a temporally continuous process, we attempt to develop a novel image translation framework via deep learning trained in the video domain for generating synthesized computed tomography (CT) images from cone-beam computed tomography (CBCT) images. METHODS: For a proof-of-concept demonstration, CBCT and CT images from 100 patients were collected to demonstrate the feasibility and reliability of the proposed framework. The CBCT and CT images were further registered as paired samples and used as the input data for supervised model training. A vid2vid framework based on the conditional GAN network, with carefully designed generators, discriminators, and a new spatio-temporal learning objective, was applied to realize the CBCT–CT image translation in the video domain. Four evaluation metrics, namely mean absolute error (MAE), peak signal-to-noise ratio (PSNR), normalized cross-correlation (NCC), and structural similarity (SSIM), were calculated on all the real and synthetic CT images from 10 new testing patients to assess the model performance. RESULTS: The average values of the four evaluation metrics MAE, PSNR, NCC, and SSIM are 23.27 ± 5.53 Hounsfield units (HU), 32.67 ± 1.98 dB, 0.99 ± 0.0059, and 0.97 ± 0.028, respectively. Most of the pixel-wise HU differences between real and synthetic CT images are within 50 HU. The synthetic CT images show close agreement with the real CT images, and the image quality is improved, with lower noise and fewer artifacts than the CBCT images. CONCLUSIONS: We developed a deep-learning-based approach to address the medical image translation problem in the video domain. Although the feasibility and reliability of the proposed framework were demonstrated on CBCT–CT image translation, the framework can be easily extended to other types of medical images. The current results illustrate that it is a very promising method that may pave a new path for medical image translation research. BioMed Central 2022-07-14 /pmc/articles/PMC9281162/ /pubmed/35836126 http://dx.doi.org/10.1186/s12880-022-00854-x Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/).
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/ (https://creativecommons.org/publicdomain/zero/1.0/) ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
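For context on the vid2vid framework mentioned under METHODS: in the general vid2vid formulation (Wang et al., video-to-video synthesis), a sequential generator is trained against both an image discriminator and a video discriminator, with a flow-based term for temporal consistency. A sketch of that objective in LaTeX, given as background rather than as the paper's exact "spatio-temporal learning objective", is:

    \min_{G}\Big(\max_{D_I}\mathcal{L}_I(G, D_I) + \max_{D_V}\mathcal{L}_V(G, D_V)\Big) + \lambda_W\,\mathcal{L}_W(G)

Here \mathcal{L}_I is the conditional GAN loss on single frames (individual CT slices), \mathcal{L}_V is the conditional GAN loss on short frame sequences scored by the video discriminator D_V, \mathcal{L}_W is a flow-estimation loss encouraging consistency between consecutive synthesized slices, and \lambda_W weights that temporal term.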
spellingShingle Research
Fan, Jiawei
Liu, Zhiqiang
Yang, Dong
Qiao, Jian
Zhao, Jun
Wang, Jiazhou
Hu, Weigang
Multimodal image translation via deep learning inference model trained in video domain
title Multimodal image translation via deep learning inference model trained in video domain
title_full Multimodal image translation via deep learning inference model trained in video domain
title_fullStr Multimodal image translation via deep learning inference model trained in video domain
title_full_unstemmed Multimodal image translation via deep learning inference model trained in video domain
title_short Multimodal image translation via deep learning inference model trained in video domain
title_sort multimodal image translation via deep learning inference model trained in video domain
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9281162/
https://www.ncbi.nlm.nih.gov/pubmed/35836126
http://dx.doi.org/10.1186/s12880-022-00854-x
work_keys_str_mv AT fanjiawei multimodalimagetranslationviadeeplearninginferencemodeltrainedinvideodomain
AT liuzhiqiang multimodalimagetranslationviadeeplearninginferencemodeltrainedinvideodomain
AT yangdong multimodalimagetranslationviadeeplearninginferencemodeltrainedinvideodomain
AT qiaojian multimodalimagetranslationviadeeplearninginferencemodeltrainedinvideodomain
AT zhaojun multimodalimagetranslationviadeeplearninginferencemodeltrainedinvideodomain
AT wangjiazhou multimodalimagetranslationviadeeplearninginferencemodeltrainedinvideodomain
AT huweigang multimodalimagetranslationviadeeplearninginferencemodeltrainedinvideodomain