
Auxiliary signal-guided knowledge encoder-decoder for medical report generation

Bibliographic Details
Main Authors: Li, Mingjie, Liu, Rui, Wang, Fuyu, Chang, Xiaojun, Liang, Xiaodan
Format: Online Article Text
Language: English
Published: Springer US 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9417931/
https://www.ncbi.nlm.nih.gov/pubmed/36060430
http://dx.doi.org/10.1007/s11280-022-01013-6
_version_ 1784776833913323520
author Li, Mingjie
Liu, Rui
Wang, Fuyu
Chang, Xiaojun
Liang, Xiaodan
author_facet Li, Mingjie
Liu, Rui
Wang, Fuyu
Chang, Xiaojun
Liang, Xiaodan
author_sort Li, Mingjie
collection PubMed
description Medical reports have significant clinical value to radiologists and specialists, especially during a pandemic such as COVID. However, beyond the common difficulties faced in natural image captioning, medical report generation specifically requires the model to describe a medical image with a fine-grained, semantically coherent paragraph that satisfies both medical commonsense and logic. Previous works generally extract global image features and attempt to generate a paragraph similar to the referenced reports; however, this approach has two limitations. First, the regions of primary interest to radiologists are usually located in a small area of the global image, meaning that the remaining parts of the image can be considered irrelevant noise during training. Second, many similar sentences are used in each medical report to describe the normal regions of the image, which causes serious data bias. This bias is likely to teach models to generate these inessential sentences on a regular basis. To address these problems, we propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) that mimics radiologists' working patterns. Specifically, auxiliary patches are explored to expand the widely used visual patch features before they are fed to the Transformer encoder, while external linguistic signals help the decoder better master prior knowledge during the pre-training process. Our approach performs well on common benchmarks, including CX-CHR, IU X-Ray, and the COVID-19 CT Report dataset (COV-CTR), demonstrating that combining auxiliary signals with a Transformer architecture can bring significant improvement to medical report generation. The experimental results confirm that auxiliary-signal-driven Transformer-based models have solid capabilities and outperform previous approaches on both medical terminology classification and paragraph generation metrics.
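The encoder-side idea described in the abstract — expanding the usual visual patch features with auxiliary patch tokens before they enter the Transformer encoder — can be sketched minimally as a token-sequence concatenation. All shapes, variable names, and the plain concatenation below are illustrative assumptions for this sketch, not the authors' exact ASGK implementation.

```python
import numpy as np

# Minimal sketch (assumed shapes): the encoder input is the visual patch
# token sequence extended with auxiliary-signal patch tokens, so the
# Transformer encoder attends over both jointly.

rng = np.random.default_rng(0)

n_visual, n_aux, d_model = 49, 16, 512        # assumed patch counts / width
visual_patches = rng.standard_normal((n_visual, d_model))
auxiliary_patches = rng.standard_normal((n_aux, d_model))

# Expand the visual token sequence with the auxiliary tokens along the
# sequence axis before feeding the result to the encoder.
encoder_input = np.concatenate([visual_patches, auxiliary_patches], axis=0)

print(encoder_input.shape)  # (65, 512)
```

The point of the sketch is only that the auxiliary signal enlarges the encoder's input sequence rather than replacing the visual features; in the paper the auxiliary patches are derived from the image, not sampled randomly.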
format Online
Article
Text
id pubmed-9417931
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-9417931 2022-08-30 Auxiliary signal-guided knowledge encoder-decoder for medical report generation Li, Mingjie Liu, Rui Wang, Fuyu Chang, Xiaojun Liang, Xiaodan World Wide Web Article Medical reports have significant clinical value to radiologists and specialists, especially during a pandemic such as COVID. However, beyond the common difficulties faced in natural image captioning, medical report generation specifically requires the model to describe a medical image with a fine-grained, semantically coherent paragraph that satisfies both medical commonsense and logic. Previous works generally extract global image features and attempt to generate a paragraph similar to the referenced reports; however, this approach has two limitations. First, the regions of primary interest to radiologists are usually located in a small area of the global image, meaning that the remaining parts of the image can be considered irrelevant noise during training. Second, many similar sentences are used in each medical report to describe the normal regions of the image, which causes serious data bias. This bias is likely to teach models to generate these inessential sentences on a regular basis. To address these problems, we propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) that mimics radiologists' working patterns. Specifically, auxiliary patches are explored to expand the widely used visual patch features before they are fed to the Transformer encoder, while external linguistic signals help the decoder better master prior knowledge during the pre-training process. Our approach performs well on common benchmarks, including CX-CHR, IU X-Ray, and the COVID-19 CT Report dataset (COV-CTR), demonstrating that combining auxiliary signals with a Transformer architecture can bring significant improvement to medical report generation.
The experimental results confirm that auxiliary-signal-driven Transformer-based models have solid capabilities and outperform previous approaches on both medical terminology classification and paragraph generation metrics. Springer US 2022-08-27 2023 /pmc/articles/PMC9417931/ /pubmed/36060430 http://dx.doi.org/10.1007/s11280-022-01013-6 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Li, Mingjie
Liu, Rui
Wang, Fuyu
Chang, Xiaojun
Liang, Xiaodan
Auxiliary signal-guided knowledge encoder-decoder for medical report generation
title Auxiliary signal-guided knowledge encoder-decoder for medical report generation
title_full Auxiliary signal-guided knowledge encoder-decoder for medical report generation
title_fullStr Auxiliary signal-guided knowledge encoder-decoder for medical report generation
title_full_unstemmed Auxiliary signal-guided knowledge encoder-decoder for medical report generation
title_short Auxiliary signal-guided knowledge encoder-decoder for medical report generation
title_sort auxiliary signal-guided knowledge encoder-decoder for medical report generation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9417931/
https://www.ncbi.nlm.nih.gov/pubmed/36060430
http://dx.doi.org/10.1007/s11280-022-01013-6
work_keys_str_mv AT limingjie auxiliarysignalguidedknowledgeencoderdecoderformedicalreportgeneration
AT liurui auxiliarysignalguidedknowledgeencoderdecoderformedicalreportgeneration
AT wangfuyu auxiliarysignalguidedknowledgeencoderdecoderformedicalreportgeneration
AT changxiaojun auxiliarysignalguidedknowledgeencoderdecoderformedicalreportgeneration
AT liangxiaodan auxiliarysignalguidedknowledgeencoderdecoderformedicalreportgeneration