Visual-Text Reference Pretraining Model for Image Captioning
People can accurately describe an image by constantly referring to its visual information and key textual information. Inspired by this idea, we propose the VTR-PTM (Visual-Text Reference Pretraining Model) for image captioning. First, based on the pretraining model (BERT/UniLM), we design...
Main Authors: Li, Pengfei; Zhang, Min; Lin, Peijie; Wan, Jian; Jiang, Ming

Format: Online Article Text

Language: English

Published: Hindawi, 2022

Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8799330/ · https://www.ncbi.nlm.nih.gov/pubmed/35096050 · http://dx.doi.org/10.1155/2022/9400999
Similar Items
- Medical image captioning via generative pretrained transformers
  by: Selivanov, Alexander, et al.
  Published: (2023)
- Hotel Review Classification Based on the Text Pretraining Heterogeneous Graph Neural Network Model
  by: Zhang, Liyan, et al.
  Published: (2022)
- Comparison of Pretraining Models and Strategies for Health-Related Social Media Text Classification
  by: Guo, Yuting, et al.
  Published: (2022)
- An Improved Math Word Problem (MWP) Model Using Unified Pretrained Language Model (UniLM) for Pretraining
  by: Zhang, Dongqiu, et al.
  Published: (2022)
- To pretrain or not? A systematic analysis of the benefits of pretraining in diabetic retinopathy
  by: Srinivasan, Vignesh, et al.
  Published: (2022)