A Review of Multi-Modal Learning from the Text-Guided Visual Processing Viewpoint
For decades, co-relating different data domains to attain the maximum potential of machines has driven research, especially in neural networks. Similarly, text and visual data (images and videos) are two distinct data domains with extensive research in the past. Recently, using natural language to p...
Main authors: | Ullah, Ubaid; Lee, Jeong-Sik; An, Chang-Hyeon; Lee, Hyeonjin; Park, Su-Yeong; Baek, Rock-Hyun; Choi, Hyun-Chul |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9503702/ https://www.ncbi.nlm.nih.gov/pubmed/36146161 http://dx.doi.org/10.3390/s22186816 |
Similar items
- Learning-Based Ordering Characters on Ancient Document
  by: Lee, Hyeonjin, et al.
  Published: (2022)
- Arbitrary Font Generation by Encoder Learning of Disentangled Features
  by: Lee, Jeong-Sik, et al.
  Published: (2022)
- Cross-Modal Object Recognition Is Viewpoint-Independent
  by: Lacey, Simon, et al.
  Published: (2007)
- Statistical and Visual Analysis of Audio, Text, and Image Features for Multi-Modal Music Genre Recognition
  by: Wilkes, Ben, et al.
  Published: (2021)
- Multi-View Visual Question Answering with Active Viewpoint Selection
  by: Qiu, Yue, et al.
  Published: (2020)