Cargando…

End-to-End Sentence-Level Multi-View Lipreading Architecture with Spatial Attention Module Integrated Multiple CNNs and Cascaded Local Self-Attention-CTC

Concomitant with the recent advances in deep learning, automatic speech recognition and visual speech recognition (VSR) have received considerable attention. However, although VSR systems must identify speech from both frontal and profile faces in real-world scenarios, most VSR studies have focused...

Descripción completa

Detalles Bibliográficos
Autores principales: Jeon, Sanghun, Kim, Mun Sang
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2022
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9099765/
https://www.ncbi.nlm.nih.gov/pubmed/35591284
http://dx.doi.org/10.3390/s22093597