SpeakingFaces: A Large-Scale Multimodal Dataset of Voice Commands with Visual and Thermal Video Streams

We present SpeakingFaces, a publicly available large-scale multimodal dataset developed to support machine learning research in contexts that combine thermal, visual, and audio data streams; examples include human–computer interaction, biometric authentication, recognition systems...

Bibliographic Details
Main authors: Abdrakhmanova, Madina; Kuzdeuov, Askat; Jarju, Sheikh; Khassanov, Yerbolat; Lewis, Michael; Varol, Huseyin Atakan
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8156799/
https://www.ncbi.nlm.nih.gov/pubmed/34065700
http://dx.doi.org/10.3390/s21103465