A Facial Feature and Lip Movement Enhanced Audio-Visual Speech Separation Model
The cocktail party problem can be addressed more effectively by leveraging both the speaker’s visual and audio information. This paper proposes a method to improve audio separation using two visual cues: facial features and lip movement. Firstly, residual connections are introduced in the audio sep...
Main authors: | , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10647675/ https://www.ncbi.nlm.nih.gov/pubmed/37960477 http://dx.doi.org/10.3390/s23218770 |