
Context-Aware Automatic Sign Language Video Transcription in Psychiatric Interviews †

Sign language (SL) translation constitutes an extremely challenging task when undertaken in a general unconstrained setup, especially in the absence of vast training datasets that enable the use of end-to-end solutions employing deep architectures. In such cases, the ability to incorporate prior inf...


Bibliographic Details
Main Authors: Pikoulis, Erion-Vasilis, Bifis, Aristeidis, Trigka, Maria, Constantinopoulos, Constantinos, Kosmopoulos, Dimitrios
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9003308/
https://www.ncbi.nlm.nih.gov/pubmed/35408270
http://dx.doi.org/10.3390/s22072656
_version_ 1784686102356951040
author Pikoulis, Erion-Vasilis
Bifis, Aristeidis
Trigka, Maria
Constantinopoulos, Constantinos
Kosmopoulos, Dimitrios
author_facet Pikoulis, Erion-Vasilis
Bifis, Aristeidis
Trigka, Maria
Constantinopoulos, Constantinos
Kosmopoulos, Dimitrios
author_sort Pikoulis, Erion-Vasilis
collection PubMed
description Sign language (SL) translation constitutes an extremely challenging task when undertaken in a general unconstrained setup, especially in the absence of vast training datasets that enable the use of end-to-end solutions employing deep architectures. In such cases, the ability to incorporate prior information can yield a significant improvement in the translation results by greatly restricting the search space of the potential solutions. In this work, we treat the translation problem in the limited confines of psychiatric interviews involving doctor-patient diagnostic sessions for deaf and hard-of-hearing patients with mental health problems. To overcome the lack of extensive training data and improve the obtained translation performance, we follow a domain-specific approach combining data-driven feature extraction with the incorporation of prior information drawn from the available domain knowledge. This knowledge enables us to model the context of the interviews by using an appropriately defined hierarchical ontology for the contained dialogue, allowing for the classification of the current state of the interview, based on the doctor’s question. Utilizing this information, video transcription is treated as a sentence retrieval problem. The goal is to predict the patient’s sentence that has been signed in the SL video based on the available pool of possible responses, given the context of the current exchange. Our experimental evaluation using simulated scenarios of psychiatric interviews demonstrates the significant gains of incorporating context awareness in the system’s decisions.
format Online
Article
Text
id pubmed-9003308
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9003308 2022-04-13 Context-Aware Automatic Sign Language Video Transcription in Psychiatric Interviews † Pikoulis, Erion-Vasilis Bifis, Aristeidis Trigka, Maria Constantinopoulos, Constantinos Kosmopoulos, Dimitrios Sensors (Basel) Article Sign language (SL) translation constitutes an extremely challenging task when undertaken in a general unconstrained setup, especially in the absence of vast training datasets that enable the use of end-to-end solutions employing deep architectures. In such cases, the ability to incorporate prior information can yield a significant improvement in the translation results by greatly restricting the search space of the potential solutions. In this work, we treat the translation problem in the limited confines of psychiatric interviews involving doctor-patient diagnostic sessions for deaf and hard-of-hearing patients with mental health problems. To overcome the lack of extensive training data and improve the obtained translation performance, we follow a domain-specific approach combining data-driven feature extraction with the incorporation of prior information drawn from the available domain knowledge. This knowledge enables us to model the context of the interviews by using an appropriately defined hierarchical ontology for the contained dialogue, allowing for the classification of the current state of the interview, based on the doctor’s question. Utilizing this information, video transcription is treated as a sentence retrieval problem. The goal is to predict the patient’s sentence that has been signed in the SL video based on the available pool of possible responses, given the context of the current exchange. Our experimental evaluation using simulated scenarios of psychiatric interviews demonstrates the significant gains of incorporating context awareness in the system’s decisions. MDPI 2022-03-30 /pmc/articles/PMC9003308/ /pubmed/35408270 http://dx.doi.org/10.3390/s22072656 Text en © 2022 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Pikoulis, Erion-Vasilis
Bifis, Aristeidis
Trigka, Maria
Constantinopoulos, Constantinos
Kosmopoulos, Dimitrios
Context-Aware Automatic Sign Language Video Transcription in Psychiatric Interviews †
title Context-Aware Automatic Sign Language Video Transcription in Psychiatric Interviews †
title_sort context-aware automatic sign language video transcription in psychiatric interviews †
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9003308/
https://www.ncbi.nlm.nih.gov/pubmed/35408270
http://dx.doi.org/10.3390/s22072656