
MULTIMODAL AFFECTIVE ANALYSIS OF FACIAL AND VOCAL EXPRESSIVITY USING SMARTPHONE AND DEEP LEARNING ANALYSIS

Limited emotional expressivity is one of the most common symptoms of major depression, particularly in older adults. Although assessing facial and vocal expressivity is important for accurate clinical evaluation of geriatric depression, research has rarely examined older adults via telehealth technology. This study aims to quantify facial and vocal expressivity with a multimodal affective system based on deep learning. A total of 19 Korean adults aged over 65 years with severe depressive symptoms participated. Using smartphone video recording, 1,429 facial and vocal samples were collected between July and December 2020, and the recorded videos were transmitted automatically to a cloud system. Basic facial movements were extracted from combined video frames and mel spectrogram images. Referenced against the Korean AI Hub image dataset, mood status was classified into seven categories (anger, disgust, fear, happiness, neutrality, sadness, and surprise), and the frequency of each mood was coded as a continuous variable for each participant in each recording. When video predictions were compared with text predictions to determine "true labels," the overall accuracy was 0.69, with F1 scores ranging from 0.57 to 0.79. The most common emotions were anger, happiness, neutrality, sadness, and surprise. These findings suggest that smartphone-recorded video could serve as a useful tool for quantifying mood expressivity, and the study establishes a preliminary method of affective assessment for older adults in telecare, based on socially assistive technology at a distance from the clinic.
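To make the described pipeline concrete, the following is a minimal sketch in Python (not the authors' code) of the kind of multimodal preprocessing the abstract outlines: sampling frames from a smartphone video and converting its audio track into a mel spectrogram image, the two inputs a downstream emotion classifier would consume. The sampling rate, frame stride, and the assumption that the audio track has been extracted to a separate file are all illustrative.

# A minimal sketch, not the authors' code: hypothetical preprocessing step.
import cv2          # OpenCV: frame extraction from the recorded video
import librosa      # audio loading and mel spectrogram computation
import numpy as np

EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "neutrality", "sadness", "surprise"]  # the seven categories above

def sample_frames(video_path, every_n=30):
    """Grab every n-th frame from a smartphone recording (illustrative rate)."""
    cap = cv2.VideoCapture(video_path)
    frames, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % every_n == 0:
            frames.append(frame)
        i += 1
    cap.release()
    return frames

def mel_spectrogram_image(audio_path, sr=16000, n_mels=64):
    """Turn the (separately extracted) audio track into a log-mel 'image'."""
    y, _ = librosa.load(audio_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)  # 2-D array, image-like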

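A similarly hedged sketch of the outcome coding and evaluation the abstract reports: per-recording emotion frequencies coded as continuous variables, and predictions scored against "true labels" with overall accuracy and per-class F1. scikit-learn is assumed, and the example label sequences are invented for illustration.

# A hedged sketch of outcome coding and evaluation; values are made up.
from collections import Counter
from sklearn.metrics import accuracy_score, f1_score

EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "neutrality", "sadness", "surprise"]

def mood_frequencies(predicted_labels):
    """Proportion of predictions per emotion in one recording: the
    continuous variables the abstract describes."""
    counts = Counter(predicted_labels)
    total = max(len(predicted_labels), 1)
    return {e: counts.get(e, 0) / total for e in EMOTIONS}

# Illustrative scoring of video predictions against "true labels":
true = ["sadness", "neutrality", "happiness", "sadness", "anger"]
pred = ["sadness", "neutrality", "surprise", "sadness", "anger"]
print(accuracy_score(true, pred))                 # overall accuracy
print(f1_score(true, pred, average=None,
               labels=EMOTIONS, zero_division=0))  # per-class F1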

Bibliographic Details
Main Authors: Kim, Heejung; Cho, Youngshin; Lee, Sunghee; Kang, Chaehyeon
Format: Online Article (Text)
Language: English
Published: Oxford University Press, 2022 (Innov Aging, Abstracts section)
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9766221/
http://dx.doi.org/10.1093/geroni/igac059.2221
License: © The Author(s) 2022. Published by Oxford University Press on behalf of The Gerontological Society of America. Open Access under the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/).