
Progressive Learning of a Multimodal Classifier Accounting for Different Modality Combinations


Bibliographic Details
Main Authors:	John, Vijay; Kawanishi, Yasutomo
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10223146/ https://www.ncbi.nlm.nih.gov/pubmed/37430579 http://dx.doi.org/10.3390/s23104666

author	John, Vijay; Kawanishi, Yasutomo
collection	PubMed
description	In classification tasks, such as face recognition and emotion recognition, multimodal information is used for accurate classification. Once a multimodal classification model is trained with a set of modalities, it estimates the class label by using the entire modality set. A trained classifier is typically not formulated to perform classification for various subsets of modalities. The model would be more useful and portable if it could be used with any subset of modalities. We refer to this problem as the multimodal portability problem. Moreover, in the multimodal model, classification accuracy is reduced when one or more modalities are missing. We term this problem the missing modality problem. This article proposes a novel deep learning model, termed KModNet, and a novel learning strategy, termed progressive learning, to simultaneously address the missing modality and multimodal portability problems. KModNet, built on the transformer, contains multiple branches corresponding to different k-combinations of the modality set S. KModNet is trained using a multi-step progressive learning framework, where the k-th step uses a k-modal model to train different branches up to the k-th combination branch. To address the missing modality problem, the multimodal training data is randomly ablated. The proposed learning framework is formulated and validated using two multimodal classification problems: audio-video-thermal person classification and audio-video emotion classification. The two classification problems are validated using the Speaking Faces, RAVDESS, and SAVEE datasets. The results demonstrate that the progressive learning framework enhances the robustness of multimodal classification, even under the conditions of missing modalities, while being portable to different modality subsets.
format	Online Article Text
id	pubmed-10223146
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-10223146 2023-05-28 Progressive Learning of a Multimodal Classifier Accounting for Different Modality Combinations John, Vijay; Kawanishi, Yasutomo Sensors (Basel) Article MDPI 2023-05-11 /pmc/articles/PMC10223146/ /pubmed/37430579 http://dx.doi.org/10.3390/s23104666 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Progressive Learning of a Multimodal Classifier Accounting for Different Modality Combinations
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10223146/ https://www.ncbi.nlm.nih.gov/pubmed/37430579 http://dx.doi.org/10.3390/s23104666
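
Illustrative note: the abstract in this record mentions branches for the k-combinations of a modality set S and random ablation of the training data. The Python sketch below only illustrates those two ideas; it is not the authors' KModNet code. The function names, the 0.5 drop probability, and the rule of always keeping at least one modality are assumptions made for the example.

```python
import random
from itertools import combinations

# Modality set S for the audio-video-thermal task named in the abstract.
MODALITIES = ("audio", "video", "thermal")


def k_combinations(modalities, k):
    """Return every k-element subset of the modality set."""
    return [frozenset(c) for c in combinations(modalities, k)]


def all_branches(modalities):
    """Group the non-empty modality subsets by size k (hypothetical branch keys)."""
    return {k: k_combinations(modalities, k) for k in range(1, len(modalities) + 1)}


def ablate(sample, rng, drop_prob=0.5):
    """Randomly blank out modalities in one training sample.

    Assumption: each modality is dropped independently with probability
    drop_prob, and at least one modality is always kept.
    """
    keep = [m for m in sample if rng.random() >= drop_prob]
    if not keep:
        keep = [rng.choice(sorted(sample))]
    return {m: (v if m in keep else None) for m, v in sample.items()}


if __name__ == "__main__":
    branches = all_branches(MODALITIES)
    print(sum(len(subsets) for subsets in branches.values()))  # 7 non-empty subsets
    rng = random.Random(0)
    print(ablate({"audio": "a.wav", "video": "v.mp4", "thermal": "t.png"}, rng))
```

For three modalities this gives three unimodal, three bimodal, and one trimodal subset. How the paper maps these subsets onto network branches and training steps is described in the article itself, not in this record.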
