CNN-Based Multimodal Human Recognition in Surveillance Environments

Bibliographic Details
Main authors: Koo, Ja Hyung; Cho, Se Woon; Baek, Na Rae; Kim, Min Cheol; Park, Kang Ryoung
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6164664/
https://www.ncbi.nlm.nih.gov/pubmed/30208648
http://dx.doi.org/10.3390/s18093040
author Koo, Ja Hyung
Cho, Se Woon
Baek, Na Rae
Kim, Min Cheol
Park, Kang Ryoung
collection PubMed
description Most current research on human recognition focuses on re-identifying body images captured by multiple cameras in outdoor environments, whereas indoor human recognition has received little attention. Previous research on indoor recognition has focused mainly on face recognition because the camera is usually closer to the person indoors than outdoors. However, indoor surveillance cameras are installed near the ceiling and capture images from above, so people rarely look directly at them. Frontal face images are therefore often difficult to capture, and when they are unavailable, face recognition accuracy drops sharply. Using both the face and the body for recognition can mitigate this problem. With indoor cameras, however, often only part of the target body falls within the camera's viewing angle, which also reduces recognition accuracy. To address these problems, this paper proposes a multimodal human recognition method based on deep convolutional neural networks (CNNs) that uses both the face and the body. Specifically, to handle partially captured bodies, the face and body are recognized by separate CNNs (VGG Face-16 and ResNet-50, respectively), and the results are combined by score-level fusion using the weighted sum rule. Experiments on the custom-made Dongguk face and body database (DFB-DB1) and the open ChokePoint database show that the proposed method achieves equal error rates of 1.52% and 0.58%, respectively, outperforming single-modality face or body recognition and methods from previous studies.
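The score-level fusion and the equal error rate (EER) mentioned in the description above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the min-max normalization step, the default weight, and the convention that scores are distances (lower means more likely the same person) are all assumptions made for illustration, and the numbers in the usage note are toy values, not results from the paper.

```python
from typing import List, Sequence


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1] so both modalities share one range (assumed step)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def weighted_sum_fusion(
    face_scores: Sequence[float],
    body_scores: Sequence[float],
    face_weight: float = 0.5,  # hypothetical default; in practice tuned on training data
) -> List[float]:
    """Weighted sum rule: fused = w * face + (1 - w) * body, per comparison."""
    if len(face_scores) != len(body_scores):
        raise ValueError("face and body score lists must have the same length")
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must lie in [0, 1]")
    return [
        face_weight * f + (1.0 - face_weight) * b
        for f, b in zip(face_scores, body_scores)
    ]


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Approximate the EER for distance-like scores (lower = better match).

    Each observed score is tried as a threshold; the result is the mean of the
    false acceptance rate (FAR) and false rejection rate (FRR) at the threshold
    where the two rates are closest.
    """
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(g > t for g in genuine) / len(genuine)     # genuine pairs rejected
        far = sum(i <= t for i in impostor) / len(impostor)  # impostor pairs accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer
```

In use, the face and body distances for each comparison would first be normalized, then fused with `weighted_sum_fusion`, and the fused scores of genuine and impostor pairs passed to `equal_error_rate`. Fusing at the score level keeps the two CNNs independent, so a weak or partially captured modality can be down-weighted without retraining either network.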
format Online
Article
Text
id pubmed-6164664
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6164664 2018-10-10 CNN-Based Multimodal Human Recognition in Surveillance Environments. Koo, Ja Hyung; Cho, Se Woon; Baek, Na Rae; Kim, Min Cheol; Park, Kang Ryoung. Sensors (Basel). Article. MDPI 2018-09-11 /pmc/articles/PMC6164664/ /pubmed/30208648 http://dx.doi.org/10.3390/s18093040 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title CNN-Based Multimodal Human Recognition in Surveillance Environments
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6164664/
https://www.ncbi.nlm.nih.gov/pubmed/30208648
http://dx.doi.org/10.3390/s18093040