Cargando…

Improving work detection by segmentation heuristics pre-training on factory operations video

The measurement of work time for individual tasks by using video has made a significant contribution to a framework for productivity improvement such as value stream mapping (VSM). In the past, the work time has been often measured manually, but this process is quite costly and labor-intensive. For...

Descripción completa

Detalles Bibliográficos
Autores principales: Kataoka, Shotaro, Ito, Tetsuro, Iwaka, Genki, Oba, Masashi, Nonaka, Hirofumi
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2022
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9173629/
https://www.ncbi.nlm.nih.gov/pubmed/35671292
http://dx.doi.org/10.1371/journal.pone.0267457
Descripción
Sumario:The measurement of work time for individual tasks by using video has made a significant contribution to a framework for productivity improvement such as value stream mapping (VSM). In the past, the work time has been often measured manually, but this process is quite costly and labor-intensive. For these reasons, automation of work analysis at the worksite is needed. There are two main methods for computing spatio-temporal information: by 3D-CNN, and by temporal computation using LSTM after feature extraction in the spatial domain by 2D-CNN. These methods has high computational cost but high model representational power, and the latter has low computational cost but relatively low model representational power. In the manufacturing industry, the use of local computers to make inferences is often required for practicality and confidentiality reasons, necessitating a low computational cost, and so the latter, a lightweight model, needs to have improved performance. Therefore, in this paper, we propose a method that pre-trains the image encoder module of a work detection model using an image segmentation model. This is based on the CNN-LSTM structure, which separates spatial and temporal computation and enables us to include heuristics such as workers’ body parts and work tools in the CNN module. Experimental results demonstrate that our pre-training method reduces over-fitting and provides a greater improvement in detection performance than pre-training on ImageNet.