One-Shot Learning with Pseudo-Labeling for Cattle Video Segmentation in Smart Livestock Farming
Main Authors:
Format: Online Article Text
Language: English
Published: MDPI, 2022
Subjects:
Online Access:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908826/
https://www.ncbi.nlm.nih.gov/pubmed/35268130
http://dx.doi.org/10.3390/ani12050558
Summary:

SIMPLE SUMMARY: Deep learning-based segmentation methods rely on large-scale pixel-labeled datasets to achieve good performance. However, labeling animal images is resource-costly due to their irregular contours and changing postures. To balance segmentation accuracy and speed using limited labeled data, we propose a one-shot learning-based approach with pseudo-labeling that segments animals in videos while relying on only one labeled frame. Experiments were conducted on a challenging feedlot cattle video dataset acquired by the authors, and the results show that the proposed method outperformed state-of-the-art methods such as one-shot video object segmentation (OSVOS) and one-shot modulation network (OSMN). Our proposed one-shot learning with pseudo-labeling reduces the reliance on labeled data and could serve as an enabling component for smart farming-related applications.

ABSTRACT: Computer vision-based technologies play a key role in precision livestock farming, and video-based analysis approaches have been advocated as useful tools for automatic animal monitoring, behavior analysis, and efficient welfare measurement and management. Accurately and efficiently segmenting animals’ contours from their backgrounds is a prerequisite for vision-based technologies. Deep learning-based segmentation methods have shown good performance when trained on large numbers of pixel-labeled images. However, labeling animal images is challenging and time-consuming due to their irregular contours and changing postures. To reduce the reliance on labeled images, a one-shot learning approach with pseudo-labeling is proposed that uses only one labeled image frame to segment animals in videos. The proposed approach mainly comprises an Xception-based Fully Convolutional Neural Network (Xception-FCN) module and a pseudo-labeling (PL) module. Xception-FCN utilizes depth-wise separable convolutions to learn visual features at different levels and produce dense predictions from the single labeled frame. PL then leverages the segmentation results of the Xception-FCN model to fine-tune the model, boosting performance in cattle video segmentation. Systematic experiments were conducted on a challenging feedlot cattle video dataset acquired by the authors; the proposed approach achieved a mean intersection-over-union score of 88.7% and a contour accuracy of 80.8%, outperforming state-of-the-art methods (OSVOS and OSMN). Our proposed one-shot learning approach could serve as an enabling component for livestock farming-related segmentation and detection applications.
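The two-stage idea described in the abstract (fine-tune on the single labeled frame, then reuse the model's own predictions as pseudo-labels for a second round of fine-tuning) can be illustrated with a minimal sketch. The sketch below assumes a PyTorch setup; the tiny network, random frame tensors, confidence threshold, and training schedule are illustrative placeholders standing in for the paper's Xception-FCN backbone and real video data, not the authors' implementation.

```python
# Minimal sketch of one-shot fine-tuning with pseudo-labeling for video
# segmentation. TinySegNet is only a stand-in for the Xception-FCN backbone;
# all tensors, thresholds, and epoch counts below are placeholder assumptions.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Placeholder network producing a 1-channel foreground logit map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),
        )

    def forward(self, x):
        return self.net(x)

def fine_tune(model, frames, masks, epochs=20, lr=1e-4):
    """Supervised fine-tuning on (frame, mask) pairs with binary cross-entropy."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    model.train()
    for _ in range(epochs):
        for frame, mask in zip(frames, masks):
            opt.zero_grad()
            loss = loss_fn(model(frame.unsqueeze(0)), mask.unsqueeze(0))
            loss.backward()
            opt.step()

@torch.no_grad()
def pseudo_labels(model, frames, threshold=0.5):
    """Binarize the model's predictions on unlabeled frames into pseudo-masks."""
    model.eval()
    labels = []
    for frame in frames:
        prob = torch.sigmoid(model(frame.unsqueeze(0)))[0]
        labels.append((prob > threshold).float())
    return labels

# Stage 1: adapt the network to the target video using the single labeled frame.
model = TinySegNet()
labeled_frame = torch.rand(3, 240, 320)                       # the one annotated frame
labeled_mask = torch.randint(0, 2, (1, 240, 320)).float()     # its ground-truth mask
fine_tune(model, [labeled_frame], [labeled_mask])

# Stage 2: generate pseudo-labels for the remaining frames, then fine-tune again
# on the labeled frame plus the pseudo-labeled frames.
unlabeled_frames = [torch.rand(3, 240, 320) for _ in range(4)]
pl_masks = pseudo_labels(model, unlabeled_frames)
fine_tune(model, [labeled_frame] + unlabeled_frames, [labeled_mask] + pl_masks)
```

The key design point the sketch tries to capture is that the second fine-tuning round never sees new human annotations: the extra supervision comes entirely from the model's own thresholded predictions on unlabeled frames of the same video.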