
A New CNN-Based Single-Ingredient Classification Model and Its Application in Food Image Segmentation

Bibliographic Details
Main authors: Zhu, Ziyi; Dai, Ying
Format: Online Article Text
Language: English
Published: MDPI 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10607895/
https://www.ncbi.nlm.nih.gov/pubmed/37888312
http://dx.doi.org/10.3390/jimaging9100205
Description
Summary: Food recognition benefits from separating each ingredient within a food image at the pixel level. Most existing research achieves food ingredient segmentation by training a segmentation network on datasets with pixel-level annotations. However, preparing such datasets is exceedingly hard and time-consuming. In this paper, we propose a new framework for ingredient segmentation that uses the feature maps of a CNN-based Single-Ingredient Classification Model trained on a dataset with image-level annotations. To train this model, we first introduce a standardized, biologically based hierarchical ingredient structure and construct a single-ingredient image dataset based on this structure. We then build a single-ingredient classification model on this dataset to serve as the backbone of the proposed framework. Within the framework, we extract feature maps from the single-ingredient classification model and propose two methods for processing these feature maps to segment the ingredients in food images. We introduce five evaluation metrics (IoU, Dice, Purity, Entirety, and Loss of GTs) to assess the performance of ingredient segmentation in terms of ingredient classification. Extensive experiments demonstrate the effectiveness of the proposed method: the optimal model achieves an mIoU of 0.65, mDice of 0.77, mPurity of 0.83, mEntirety of 0.80, and mLoGTs of 0.06 on the FoodSeg103 dataset. We believe that our approach lays the foundation for subsequent ingredient recognition.
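
Two of the five metrics named in the summary, IoU and Dice, have standard definitions for a pair of binary masks, and a minimal sketch may help make them concrete. The function names, the list-of-lists mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not the authors' evaluation code. Purity, Entirety, and Loss of GTs are specific to the paper and are not defined in this record, so they are not sketched here.

```python
from typing import Sequence, Tuple

# Assumed representation: a 2D binary mask, 1 = ingredient pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def _counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted pixel count, ground-truth pixel count)."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    inter = pred_sum = truth_sum = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            inter += p and t
            pred_sum += p
            truth_sum += t
    return inter, pred_sum, truth_sum


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union: |P & T| / |P | T|."""
    inter, p, t = _counts(pred, truth)
    union = p + t - inter
    # Assumption: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def dice(pred: Mask, truth: Mask) -> float:
    """Dice coefficient: 2 |P & T| / (|P| + |T|)."""
    inter, p, t = _counts(pred, truth)
    # Assumption: two empty masks count as a perfect match.
    return 2 * inter / (p + t) if (p + t) else 1.0


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    ground_truth = [[1, 0, 0],
                    [0, 1, 1]]
    print(iou(predicted, ground_truth), dice(predicted, ground_truth))
```

A mean value such as the mIoU reported above would then be an average of per-mask scores; how the paper groups masks before averaging is not stated in this record.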