
Embracing limited and imperfect training datasets: opportunities and challenges in plant disease recognition using deep learning

Recent advancements in deep learning have brought significant improvements to plant disease recognition. However, achieving satisfactory performance often requires high-quality training datasets, which are challenging and expensive to collect. Consequently, the practical application of current deep...


Bibliographic Details
Main authors:	Xu, Mingle; Kim, Hyongsuk; Yang, Jucheng; Fuentes, Alvaro; Meng, Yao; Yoon, Sook; Kim, Taehyun; Park, Dong Sun
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2023
Subjects:	Plant Science
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10557492/ https://www.ncbi.nlm.nih.gov/pubmed/37810377 http://dx.doi.org/10.3389/fpls.2023.1225409

author	Xu, Mingle; Kim, Hyongsuk; Yang, Jucheng; Fuentes, Alvaro; Meng, Yao; Yoon, Sook; Kim, Taehyun; Park, Dong Sun
collection	PubMed
description	Recent advancements in deep learning have brought significant improvements to plant disease recognition. However, achieving satisfactory performance often requires high-quality training datasets, which are challenging and expensive to collect. Consequently, the practical application of current deep learning–based methods in real-world scenarios is hindered by the scarcity of high-quality datasets. In this paper, we argue that embracing poor datasets is viable and aim to explicitly define the challenges associated with using these datasets. To delve into this topic, we analyze the characteristics of high-quality datasets, namely, large-scale images and desired annotation, and contrast them with the limited and imperfect nature of poor datasets. Challenges arise when the training datasets deviate from these characteristics. To provide a comprehensive understanding, we propose a novel and informative taxonomy that categorizes these challenges. Furthermore, we offer a brief overview of existing studies and approaches that address these challenges. We point out that our paper sheds light on the importance of embracing poor datasets, enhances the understanding of the associated challenges, and contributes to the ambitious objective of deploying deep learning in real-world applications. To facilitate progress, we finally describe several outstanding questions and point out potential future directions. Although our primary focus is on plant disease recognition, we emphasize that the principles of embracing and analyzing poor datasets are applicable to a wider range of domains, including agriculture. Our project is publicly available at https://github.com/xml94/EmbracingLimitedImperfectTrainingDatasets.
format	Online Article Text
id	pubmed-10557492
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-10557492 2023-10-07 Embracing limited and imperfect training datasets: opportunities and challenges in plant disease recognition using deep learning Xu, Mingle Kim, Hyongsuk Yang, Jucheng Fuentes, Alvaro Meng, Yao Yoon, Sook Kim, Taehyun Park, Dong Sun Front Plant Sci Plant Science Recent advancements in deep learning have brought significant improvements to plant disease recognition. However, achieving satisfactory performance often requires high-quality training datasets, which are challenging and expensive to collect. Consequently, the practical application of current deep learning–based methods in real-world scenarios is hindered by the scarcity of high-quality datasets. In this paper, we argue that embracing poor datasets is viable and aim to explicitly define the challenges associated with using these datasets. To delve into this topic, we analyze the characteristics of high-quality datasets, namely, large-scale images and desired annotation, and contrast them with the limited and imperfect nature of poor datasets. Challenges arise when the training datasets deviate from these characteristics. To provide a comprehensive understanding, we propose a novel and informative taxonomy that categorizes these challenges. Furthermore, we offer a brief overview of existing studies and approaches that address these challenges. We point out that our paper sheds light on the importance of embracing poor datasets, enhances the understanding of the associated challenges, and contributes to the ambitious objective of deploying deep learning in real-world applications. To facilitate progress, we finally describe several outstanding questions and point out potential future directions. Although our primary focus is on plant disease recognition, we emphasize that the principles of embracing and analyzing poor datasets are applicable to a wider range of domains, including agriculture. Our project is publicly available at https://github.com/xml94/EmbracingLimitedImperfectTrainingDatasets. Frontiers Media S.A. 2023-09-22 /pmc/articles/PMC10557492/ /pubmed/37810377 http://dx.doi.org/10.3389/fpls.2023.1225409 Text en Copyright © 2023 Xu, Kim, Yang, Fuentes, Meng, Yoon, Kim and Park https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Embracing limited and imperfect training datasets: opportunities and challenges in plant disease recognition using deep learning
topic	Plant Science
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10557492/ https://www.ncbi.nlm.nih.gov/pubmed/37810377 http://dx.doi.org/10.3389/fpls.2023.1225409
