Are open set classification methods effective on large-scale datasets?

Supervised classification methods often assume the train and test data distributions are the same and that all classes in the test set are present in the training set. However, deployed classifiers often require the ability to recognize inputs from outside the training set as unknowns. This problem...

Bibliographic Details
Main Authors:	Roady, Ryne, Hayes, Tyler L., Kemker, Ronald, Gonzales, Ayesha, Kanan, Christopher
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2020
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7473573/ https://www.ncbi.nlm.nih.gov/pubmed/32886692 http://dx.doi.org/10.1371/journal.pone.0238302

author	Roady, Ryne Hayes, Tyler L. Kemker, Ronald Gonzales, Ayesha Kanan, Christopher
collection	PubMed
description	Supervised classification methods often assume the train and test data distributions are the same and that all classes in the test set are present in the training set. However, deployed classifiers often require the ability to recognize inputs from outside the training set as unknowns. This problem has been studied under multiple paradigms including out-of-distribution detection and open set recognition. For convolutional neural networks, there have been two major approaches: 1) inference methods to separate knowns from unknowns and 2) feature space regularization strategies to improve model robustness to novel inputs. Up to this point, there has been little attention to exploring the relationship between the two approaches and directly comparing performance on large-scale datasets that have more than a few dozen categories. Using the ImageNet ILSVRC-2012 large-scale classification dataset, we identify novel combinations of regularization and specialized inference methods that perform best across multiple open set classification problems of increasing difficulty level. We find that input perturbation and temperature scaling yield significantly better performance on large-scale datasets than other inference methods tested, regardless of the feature space regularization strategy. Conversely, we find that improving performance with advanced regularization schemes during training yields better performance when baseline inference techniques are used; however, when advanced inference methods are used to detect open set classes, the utility of these cumbersome training paradigms is less evident.
format	Online Article Text
id	pubmed-7473573
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-7473573 2020-09-14 Are open set classification methods effective on large-scale datasets? Roady, Ryne Hayes, Tyler L. Kemker, Ronald Gonzales, Ayesha Kanan, Christopher PLoS One Research Article
Public Library of Science 2020-09-04 /pmc/articles/PMC7473573/ /pubmed/32886692 http://dx.doi.org/10.1371/journal.pone.0238302 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title	Are open set classification methods effective on large-scale datasets?
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7473573/ https://www.ncbi.nlm.nih.gov/pubmed/32886692 http://dx.doi.org/10.1371/journal.pone.0238302
