
Wide and deep neural networks achieve consistency for classification

While neural networks are used for classification tasks across domains, a long-standing open problem in machine learning is determining whether neural networks trained using standard procedures are consistent for classification, i.e., whether such models minimize the probability of misclassification for arbitrary data distributions. In this work, we identify and construct an explicit set of neural network classifiers that are consistent. Since effective neural networks in practice are typically both wide and deep, we analyze infinitely wide networks that are also infinitely deep. In particular, using the recent connection between infinitely wide neural networks and neural tangent kernels, we provide explicit activation functions that can be used to construct networks that achieve consistency. Interestingly, these activation functions are simple and easy to implement, yet differ from commonly used activations such as ReLU or sigmoid. More generally, we create a taxonomy of infinitely wide and deep networks and show that these models implement one of three well-known classifiers depending on the activation function used: 1) 1-nearest neighbor (model predictions are given by the label of the nearest training example); 2) majority vote (model predictions are given by the label of the class with the greatest representation in the training set); or 3) singular kernel classifiers (a set of classifiers containing those that achieve consistency). Our results highlight the benefit of using deep networks for classification tasks, in contrast to regression tasks, where excessive depth is harmful.
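For reference, "consistency" here carries its standard statistical meaning, which the abstract states informally. A minimal formalization in standard notation (not taken from the paper's text): writing R(f) for the misclassification risk of a classifier f on the pair (X, Y),

R(f) = \Pr[f(X) \neq Y], \qquad R^{*} = \inf_{f} R(f), \qquad \lim_{n \to \infty} R(\hat{f}_{n}) = R^{*},

i.e., a sequence of classifiers \hat{f}_{n}, each trained on n samples, is consistent if its risk converges to the Bayes risk R^{*} for every data distribution.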

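The three-way taxonomy in the abstract can be made concrete with a small sketch. The Python code below is illustrative only, not the paper's construction: the function names, the toy data, and the exponent alpha are assumptions, and the third function is a generic singular-kernel weighted vote (weights diverge as the query approaches a training point), the family the abstract says contains the consistent classifiers.

import numpy as np

def one_nearest_neighbor(X_train, y_train, x):
    # Limit 1: predict the label of the single closest training example.
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

def majority_vote(y_train):
    # Limit 2: predict the most common training label, ignoring the input x.
    labels, counts = np.unique(y_train, return_counts=True)
    return labels[np.argmax(counts)]

def singular_kernel_vote(X_train, y_train, x, alpha=2.0):
    # Limit 3 (sketch): a weighted vote whose kernel w(d) = d**(-alpha)
    # diverges as d -> 0, so the classifier interpolates the training labels.
    # The exponent alpha is an illustrative assumption, not the paper's choice.
    dists = np.linalg.norm(X_train - x, axis=1)
    weights = np.maximum(dists, 1e-12) ** (-alpha)
    return 1 if np.sum(weights * y_train) >= 0 else -1  # labels in {-1, +1}

# Toy usage with binary labels in {-1, +1}:
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
y = np.array([-1, 1, 1])
x = np.array([0.2, 0.1])
print(one_nearest_neighbor(X, y, x))  # -1: nearest example is [0, 0]
print(majority_vote(y))               # 1: the majority class
print(singular_kernel_vote(X, y, x))  # -1: the nearby point dominates the vote

Note the contrast the sketch exposes: 1-nearest neighbor follows a single example, majority vote ignores the input entirely, and the singular-kernel vote interpolates the training labels while still weighing every example.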
Bibliographic Details
Main Authors: Radhakrishnan, Adityanarayanan; Belkin, Mikhail; Uhler, Caroline
Format: Online Article (Text)
Language: English
Published: National Academy of Sciences, 2023 (online 2023-03-30)
Journal: Proc Natl Acad Sci U S A
Subjects: Physical Sciences
License: Copyright © 2023 the Author(s). Published by PNAS. Distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND), https://creativecommons.org/licenses/by-nc-nd/4.0/
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10083596/
https://www.ncbi.nlm.nih.gov/pubmed/36996114
http://dx.doi.org/10.1073/pnas.2208779120