
Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable, optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep-learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which, in part, explains the recently discovered phenomenon of neural collapse [V. Papyan, X. Y. Han, D. L. Donoho, Proc. Natl. Acad. Sci. U.S.A. 117, 24652–24663 (2020)]. More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto-unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep-learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon is first predicted by the Layer-Peeled Model before being confirmed by our computational experiments.
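For orientation, the optimization program the abstract refers to can be sketched as follows; the notation (classifier weights w_k, features h_{k,i}, constraint levels E_W and E_H) follows our reading of the paper, and the display below is a sketch rather than the paper's verbatim statement:

\[
\min_{\mathbf{W},\,\mathbf{H}}\;\frac{1}{N}\sum_{k=1}^{K}\sum_{i=1}^{n_k}\mathcal{L}\big(\mathbf{W}\mathbf{h}_{k,i},\,\mathbf{y}_k\big)
\quad\text{s.t.}\quad
\frac{1}{K}\sum_{k=1}^{K}\lVert\mathbf{w}_k\rVert_2^2\le E_W,
\qquad
\frac{1}{K}\sum_{k=1}^{K}\frac{1}{n_k}\sum_{i=1}^{n_k}\lVert\mathbf{h}_{k,i}\rVert_2^2\le E_H,
\]

where \(K\) is the number of classes, \(n_k\) the size of class \(k\), \(N=\sum_k n_k\), and \(\mathcal{L}\) typically the cross-entropy loss; the last-layer features \(\mathbf{h}_{k,i}\) are "peeled off" and optimized as free variables alongside the classifier \(\mathbf{W}=[\mathbf{w}_1,\dots,\mathbf{w}_K]\). In the balanced case (\(n_1=\cdots=n_K\)), any minimizer aligns the \(\mathbf{w}_k\) and the class-mean features into a simplex equiangular tight frame: \(K\) equal-norm vectors whose pairwise cosines all equal \(-1/(K-1)\).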


Bibliographic Details
Main Authors: Fang, Cong; He, Hangfeng; Long, Qi; Su, Weijie J.
Format: Online Article Text
Language: English
Published: National Academy of Sciences, 2021
Subjects: Physical Sciences
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8639364/
https://www.ncbi.nlm.nih.gov/pubmed/34675075
http://dx.doi.org/10.1073/pnas.2103091118
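To make the simplex-equiangular-tight-frame geometry above concrete, here is a minimal, self-contained Python/NumPy sketch; it uses the standard ETF construction and is an illustration, not code from the paper (the paper additionally rotates the frame into feature space via a partial orthogonal matrix, omitted here):

    import numpy as np

    K = 10  # number of classes (illustrative choice)

    # Standard simplex-ETF construction: the K columns of M are equal-norm
    # vectors whose pairwise cosine similarity is exactly -1/(K-1).
    M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)

    G = M.T @ M                          # Gram matrix of the K class vectors
    norms = np.sqrt(np.diag(G))          # all column norms are equal (here, 1)
    off_diag = G[~np.eye(K, dtype=bool)] # all pairwise inner products

    assert np.allclose(norms, norms[0])           # equal norms
    assert np.allclose(off_diag, -1.0 / (K - 1))  # equiangular: cos = -1/(K-1)
    print("simplex ETF verified for K =", K)

Running it confirms that the K = 10 class vectors have equal norms and pairwise cosine -1/9, the configuration that neural collapse (and, in the balanced case, the Layer-Peeled Model) predicts for the last-layer classifiers and class means.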
_version_ 1784609132748210176
author Fang, Cong; He, Hangfeng; Long, Qi; Su, Weijie J.
author_sort Fang, Cong
collection PubMed
description In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable, optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep-learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which, in part, explains the recently discovered phenomenon of neural collapse [V. Papyan, X. Y. Han, D. L. Donoho, Proc. Natl. Acad. Sci. U.S.A. 117, 24652–24663 (2020)]. More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto-unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep-learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon is first predicted by the Layer-Peeled Model before being confirmed by our computational experiments.
format Online; Article; Text
id pubmed-8639364
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher National Academy of Sciences
record_format MEDLINE/PubMed
spelling pubmed-8639364 2021-12-12 [title, authors, and abstract as above] Proc Natl Acad Sci U S A. Physical Sciences. National Academy of Sciences 2021-10-20 2021-10-26. /pmc/articles/PMC8639364/ /pubmed/34675075 http://dx.doi.org/10.1073/pnas.2103091118 Text en. Copyright © 2021 the Author(s). Published by PNAS. This open access article is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND): https://creativecommons.org/licenses/by-nc-nd/4.0/
title Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training
topic Physical Sciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8639364/
https://www.ncbi.nlm.nih.gov/pubmed/34675075
http://dx.doi.org/10.1073/pnas.2103091118