Cargando…

Dynamics of stochastic gradient descent for two-layer neural networks in the teacher–student setup

Deep neural networks achieve stellar generalisation even when they have enough parameters to easily fit all their training data. We study this phenomenon by analysing the dynamics and the performance of over-parameterised two-layer neural networks in the teacher–student setup, where one network, the...

Descripción completa

Detalles Bibliográficos
Autores principales: Goldt, Sebastian, Advani, Madhu S, Saxe, Andrew M, Krzakala, Florent, Zdeborová, Lenka
Formato: Online Artículo Texto
Lenguaje:English
Publicado: IOP Publishing and SISSA 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8252911/
https://www.ncbi.nlm.nih.gov/pubmed/34262607
http://dx.doi.org/10.1088/1742-5468/abc61e