Cargando…
Taming hyperparameter tuning in continuous normalizing flows using the JKO scheme
A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution. Such flows are a common technique used for data generation and density estimation in machine learning and data science. The density estimate obtained with a NF requires a change of variab...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Nature Publishing Group UK
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10024737/ https://www.ncbi.nlm.nih.gov/pubmed/36934141 http://dx.doi.org/10.1038/s41598-023-31521-y |
Sumario: | A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution. Such flows are a common technique used for data generation and density estimation in machine learning and data science. The density estimate obtained with a NF requires a change of variables formula that involves the computation of the Jacobian determinant of the NF transformation. In order to tractably compute this determinant, continuous normalizing flows (CNF) estimate the mapping and its Jacobian determinant using a neural ODE. Optimal transport (OT) theory has been successfully used to assist in finding CNFs by formulating them as OT problems with a soft penalty for enforcing the standard normal distribution as a target measure. A drawback of OT-based CNFs is the addition of a hyperparameter, [Formula: see text] , that controls the strength of the soft penalty and requires significant tuning. We present JKO-Flow, an algorithm to solve OT-based CNF without the need of tuning [Formula: see text] . This is achieved by integrating the OT CNF framework into a Wasserstein gradient flow framework, also known as the JKO scheme. Instead of tuning [Formula: see text] , we repeatedly solve the optimization problem for a fixed [Formula: see text] effectively performing a JKO update with a time-step [Formula: see text] . Hence we obtain a ”divide and conquer” algorithm by repeatedly solving simpler problems instead of solving a potentially harder problem with large [Formula: see text] . |
---|