Computing Leapfrog Regularization Paths with Applications to Large-Scale K-mer Logistic Regression
Main Author:
Format: Online Article Text
Language: English
Published: Mary Ann Liebert, Inc., publishers, 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8219187/
https://www.ncbi.nlm.nih.gov/pubmed/33739865
http://dx.doi.org/10.1089/cmb.2020.0284
Summary: High-dimensional statistics deals with statistical inference when the number of parameters or features p exceeds the number of observations n (i.e., p > n). In this case, the parameter space must be constrained either by regularization or by selecting a small subset of m ≪ p features. Feature selection through ℓ1-regularization combines the benefits of both approaches and has proven to yield good results in practice. However, the functional relation between the regularization strength λ and the number of selected features m is difficult to determine. Hence, parameters are typically estimated for all possible regularization strengths λ ≥ 0. These so-called regularization paths can be expensive to compute, and most solutions may not even be of interest to the problem at hand. As an alternative, an algorithm is proposed that determines the ℓ1-regularization strength λ_m iteratively for a fixed m. The algorithm can be used to compute leapfrog regularization paths by subsequently increasing m.
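The core idea in the abstract, finding a regularization strength λ that yields a fixed number m of selected features, can be illustrated with a simple bisection search over λ for the lasso. This is only an illustrative sketch under stated assumptions, not the paper's actual algorithm: the functions `lasso_cd` and `lambda_for_m`, and all parameter values, are hypothetical stand-ins written for this example.

```python
import numpy as np

def lasso_cd(X, y, lam, sweeps=200):
    """Coordinate-descent lasso for (1/(2n))||y - X b||^2 + lam * ||b||_1.
    Illustrative solver, not the paper's method."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)   # per-column squared norms
    r = y.copy()                    # residual y - X @ beta (beta starts at 0)
    for _ in range(sweeps):
        for j in range(p):
            r += X[:, j] * beta[j]  # remove coordinate j's contribution
            rho = X[:, j] @ r
            # soft-thresholding update for coordinate j
            beta[j] = np.sign(rho) * max(abs(rho) - n * lam, 0.0) / col_sq[j]
            r -= X[:, j] * beta[j]  # add the updated contribution back
    return beta

def lambda_for_m(X, y, m, lam_lo=1e-6, lam_hi=10.0, iters=40):
    """Bisect on lam until the lasso selects (approximately) m features."""
    lam, beta = lam_hi, np.zeros(X.shape[1])
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        beta = lasso_cd(X, y, lam)
        k = int(np.count_nonzero(beta))
        if k > m:
            lam_lo = lam    # too many features selected: strengthen penalty
        elif k < m:
            lam_hi = lam    # too few features selected: weaken penalty
        else:
            break
    return lam, beta
```

A leapfrog-style path, in the sense sketched by the abstract, would then be obtained by calling `lambda_for_m` for successively larger values of m, rather than solving for every λ on a dense grid.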