Erwan Le Pennec

$\ell^1$ penalization

joint work with K. Bertin and V. Rivoirard

We consider here an example of $\ell^1$ regularization for density estimation: the adaptive Dantzig estimate. We assume that the unknown density $f$ belongs to $L^2$ and one tries to estimate it by a linear combination $$ f_{\lambda} = \sum_{k=1}^p \lambda_k \phi_k $$ where $\{ \phi_k \}_{1 \leq k \leq p}$ is a dictionary of functions in $L^2\cup L^{\infty}$. From the observations $X_1,\ldots,X_n$, one computes empirical scalar products $$ \widehat{\beta}_k= \frac{1}{n} \sum_{i=1}^n \phi_k(X_i) $$ and associated precisions $\eta_k$. The Dantzig estimate $f_{\widehat{\lambda}}$ is defined through the vector $\widehat{\lambda}$: $$ \widehat{\lambda} = \mathop{\mathrm{argmin}} \|\lambda \|_1 \quad\text{under}\quad \forall 1 \leq k \leq p, |\langle f_{\lambda}, \phi_k \rangle- \widehat{\beta}_k| \leq \eta_k. $$
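As an illustration of the constrained minimization above, consider the special case of an orthonormal dictionary. This is an assumption made only for this sketch, not a hypothesis of the talk. Then $\langle f_{\lambda}, \phi_k \rangle = \lambda_k$, the constraints decouple coordinate by coordinate, and the minimizer of $\|\lambda\|_1$ is obtained by soft thresholding each $\widehat{\beta}_k$ at level $\eta_k$. The function name `dantzig_orthonormal` is hypothetical; a general dictionary would instead require solving a linear program.

```python
def dantzig_orthonormal(beta_hat, eta):
    """Dantzig solution when the dictionary is orthonormal (assumption).

    With <f_lambda, phi_k> = lambda_k, each coordinate must satisfy
    |lambda_k - beta_hat_k| <= eta_k, and the smallest |lambda_k| in that
    interval is the soft-thresholded value of beta_hat_k.
    """
    lam = []
    for b, e in zip(beta_hat, eta):
        if b > e:
            lam.append(b - e)   # shrink positive coefficient towards 0
        elif b < -e:
            lam.append(b + e)   # shrink negative coefficient towards 0
        else:
            lam.append(0.0)     # 0 is feasible, so it is optimal
    return lam


# Example: coefficients below their precision eta_k are set to zero.
lam_hat = dantzig_orthonormal([0.5, -0.2, 0.05], [0.1, 0.1, 0.1])
```

In this orthonormal setting every coefficient whose empirical scalar product is within $\eta_k$ of zero is discarded, which is what makes the estimate sparse.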

In this article, we show that $\eta_k$ can be chosen essentially as $$ \eta_k = \sqrt{2\gamma\log p} \frac{ \widehat{\sigma}_k}{\sqrt{n}} + \frac{7}{3} \gamma \log p \frac{\|\phi_k\|_{\infty}}{n} $$ where $\widehat{\sigma}^2_k$ is a natural estimate of the variance of $\widehat{\beta}_k$. We also explain the links between this estimate and the Lasso estimate. Finally, we show that our penalty is well calibrated: if the parameter $\gamma$ is chosen larger than $1$ then, under mild assumptions on the dictionary, oracle inequalities can be obtained, while if it is chosen smaller than $1$ there are always cases in which the estimate cannot satisfy such oracle inequalities.
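The formula for $\eta_k$ can be evaluated directly from the data. The sketch below is only indicative: it assumes that $\widehat{\sigma}_k^2$ is the unbiased sample variance of the values $\phi_k(X_i)$ (the abstract says only "a natural estimate"), and that the sup norms $\|\phi_k\|_{\infty}$ are supplied by the user. The function name `dantzig_thresholds` and its argument layout are hypothetical.

```python
import math


def dantzig_thresholds(phi_values, sup_norms, gamma=1.01):
    """Evaluate eta_k for each dictionary element.

    phi_values[k][i] holds phi_k(X_i); sup_norms[k] holds ||phi_k||_inf.
    Assumption: sigma_hat_k^2 is the unbiased sample variance of phi_k(X_i).
    Requires at least two observations per dictionary element.
    """
    p = len(phi_values)
    log_p = math.log(p)
    etas = []
    for vals, sup in zip(phi_values, sup_norms):
        n = len(vals)
        mean = sum(vals) / n                      # this is beta_hat_k
        var = sum((v - mean) ** 2 for v in vals) / (n - 1)
        sigma_hat = math.sqrt(var)
        eta = (math.sqrt(2 * gamma * log_p) * sigma_hat / math.sqrt(n)
               + (7 / 3) * gamma * log_p * sup / n)
        etas.append(eta)
    return etas
```

The default `gamma=1.01` is an arbitrary choice slightly above the critical value $1$ discussed in the text; it is not a recommendation from the authors.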
