In Lectures 4 and 15, we investigated the problem of denoising a smooth signal in additive white noise. In Lecture 4, we considered Lipschitz functions and showed that by fitting constants on a uniform partition of width $n^{-1/3}$ we can achieve an MSE convergence rate of $O(n^{-2/3})$.
In Lecture 15, we considered Hölder-$\alpha$ smooth functions, and we demonstrated that by automatically selecting the partition width and using polynomial fits we can obtain an MSE convergence rate of $O(n^{-2\alpha/(2\alpha+1)})$, which is substantially better when $\alpha > 1$. Also important is the fact that we do not need to know the value of $\alpha$ a priori. That estimator is fundamentally different from its counterpart in Lecture 4.
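As a quick sanity check on these rates, here is a minimal numerical sketch (the test function, noise level, and grid are illustrative assumptions, not from the lectures) that fits binwise constants on a uniform partition of width roughly $n^{-1/3}$ and compares the empirical MSE to $n^{-2/3}$:

    import numpy as np

    rng = np.random.default_rng(0)

    def binwise_constant_fit(y, m):
        # Replace the samples in each of m equal-width bins by their bin mean.
        return np.concatenate([np.full(len(b), b.mean())
                               for b in np.array_split(y, m)])

    f = lambda x: np.abs(x - 0.4)          # a Lipschitz test function (assumed)
    for n in [2**10, 2**14, 2**18]:
        x = np.arange(n) / n
        y = f(x) + 0.1 * rng.standard_normal(n)
        m = round(n ** (1 / 3))            # bin width ~ n^{-1/3}
        mse = np.mean((binwise_constant_fit(y, m) - f(x)) ** 2)
        print(n, mse, n ** (-2 / 3))       # MSE should track n^{-2/3}

The ratio of the last two printed columns should remain roughly constant as $n$ grows, reflecting the $O(n^{-2/3})$ rate.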
In both cases the estimator $\hat{f}_n$ is a linear function (a polynomial or constant fit) of the data in each interval of the underlying partition. In Lecture 4, the partition was independent of the data, and so the overall estimator is a linear function of the data.
However, in Lecture 15 the partition itself was selected based on the data. Consequently, $\hat{f}_n$ is a non-linear function of the data. Linear estimators (linear functions of the data) cannot adapt to unknown degrees of smoothness. In this lecture, we lay the groundwork for one more important extension in the denoising application: spatial adaptivity. That is, we would like to construct estimators that not only adapt to unknown degrees of global smoothness, but that also adapt to spatially varying degrees of smoothness.
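To make the linear/non-linear distinction concrete, here is a small sketch (the sizes are toy assumptions) expressing the fixed-partition averaging estimator as a hat matrix $L$, so that the estimate is $Ly$, a linear map of the data:

    import numpy as np

    n, m = 12, 3                 # toy sample size and number of bins (assumed)
    w = n // m
    # Hat matrix of binwise averaging: sample i is replaced by the mean of
    # the samples sharing its bin, so the estimate is simply L @ y.
    L = np.zeros((n, n))
    for j in range(m):
        L[j * w:(j + 1) * w, j * w:(j + 1) * w] = 1.0 / w

    rng = np.random.default_rng(1)
    y1, y2 = rng.standard_normal(n), rng.standard_normal(n)
    # Linearity holds exactly for a fixed, data-independent partition:
    assert np.allclose(L @ (2 * y1 + 3 * y2), 2 * (L @ y1) + 3 * (L @ y2))

No such fixed matrix exists once the partition is chosen from the data, since the averaging regions then change with $y$; that is precisely what makes the Lecture 15 estimator non-linear.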
We will focus on the approximation-theoretic aspects of the problem in this lecture, considering tree-based approximations and wavelet expansions. In the next lecture, we will apply these results to the denoising problem; this will bring us up to date with the current state of the art in denoising and non-parametric estimation.
Recall that Hölder spaces contain smooth functions that are well approximated by polynomials or piecewise polynomial functions. Hölder spaces are quite large and contain many interesting signals. However, Hölder spaces are still inadequate in many applications. Often, we encounter functions that are not smooth everywhere; they contain discontinuities, jumps, spikes, etc. Indeed, the "singularities" (or non-smooth points) can be the most interesting and informative aspects of the functions.
[Figure: functions that are not smooth everywhere.]
Furthermore, functions of interest may possess different degrees of smoothness in different regions.
[Figure: functions with different degrees of smoothness in different regions.]
Let $PC^\alpha$ ("piecewise $C^\alpha$") denote the set of all functions that are Hölder-$\alpha$ smooth everywhere except on a set of measure zero. For example, a function that is smooth on $[0, 1/2)$ and on $[1/2, 1]$ but jumps at $x = 1/2$ belongs to $PC^\alpha$ even though it belongs to no Hölder space. To simplify the notation, we won't explicitly identify the domain (e.g., $[0,1]$ or $[0,1]^2$); that will be clear from the context.
Let's consider a 1-D case first.
Let $f \in PC^\alpha$ and consider approximating $f$ by a piecewise polynomial function on a uniform partition of $[0,1]$ into $m$ intervals of width $1/m$.
If $f$ is Hölder-$\alpha$ smooth everywhere, then by using the partition width $1/m$ and fitting degree $\lceil \alpha \rceil - 1$ polynomials on each interval we have an approximation $f_m$ satisfying
$$\sup_{x} |f(x) - f_m(x)| \le C\, m^{-\alpha}$$
and
$$\int_0^1 \big(f(x) - f_m(x)\big)^2\, dx \le C\, m^{-2\alpha}.$$
However, if there is a discontinuity, then for $x$ in the interval containing the discontinuity the difference
$$|f(x) - f_m(x)|$$
will not be small, no matter how fine the partition: no single polynomial can fit well across a jump. That one interval of width $1/m$ contributes an error of order $m^{-1}$ to the squared-error integral above, so the overall approximation rate is limited to $O(m^{-1})$ regardless of how smooth $f$ is away from the discontinuity.
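The following sketch (the test functions and grid are illustrative assumptions) computes the squared $L_2$ approximation error of the best piecewise-constant fit on a uniform partition, for a Lipschitz function and for a function with a single jump:

    import numpy as np

    def pwc_l2_error(f, m, n=2**16):
        # Squared L2 error of the best piecewise-constant approximation
        # (binwise means) on a uniform partition of [0, 1] into m intervals.
        x = (np.arange(n) + 0.5) / n
        vals = f(x)
        err2 = sum(np.sum((b - b.mean()) ** 2)
                   for b in np.array_split(vals, m))
        return err2 / n                        # Riemann sum for the integral

    smooth = lambda x: np.abs(x - 0.4)         # Lipschitz everywhere (assumed)
    jump = lambda x: (x > 0.37).astype(float)  # one discontinuity (assumed)

    for m in [8, 32, 128, 512]:
        print(m, pwc_l2_error(smooth, m), pwc_l2_error(jump, m))

The first error column decays like $m^{-2}$ (matching $m^{-2\alpha}$ with $\alpha = 1$), while the second decays only on the order of $m^{-1}$, up to fluctuations depending on where the jump lands within a bin: the single interval straddling the jump dominates the error. Overcoming this limitation is exactly what the tree-based and wavelet approximations of this lecture are designed to do.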