As with the above techniques in manifold learning, the Johnson-Lindenstrauss (JL) lemma [link] , [link] , [link] , [link] provides a method for dimensionality reduction of a set of data in $\mathbb{R}^N$. Unlike manifold-based methods, however, the JL lemma can be used for any arbitrary set $Q$ of points in $\mathbb{R}^N$; the data set is not assumed to have any a priori structure.
Despite the apparent lack of structure in an arbitrary point cloud data set, the JL lemma suggests that there does exist a method for dimensionality reduction of that data set that can preserve key information while mapping the data to a lower-dimensional space $\mathbb{R}^M$. In particular, the original formulation of the JL lemma [link] states that there exists a Lipschitz mapping $f: \mathbb{R}^N \rightarrow \mathbb{R}^M$ with $M = O(\log(\#Q))$ such that all pairwise distances between points in $Q$ are approximately preserved. This fact is useful for solving problems such as Approximate Nearest Neighbor [link] , in which one desires the nearest point in $Q$ to some query point $y \in \mathbb{R}^N$ (but a solution not much further than the optimal point is also acceptable). Such problems can be solved significantly more quickly in $\mathbb{R}^M$ than in $\mathbb{R}^N$.
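As an informal illustration of this speedup, the following Python sketch projects an arbitrary point cloud to a lower dimension with a random Gaussian map and answers a nearest-neighbor query by brute force in the reduced space. The dimensions, the data, and the use of a Gaussian matrix here are illustrative assumptions; the formal constructions appear in the lemmas below.

    import numpy as np

    rng = np.random.default_rng(0)

    N, M, num_points = 1000, 50, 500           # ambient dim, reduced dim, #Q (illustrative)
    Q = rng.standard_normal((num_points, N))   # arbitrary point cloud in R^N
    y = rng.standard_normal(N)                 # query point

    # Random Gaussian map with i.i.d. N(0, 1/N) entries (one of the constructions below).
    Phi = rng.standard_normal((M, N)) / np.sqrt(N)

    # Project the data set once (offline), then answer queries in R^M.
    Q_proj = Q @ Phi.T
    y_proj = Phi @ y

    approx_nn = np.argmin(np.linalg.norm(Q_proj - y_proj, axis=1))  # search over M coords
    exact_nn = np.argmin(np.linalg.norm(Q - y, axis=1))             # search over N coords

    print("approximate NN index:", approx_nn, " exact NN index:", exact_nn)
    print("distance ratio:",
          np.linalg.norm(Q[approx_nn] - y) / np.linalg.norm(Q[exact_nn] - y))

Each distance evaluation in the projected space touches only $M$ coordinates rather than $N$, which is the source of the computational savings; the returned point is near-optimal rather than exactly optimal.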
Recent reformulations of the JL lemma propose random linear operators that, with high probability, will ensure a near-isometric embedding. These typically build on concentration of measure results such as the following.
Lemma [link] , [link] : Let $x \in \mathbb{R}^N$, fix $0 < \epsilon < 1$, and let $\Phi$ be a matrix constructed in one of the following two manners:

1. $\Phi$ is a random $M \times N$ matrix whose entries are i.i.d. $\mathcal{N}(0, \sigma^2)$ with $\sigma^2 = 1/N$, or
2. $\Phi$ is a random orthoprojector from $\mathbb{R}^N$ to $\mathbb{R}^M$.

Then with probability exceeding

$$1 - 2 \exp\left( -\frac{M\,(\epsilon^2/2 - \epsilon^3/3)}{2} \right),$$

the following holds:

$$(1 - \epsilon) \sqrt{\frac{M}{N}} \;\le\; \frac{\|\Phi x\|_2}{\|x\|_2} \;\le\; (1 + \epsilon) \sqrt{\frac{M}{N}}.$$
The random orthoprojector referred to above is clearly related to the first case (simple matrix multiplication by a Gaussian $\Phi$) but subtly different; one could think of constructing a random Gaussian $\Phi$, then using Gram-Schmidt to orthonormalize the rows before multiplying $x$. We note also that simple rescaling of $\Phi$ can be used to eliminate the $\sqrt{M/N}$ in [link] ; however, we prefer this formulation for later reference.
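To make the two constructions concrete, the following sketch builds $\Phi$ both ways and checks the ratio $\|\Phi x\|_2 / \|x\|_2$ against $\sqrt{M/N}$. It is a numerical illustration rather than part of the lemma; a QR factorization stands in for the explicit Gram-Schmidt pass, and the specific values of $N$, $M$, and $\epsilon$ are arbitrary.

    import numpy as np

    rng = np.random.default_rng(1)
    N, M, eps = 2000, 100, 0.2

    def gaussian_phi(M, N, rng):
        # Case 1: i.i.d. N(0, 1/N) entries.
        return rng.standard_normal((M, N)) / np.sqrt(N)

    def orthoprojector_phi(M, N, rng):
        # Case 2: random orthoprojector. QR orthonormalizes the columns of an
        # N x M Gaussian matrix (numerically equivalent to Gram-Schmidt); the
        # transpose is then an M x N matrix with orthonormal rows.
        G = rng.standard_normal((N, M))
        Qmat, _ = np.linalg.qr(G)
        return Qmat.T

    x = rng.standard_normal(N)
    target = np.sqrt(M / N)

    for name, Phi in [("gaussian", gaussian_phi(M, N, rng)),
                      ("orthoprojector", orthoprojector_phi(M, N, rng))]:
        ratio = np.linalg.norm(Phi @ x) / np.linalg.norm(x)
        print(f"{name}: ratio = {ratio:.4f}, "
              f"band = [{(1 - eps) * target:.4f}, {(1 + eps) * target:.4f}]")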
By using the union bound over all pairs of distinct points in $Q$, the lemma above can be used to prove a randomized version of the Johnson-Lindenstrauss lemma.
Lemma: Let $Q$ be a finite collection of points in $\mathbb{R}^N$. Fix $0 < \epsilon < 1$ and $\beta > 0$. Set

$$M \;\ge\; \left( \frac{4 + 2\beta}{\epsilon^2/2 - \epsilon^3/3} \right) \ln(\#Q).$$

Let $\Phi$ be a matrix constructed in one of the following two manners:

1. $\Phi$ is a random $M \times N$ matrix whose entries are i.i.d. $\mathcal{N}(0, \sigma^2)$ with $\sigma^2 = 1/N$, or
2. $\Phi$ is a random orthoprojector from $\mathbb{R}^N$ to $\mathbb{R}^M$.

Then with probability exceeding $1 - (\#Q)^{-\beta}$, the following statement holds: for every $x, y \in Q$,

$$(1 - \epsilon) \sqrt{\frac{M}{N}} \;\le\; \frac{\|\Phi x - \Phi y\|_2}{\|x - y\|_2} \;\le\; (1 + \epsilon) \sqrt{\frac{M}{N}}.$$
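As an informal numerical check of this statement, the sketch below chooses $M$ according to the bound above for a small point cloud, draws a Gaussian $\Phi$, and tests whether every pairwise distance ratio falls within the stated band. The particular values of $N$, $\#Q$, $\epsilon$, and $\beta$ are arbitrary illustrations, not prescribed by the lemma.

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(2)

    N, num_points = 5000, 40            # ambient dimension and #Q (illustrative)
    eps, beta = 0.3, 1.0

    # Required reduced dimension from the lemma's bound.
    M = int(np.ceil((4 + 2 * beta) / (eps**2 / 2 - eps**3 / 3) * np.log(num_points)))

    Q = rng.standard_normal((num_points, N))
    Phi = rng.standard_normal((M, N)) / np.sqrt(N)   # i.i.d. N(0, 1/N) entries
    Q_proj = Q @ Phi.T

    scale = np.sqrt(M / N)
    ratios = [np.linalg.norm(Q_proj[i] - Q_proj[j]) / np.linalg.norm(Q[i] - Q[j])
              for i, j in combinations(range(num_points), 2)]

    print(f"M = {M}")
    print("all pairs within band:",
          all((1 - eps) * scale <= r <= (1 + eps) * scale for r in ratios))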
Indeed, [link] establishes that both [link] and [link] also hold when the elements of $\Phi$ are chosen i.i.d. from a Rademacher distribution ($\pm 1/\sqrt{N}$, each with probability $1/2$) or from a similar ternary distribution ($\pm\sqrt{3/N}$, each with probability $1/6$; $0$ with probability $2/3$). These distributions can further improve the computational benefits of the JL lemma.
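The sketch below shows one way such matrices might be drawn, normalized so that the entries have variance $1/N$ to match the Gaussian case above; the sampling code is an illustrative assumption rather than the construction used in [link] . Because roughly two thirds of the ternary entries are zero, applying that matrix to a vector costs correspondingly fewer multiplications.

    import numpy as np

    rng = np.random.default_rng(3)
    M, N = 100, 2000

    # Rademacher: entries +/- 1/sqrt(N), each with probability 1/2.
    Phi_rad = rng.choice([-1.0, 1.0], size=(M, N)) / np.sqrt(N)

    # Ternary: +/- sqrt(3/N) with probability 1/6 each, 0 with probability 2/3.
    Phi_ter = rng.choice([-1.0, 0.0, 1.0], size=(M, N),
                         p=[1 / 6, 2 / 3, 1 / 6]) * np.sqrt(3.0 / N)

    x = rng.standard_normal(N)
    target = np.sqrt(M / N)
    for name, Phi in [("rademacher", Phi_rad), ("ternary", Phi_ter)]:
        print(name, np.linalg.norm(Phi @ x) / np.linalg.norm(x), "vs", target)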
In the following module on Compressed Sensing we will discuss further topics in dimensionality reduction that relate to the JL lemma. In particular, as discussed in Connections with dimensionality reduction , the core mechanics of Compressed Sensing can be interpreted in terms of a stable embedding that arises for the family of $K$-sparse signals when observed with random measurements, and this stable embedding can be proved using the JL lemma. Furthermore, as discussed in Stable embeddings of manifolds , one can ensure a stable embedding of families of signals obeying manifold models under a sufficient number of random projections, with the theory again following from the JL lemma.