
Review of past lecture

In our past lectures we considered collections of candidate functions $\mathcal{F}$ that were either finite or countable. We then constructed penalties, usually codelengths, $c(f)$ for each candidate $f \in \mathcal{F}$, such that $\sum_{f \in \mathcal{F}} 2^{-c(f)} \le 1$. This allowed us to derive uniform concentration inequalities over the entire set $\mathcal{F}$ using the union bound. However, in many cases the collection $\mathcal{F}$ may be uncountably infinite. A simple example is the collection $\mathcal{F}$ of threshold classifiers in one dimension having the form

$$f_t(x) = \mathbf{1}\{x \ge t\}$$

and their complements

$$f_s(x) = \mathbf{1}\{x < s\}.$$

Thus, $\mathcal{F}$ contains an uncountable number of classifiers, and we cannot apply the union bound argument in such cases.
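As a quick illustration, here is a minimal Python sketch of this family (the helper names `threshold_ge` and `threshold_lt` are ours, not from the lecture): every value of $t \in [0,1]$ yields a distinct classifier, so the family cannot be enumerated and assigned codelengths.

```python
import numpy as np

def threshold_ge(t):
    """Threshold classifier f_t(x) = 1{x >= t}."""
    return lambda x: (np.asarray(x) >= t).astype(int)

def threshold_lt(s):
    """Complementary classifier f_s(x) = 1{x < s}."""
    return lambda x: (np.asarray(x) < s).astype(int)

x = np.array([0.1, 0.4, 0.8])
print(threshold_ge(0.5)(x))  # [0 0 1]
print(threshold_lt(0.5)(x))  # [1 1 0]
```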

Two ways to proceed

Discretize or quantize the collection

To quantize $\mathcal{F}$, fix a positive integer $q$ and consider

$$\mathcal{F}_q = \left\{ f : f(x) = \mathbf{1}\{x \ge i/q\},\ i \in \{0, 1, \dots, q\} \right\}.$$

For every $f \in \mathcal{F}$ there is an $f_q \in \mathcal{F}_q$ whose threshold differs by at most $1/q$, so that

$$\int |f(x) - f_q(x)|\, p_X(x)\, dx \;\le\; c/q$$

if the density $p_X$ of $x$ is bounded above by $c > 0$. As discussed below, taking $q \approx n^{1/2}$ suffices.

Identical empirical errors

Consider the fact that, given only $n$ training data, many of the classifiers in such a collection may produce identical empirical errors. Indeed, many $f \in \mathcal{F}$ will produce identical label assignments on the data, and there are at most $2^n$ distinct labelings of $n$ points.

So although $\mathcal{F}$ is uncountable, the set of distinct behaviors it exhibits on the data is finite. For the threshold class above, the $n$ data points partition $[0,1]$ into intervals, and each interval of threshold values yields two labelings (one per orientation), for a total of only $2n$ distinct labelings (see the example below).

The number of distinct labeling assignments that a class $\mathcal{F}$ can produce on a set of $n$ points is denoted $S(\mathcal{F}, n)$, and it satisfies

$$S(\mathcal{F}, n) \le 2^n.$$

The growth of $\log S(\mathcal{F}, n)$ is captured by the VC dimension. Specifically, $VC(\mathcal{F}) = k$, where $k$ is the largest integer such that $S(\mathcal{F}, k) = 2^k$. Ex. For the two-sided threshold class, $S(\mathcal{F}, n) = 2n$; since $2n = 2^n$ holds for $n = 2$ but fails for $n \ge 3$, $VC(\mathcal{F}) = 2$.
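A minimal numerical check of this claim, as a sketch in Python (the helpers `realizable` and `shattered` are our own, not part of the notes): two points can be shattered by the two-sided threshold class, but three cannot, so $VC(\mathcal{F}) = 2$.

```python
from itertools import product
import numpy as np

def realizable(points, labels):
    """Is `labels` produced on `points` by some f(x)=1{x>=t} or f(x)=1{x<t}, t in [0,1]?"""
    pts = np.asarray(points, dtype=float)
    edges = np.concatenate(([0.0], np.sort(pts), [1.0]))
    for t in (edges[:-1] + edges[1:]) / 2:  # one representative threshold per interval
        if tuple((pts >= t).astype(int)) == tuple(labels):
            return True
        if tuple((pts < t).astype(int)) == tuple(labels):
            return True
    return False

def shattered(points):
    """True iff all 2^n labelings of `points` are realizable by the class."""
    return all(realizable(points, lab) for lab in product((0, 1), repeat=len(points)))

print(shattered([0.3, 0.7]))       # True:  S(F, 2) = 4 = 2^2
print(shattered([0.2, 0.5, 0.8]))  # False: e.g. (1, 0, 1) would need two thresholds
```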

Ex. Consider

$$\mathcal{F} = \left\{ f : f(x) = \mathbf{1}\{x \ge t\} \ \text{or} \ f(x) = \mathbf{1}\{x < t\},\ t \in [0,1] \right\}.$$

Let q be a positive integer and

$$\mathcal{F}_q = \left\{ f : f(x) = \mathbf{1}\{x \ge i/q\} \ \text{or} \ f(x) = \mathbf{1}\{x < i/q\},\ i \in \{0, 1, \dots, q\} \right\}$$

and,

$$|\mathcal{F}_q| = 2(q+1).$$

Moreover, for any $f \in \mathcal{F}$ there exists an $f_q \in \mathcal{F}_q$ such that

$$\int |f(x) - f_q(x)|\, dx \;\le\; \int_{(i-1)/q}^{i/q} 1\, dx \;=\; 1/q.$$
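The bound is easy to check numerically. Below is a minimal Python sketch (the helper `quantize_threshold` is ours, and we take $x$ uniform on $[0,1]$, i.e. density bounded by $c = 1$): snapping a threshold $t$ to the nearest grid point $i/q$ changes the classifier only on an interval of length at most $1/q$, so the two classifiers disagree with probability at most $1/q$.

```python
import numpy as np

def quantize_threshold(t, q):
    """Snap t in [0,1] to the nearest grid point i/q, i in {0, ..., q}."""
    return round(t * q) / q

t, q = 0.637, 10
t_q = quantize_threshold(t, q)          # 0.6; note |t - t_q| <= 1/(2q)

# 1{x >= t} and 1{x >= t_q} disagree only for x between t_q and t,
# an interval of length at most 1/q; with a uniform density (c = 1)
# the empirical disagreement rate should stay below 1/q.
x = np.random.default_rng(0).uniform(size=100_000)
disagreement = np.mean((x >= t) != (x >= t_q))
print(t_q, disagreement, "bound:", 1 / q)
```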

Now suppose we have $n$ training data and suppose $f^* \in \mathcal{F}$. We know that, in general, the minimum empirical risk classifier will converge to the Bayes classifier at the rate of $n^{-1/2}$ or slower. Therefore, it is unnecessary to drive the approximation error down faster than $n^{-1/2}$. So we can restrict our attention to $\mathcal{F}_{n^{1/2}}$ (that is, take $q = n^{1/2}$), and, provided that the density of $x$ is bounded above by a constant $C$, we have

$$\min_{f \in \mathcal{F}_{n^{1/2}}} R(f) - R(f^*) \;\le\; C \min_{f \in \mathcal{F}_{n^{1/2}}} \int |f^*(x) - f(x)|\, dx \;\le\; C\, n^{-1/2}.$$
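The first inequality hides a standard step that may be worth spelling out; a sketch of the reasoning, under the assumption that the marginal density $p(x)$ of $x$ is bounded above by $C$:

$$R(f) - R(f^*) \;\le\; P\big(f(X) \neq f^*(X)\big) \;=\; \int \mathbf{1}\{f(x) \neq f^*(x)\}\, p(x)\, dx \;\le\; C \int |f(x) - f^*(x)|\, dx.$$

Taking the minimum over $f \in \mathcal{F}_{n^{1/2}}$ and applying the $1/q$ approximation bound with $q = n^{1/2}$ then gives the $C\, n^{-1/2}$ rate.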

Vapnik-Chervonenkis theory is based not on explicitly quantizing the collection of candidate functions, but rather on recognizing that the richness of $\mathcal{F}$ is limited in a certain sense by the number of training data. Indeed, given $n$ i.i.d. training data, there are at most $2^n$ different binary labelings. Therefore, any collection $\mathcal{F}$ may be divided into at most $2^n$ subsets of classifiers that are "equivalent" with respect to the training data. In many cases a collection may not even be capable of producing $2^n$ different labelings.

Example

Consider $\mathcal{X} = [0, 1]$ and

$$\mathcal{F} = \left\{ f : f(x) = \mathbf{1}\{x \ge t\} \ \text{or} \ f(x) = \mathbf{1}\{x < t\},\ t \in [0,1] \right\}.$$

Suppose we have n training data: ( x 1 , ... , x n ) [ 0 , 1 ] . With x s denotes the location of each training point in [0,1].Associated with each x is a label y { 0 , 1 } . Any classifier in F will label all points to the left of a number t [ 0 , 1 ] as "1" or "0", and points to the right as "0" or "1", respectively.For t [ 0 , x 1 ) , all points are either labelled "0" or "1". For t ( x 1 , x 2 ) , x 1 is labelled "0" or "1" and x 2 ... x n are label "1" or "0" and so on. We see that there are exactly 2 n different labellings; far less than 2 n !

Source: OpenStax, Statistical learning theory. OpenStax CNX. Apr 10, 2009. Download for free at http://cnx.org/content/col10532/1.3