
This implies that for any $\delta > 0$, with probability at least $1 - \delta$, we have for all $f \in \mathcal{F}$

$$R(f) \le \hat{R}_n(f) + \sqrt{\frac{\log\frac{1}{\delta(f)}}{2n}} = \hat{R}_n(f) + \sqrt{\frac{c(f)\log 2 + \log\frac{1}{\delta}}{2n}} .$$
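To get a feel for the size of the second term, the bound can be evaluated numerically. The following is a minimal sketch; the function names `complexity_penalty` and `risk_upper_bound` are illustrative and not part of the original notes, and `code_length_bits` stands for $c(f)$.

```python
import math


def complexity_penalty(code_length_bits, n, delta):
    """Return sqrt((c(f) log 2 + log(1/delta)) / (2n)).

    code_length_bits is c(f), n is the sample size and delta is the
    allowed failure probability (names chosen for this sketch).
    """
    if n <= 0 or not (0 < delta <= 1):
        raise ValueError("need n > 0 and 0 < delta <= 1")
    numerator = code_length_bits * math.log(2) + math.log(1 / delta)
    return math.sqrt(numerator / (2 * n))


def risk_upper_bound(empirical_risk, code_length_bits, n, delta):
    """Upper bound on R(f): empirical risk plus the complexity penalty."""
    return empirical_risk + complexity_penalty(code_length_bits, n, delta)
```

Longer codewords give a larger penalty, and the penalty shrinks at rate $1/\sqrt{n}$ as the sample size grows.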

Application

Let $\mathcal{F}_1, \mathcal{F}_2, \dots$ be a sequence of finite sets of candidate functions with $|\mathcal{F}_1| < |\mathcal{F}_2| < \cdots$. We can design a prefix code as follows. Use the codes 0, 10, 110, 1110, ... to encode the subscript $i$ in $\mathcal{F}_i$. For each class $\mathcal{F}_i$, construct a set of binary codewords of length $\lceil \log_2 |\mathcal{F}_i| \rceil$ to uniquely encode each function in $\mathcal{F}_i$. Then encode any given function $f$ by first using the code for the index $i$ of the smallest $\mathcal{F}_i$ that contains $f$, followed by the length-$\lceil \log_2 |\mathcal{F}_i| \rceil$ codeword for $f \in \mathcal{F}_i$. This is a prefix code.
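The construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each class is given as a list of hashable labels standing in for functions, classes are listed in order of increasing size, and the helper names are hypothetical.

```python
def index_codeword(i):
    """Codeword for the class index i >= 1: 0, 10, 110, 1110, ..."""
    return "1" * (i - 1) + "0"


def build_prefix_code(classes):
    """Map each function label to a binary codeword.

    classes[i - 1] plays the role of F_i (a non-empty list of labels).
    A function is encoded through the first (smallest) class containing it.
    """
    code = {}
    for i, members in enumerate(classes, start=1):
        # (m - 1).bit_length() equals ceil(log2(m)) for every integer m >= 1.
        width = (len(members) - 1).bit_length()
        for position, f in enumerate(members):
            if f in code:
                continue  # already encoded through a smaller class
            suffix = format(position, "b").zfill(width) if width else ""
            code[f] = index_codeword(i) + suffix
    return code


def is_prefix_free(codewords):
    """True if no codeword is a prefix of another (or a duplicate)."""
    ordered = sorted(codewords)
    # In sorted order, any extension of a word follows it immediately.
    return all(not b.startswith(a) for a, b in zip(ordered, ordered[1:]))
```

For example, with `classes = [["a", "b"], ["a", "b", "c", "d"]]` the function `"c"` is encoded by the index codeword `10` followed by its two-bit position in the second class.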

Histogram classifiers

$\mathcal{X} = [0,1]^d$, $\mathcal{Y} = \{0,1\}$. Let $\mathcal{F}_k$, $k = 1, 2, \dots$, denote the collection of histogram classification rules with $k$ equal-volume bins. We can use the following codebook for the index $k$: the codeword for $k$ consists of $k - 1$ ones followed by a zero (0, 10, 110, ...), so it has length $k$ bits.

Follow this codeword with $k = \log_2 |\mathcal{F}_k|$ bits to indicate which of the $2^k$ possible histogram rules is under consideration. Thus for any $f \in \mathcal{F}_k$, for some $k \ge 1$, there is a prefix code of length

$$c(f) = k + k = 2k \text{ bits}.$$
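As a concrete illustration, the sketch below encodes a histogram rule as a $2k$-bit string and decodes it again. It assumes the index $k$ is encoded as $k - 1$ ones followed by a zero, and that a rule is represented by its list of $k$ bin labels; both the representation and the function names are choices made for this example.

```python
def encode_histogram_rule(bin_labels):
    """Encode a k-bin rule: k bits for the index k, then k label bits."""
    k = len(bin_labels)
    if k < 1 or any(label not in (0, 1) for label in bin_labels):
        raise ValueError("need at least one bin and labels in {0, 1}")
    return "1" * (k - 1) + "0" + "".join(str(label) for label in bin_labels)


def decode_histogram_rule(bits):
    """Read one codeword from the front of bits.

    Returns (bin_labels, remaining_bits). Decoding needs no separator
    because the code is prefix-free.
    """
    k = bits.index("0") + 1  # position of the first zero gives the index k
    end = 2 * k
    if len(bits) < end:
        raise ValueError("truncated codeword")
    return [int(b) for b in bits[k:end]], bits[end:]
```

Because no codeword is a prefix of another, several encoded rules can be concatenated and still be decoded one after the other.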

It follows that for any $\delta > 0$, with probability at least $1 - \delta$, we have for all $f \in \bigcup_{k \ge 1} \mathcal{F}_k$

$$R(f) \le \hat{R}_n(f) + \sqrt{\frac{2 k_f \log 2 + \log\frac{1}{\delta}}{2n}}$$

where $k_f$ is the number of bins in the histogram corresponding to $f$. Contrast this with the bound we had for the class of $m$-bin histograms alone: with probability at least $1 - \delta$, for all $f \in \mathcal{F}_m$

$$R(f) \le \hat{R}_n(f) + \sqrt{\frac{m \log 2 + \log\frac{1}{\delta}}{2n}} .$$

Notice that the bound for all histogram rules is almost as good as the bound for only the $m$-bin rules: when $k_f = m$, the two complexity terms are within a factor of $\sqrt{2}$. At the same time, the new bound is a big improvement, since it also gives us a guide for selecting the number of bins.
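One way to read the bound as a guide for choosing the number of bins is to pick the $k$ that minimizes the empirical risk plus the complexity term. The sketch below assumes we already know, for each candidate $k$, the smallest empirical risk achieved by a $k$-bin rule; the dictionary `best_empirical_risk` and the function names are hypothetical.

```python
import math


def penalty_all_histograms(k, n, delta):
    """Complexity term of the bound over all histogram rules (c(f) = 2k)."""
    return math.sqrt((2 * k * math.log(2) + math.log(1 / delta)) / (2 * n))


def penalty_fixed_bins(m, n, delta):
    """Complexity term of the bound for the m-bin class alone."""
    return math.sqrt((m * math.log(2) + math.log(1 / delta)) / (2 * n))


def select_num_bins(best_empirical_risk, n, delta):
    """Return the k minimizing empirical risk + penalty.

    best_empirical_risk maps each candidate k to the smallest empirical
    risk among k-bin rules (assumed to be computed elsewhere).
    """
    return min(
        best_empirical_risk,
        key=lambda k: best_empirical_risk[k] + penalty_all_histograms(k, n, delta),
    )
```

Few bins give a small penalty but typically a large empirical risk, and many bins do the opposite, so the minimizer balances the two terms.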

Proof

Proof of the Kraft inequality

We will prove that for any binary prefix code, the codeword lengths $c_1, c_2, \dots$ satisfy $\sum_{k \ge 1} 2^{-c_k} \le 1$. The converse is also easy to prove, but it is not central to our purposes here (for a proof, see Cover & Thomas '91). Consider a binary tree like the one shown below.

The sequence of bit values leading from the root to a leaf of the tree represents a codeword. The prefix condition implies that no codeword is a descendant of any other codeword in the tree. Let $c_{\max}$ be the length of the longest codeword (also the number of branches to the deepest leaf) in the tree.

Consider a leaf $i$ in the tree at level $c_i$. This leaf would have $2^{c_{\max} - c_i}$ descendants at level $c_{\max}$. Furthermore, the sets of possible descendants at level $c_{\max}$ of different leaves are disjoint (since no codeword can be a prefix of another). Therefore, since the total number of possible leaves at level $c_{\max}$ is $2^{c_{\max}}$, we have

$$\sum_{i \in \text{leaves}} 2^{c_{\max} - c_i} \le 2^{c_{\max}} \quad \Longrightarrow \quad \sum_{i \in \text{leaves}} 2^{-c_i} \le 1$$

which proves the case when the number of codewords is finite.
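The counting argument for the finite case can be checked on small examples. The sketch below enumerates the descendants of each codeword at the deepest level and verifies that they do not overlap; the function names are illustrative, and exact rational arithmetic is used so the Kraft sum is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def descendants_at_depth(codeword, depth):
    """All binary strings of length depth that start with codeword."""
    tail = depth - len(codeword)
    return {codeword + "".join(bits) for bits in product("01", repeat=tail)}


def kraft_sum(codewords):
    """Exact value of the sum of 2^(-c_i) over the codewords."""
    return sum(Fraction(1, 2 ** len(w)) for w in codewords)


def descendants_are_disjoint(codewords):
    """True if the descendant sets at level c_max do not overlap."""
    c_max = max(len(w) for w in codewords)
    seen = set()
    for w in codewords:
        current = descendants_at_depth(w, c_max)
        if seen & current:
            return False  # two codewords share a descendant
        seen |= current
    return True
```

For the prefix code `["0", "10", "110", "111"]` the descendant sets are disjoint and together fill all $2^3$ nodes at the deepest level, so the Kraft sum equals 1. For `["0", "01"]`, which is not a prefix code, the sets overlap.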

Suppose now that we have a countably infinite number of codewords. Let $b_1 b_2 \dots b_{c_i}$ be the $i$th codeword and let

$$r_i = \sum_{j=1}^{c_i} b_j 2^{-j}$$

be the real number corresponding to the binary expansion of the codeword. We can associate the interval $[r_i, r_i + 2^{-c_i})$ with the $i$th codeword. This is the set of all real numbers whose binary expansion begins with $b_1 b_2 \dots b_{c_i}$. Each such interval has length $2^{-c_i}$ and is a subinterval of $[0, 1]$, and the intervals corresponding to distinct codewords of a prefix code are disjoint, so the sum of their lengths must be less than or equal to 1. This proves the case where the number of codewords is infinite.
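The interval construction is easy to experiment with. The following sketch maps each codeword to its interval using exact fractions and checks disjointness; it can only ever examine finitely many codewords, so it illustrates the argument on a truncation of an infinite code rather than proving anything. Function names are illustrative.

```python
from fractions import Fraction


def codeword_interval(codeword):
    """Return (r, r + 2^(-c)) for a binary codeword of length c.

    r is the number whose binary expansion is 0.b_1 b_2 ... b_c; the
    pair represents the half-open interval [r, r + 2^(-c)).
    """
    c = len(codeword)
    r = sum(Fraction(int(b), 2 ** j) for j, b in enumerate(codeword, start=1))
    return r, r + Fraction(1, 2 ** c)


def intervals_disjoint(intervals):
    """True if the half-open intervals [lo, hi) are pairwise disjoint."""
    ordered = sorted(intervals)
    return all(
        prev_hi <= lo for (_, prev_hi), (lo, _) in zip(ordered, ordered[1:])
    )
```

For the first $N$ codewords of the code 0, 10, 110, ..., the intervals are $[0, 1/2)$, $[1/2, 3/4)$, and so on; they are disjoint and their lengths sum to $1 - 2^{-N}$, which stays below 1 for every $N$.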

Source:  OpenStax, Statistical learning theory. OpenStax CNX. Apr 10, 2009 Download for free at http://cnx.org/content/col10532/1.3