
And one more definition: I’m going to say that the functional margin of a hyperplane with respect to an entire training set is defined as gamma hat equal to the min, over all your training examples, of gamma hat I, right? So if you have a training set with more than one training example, I’m going to define the functional margin with respect to the entire training set as the worst case of all of the functional margins over the entire training set. And so for now you should think of the functional margin as an intuition saying that we would like the functional margin to be large; and for our purposes, for now, let’s just say we would like the worst-case functional margin to be large, okay? And we’ll change this a little bit later as well.
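The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the lecture: the function names, the toy data, and the choice of `w` and `b` are all made up for the example. Labels are assumed to be in {-1, +1}, as is standard in this setting.

```python
import numpy as np

# Per-example functional margin: gamma_hat_i = y_i * (w^T x_i + b).
# The functional margin of (w, b) with respect to the whole training
# set is the minimum (worst case) of these per-example margins.
def functional_margin(w, b, X, y):
    return np.min(y * (X @ w + b))

# Toy training set (hypothetical values, for illustration only).
X = np.array([[2.0, 1.0],
              [-1.0, -2.0],
              [0.5, 0.5]])
y = np.array([1, -1, 1])
w = np.array([1.0, 1.0])
b = 0.0

print(functional_margin(w, b, X, y))  # prints 1.0, the worst-case margin
```

Here the per-example margins are 3.0, 3.0, and 1.0, so the functional margin of the training set is 1.0.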

Now, it turns out that there’s one little problem with this intuition that will, sort of, trip us up later, which is that it actually turns out to be very easy to make the functional margin large, all right? So, for example, say I have a classifier with parameters W and B. If I take W and multiply it by two and take B and multiply it by two, then if you refer to the definition of the functional margin – I guess that was, what, gamma hat I equals YI times W transpose XI plus B – if I double W and B, then I can easily double my functional margin.
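The scaling issue is easy to verify numerically. This is a quick sketch with made-up data: multiplying (W, B) by any positive constant scales every functional margin by that constant, even though the decision boundary W transpose X plus B equals zero does not move at all.

```python
import numpy as np

# Functional margin of the training set, as defined above.
def functional_margin(w, b, X, y):
    return np.min(y * (X @ w + b))

# Hypothetical toy data for illustration.
X = np.array([[2.0, 1.0],
              [-1.0, -2.0]])
y = np.array([1, -1])
w, b = np.array([1.0, 1.0]), 0.5

m1 = functional_margin(w, b, X, y)
m2 = functional_margin(2 * w, 2 * b, X, y)

# Doubling (w, b) doubles the margin but leaves the set
# {x : w^T x + b = 0} unchanged, so nothing real has improved.
assert np.isclose(m2, 2 * m1)
```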

So this goal of making the functional margin large, in and of itself, isn’t so useful because it’s easy to make the functional margin arbitrarily large just by scaling up the parameters. And so maybe one thing we need to do later is add a normalization condition. For example, maybe we want to add a normalization condition that the norm – the Euclidean norm – of the parameter W is equal to one, and we’ll come back to this in a second. All right.
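One way to picture the proposed normalization is to rescale the parameters so the norm of W is one. This is a sketch of that idea, with hypothetical values; dividing both W and B by the norm of W leaves the decision boundary unchanged while removing the scaling freedom.

```python
import numpy as np

# Rescale (w, b) so that ||w|| = 1. The hyperplane w^T x + b = 0
# is unchanged, since both sides are divided by the same constant.
def normalize(w, b):
    n = np.linalg.norm(w)
    return w / n, b / n

# Hypothetical parameters for illustration.
w, b = np.array([3.0, 4.0]), 10.0
w_n, b_n = normalize(w, b)

assert np.isclose(np.linalg.norm(w_n), 1.0)
```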

Okay. Now, let’s talk about – let’s see how much time we have; 15 minutes. Well, see, I’m trying to decide how much to try to do in the last 15 minutes. Okay. So let’s talk about the geometric margin, and so the geometric margin of a training example – [inaudible], right? So the decision boundary of my classifier is going to be given by the plane W transpose X plus B equals zero, okay? Right, and these are the X1, X2 axes, say, and we’re going to draw relatively few training examples here. I’m drawing deliberately few training examples so that I can add things to this, okay?

And so, assuming we classify an example correctly, I’m going to define the geometric margin as just the geometric distance between the training example XI, YI and this separating line – this separating hyperplane, okay? That’s what I’m going to define the geometric margin to be.

And so I’m gonna do some algebra fairly quickly; in case it doesn’t make sense, read through the lecture notes more carefully for the details. Sort of, by standard geometry, the normal – in other words, the vector that’s 90 degrees to the separating hyperplane – is going to be given by W divided by the norm of W; that’s just how planes in high dimensions work. If some of this doesn’t make sense, take a look at the lecture notes on the website.
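The algebra alluded to here works out to the standard result in the lecture notes: the geometric margin of a correctly classified example is the functional margin divided by the norm of W, i.e. the signed distance from the point to the hyperplane measured along the unit normal W over norm-of-W. A minimal sketch, with hypothetical numbers chosen so the distance is easy to check by hand:

```python
import numpy as np

# Geometric margin of a single example (x, y) with y in {-1, +1}:
# the functional margin y * (w^T x + b) scaled by 1 / ||w||.
# Unlike the functional margin, this is invariant to rescaling (w, b).
def geometric_margin(w, b, x, y):
    return y * (w @ x + b) / np.linalg.norm(w)

w = np.array([3.0, 4.0])   # ||w|| = 5
b = -5.0
x = np.array([3.0, 4.0])   # w^T x + b = 9 + 16 - 5 = 20

print(geometric_margin(w, b, x, 1))          # prints 4.0
print(geometric_margin(2 * w, 2 * b, x, 1))  # still 4.0: scaling-invariant
```

Note that doubling (w, b) leaves the geometric margin unchanged, which is exactly why it repairs the problem with the functional margin discussed earlier.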





Source:  OpenStax, Machine learning. OpenStax CNX. Oct 14, 2013 Download for free at http://cnx.org/content/col11500/1.4
