And so let’s say this distance is gamma I, okay? I’m going to use the convention that I’ll put a hat on top when I’m referring to functional margins, and no hat on top for geometric margins. So let’s say the geometric margin of this example is gamma I. That means that this point here, right, is going to be XI minus gamma I times W over the norm of W, okay? Because W over the norm of W is the unit vector, the length-one vector, that is normal to the separating hyperplane, and so when we subtract gamma I times that unit vector from this point XI, we get this point that I’ve drawn as a heavy circle, okay? So this heavy point here is XI minus this vector, and this vector is gamma I times W over the norm of W, okay?
And so because this heavy point is on the separating hyperplane, right, it must satisfy W transpose times that point plus B equals zero, right? Because all points X on the separating hyperplane satisfy the equation W transpose X plus B equals zero, and this point is on the separating hyperplane, therefore it must satisfy W transpose this point plus B equals zero, okay? Raise your hand if this makes sense so far? Oh, okay. Cool, most of you, but, again, I’m being slightly fast with this geometry, so if you’re not quite sure why this is a normal vector, or how I subtracted this, or whatever, take a look at the details in the lecture notes.
And so what I’m going to do is I’ll just take this equation, and I’ll solve it for gamma I, right? Well, why don’t I just do it? Plugging the heavy point into the hyperplane equation, you have W transpose XI plus B equals gamma I times W transpose W over the norm of W; and that’s just equal to gamma I times the norm of W, because W transpose W is the norm of W squared. Therefore, gamma I is just W transpose XI plus B, divided by the norm of W; or, equivalently, gamma I equals W over the norm of W, transpose, times XI, plus B over the norm of W, okay? And, in other words, this little calculation just showed us that if you have a training example XI, then the distance between XI and the separating hyperplane defined by the parameters W and B can be computed by this formula, okay?
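To make that formula concrete, here is a minimal numeric sketch; the function name, the toy hyperplane, and the example point are all my own illustration, not from the lecture. It computes gamma I as W transpose XI plus B over the norm of W, and then checks that stepping back by gamma I along the unit normal really does land on the hyperplane, which is exactly the geometric argument above.

```python
import numpy as np

def distance_to_hyperplane(w, b, x):
    """Geometric distance from point x to the hyperplane w^T x + b = 0.

    This is the formula derived above: gamma = (w^T x + b) / ||w||.
    (Names here are illustrative, not from the lecture.)
    """
    return (np.dot(w, x) + b) / np.linalg.norm(w)

# Toy example: hyperplane x1 + x2 - 1 = 0, point (2, 2)
w = np.array([1.0, 1.0])
b = -1.0
x = np.array([2.0, 2.0])
gamma = distance_to_hyperplane(w, b, x)   # (2 + 2 - 1) / sqrt(2) ≈ 2.12

# Sanity check: subtracting gamma times the unit normal from x
# lands on the separating hyperplane, i.e. satisfies w^T x + b = 0.
foot = x - gamma * w / np.linalg.norm(w)
assert abs(np.dot(w, foot) + b) < 1e-9
```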
So the last thing I want to do is actually take into account the sign, that is, the correct classification of the training example. So far I’ve been assuming that we classify the example correctly. So, more generally, I’ll define the geometric margin of a training example to be gamma I equals YI times that quantity on top, okay? And so this is very similar to the functional margin, except for the normalization by the norm of W. And, as before, we would like the geometric margin to be large, and all that means is that, so long as we’re classifying the example correctly, we would ideally hope for the example to be as far as possible from the separating hyperplane, so long as it’s on the right side of the separating hyperplane, and that’s what multiplying in YI does.
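And here is the same sketch extended to the signed version with the label YI multiplied in; again, the names and the toy numbers are just an assumed illustration. Flipping the label flips the sign of the margin, which is how the definition encodes being on the right side of the hyperplane.

```python
import numpy as np

def geometric_margin(w, b, x, y):
    # Signed geometric margin: y * ((w/||w||)^T x + b/||w||).
    # Positive exactly when (x, y) is on the correct side of the hyperplane.
    # (Names are illustrative; the lecture only gives the formula.)
    return y * (np.dot(w, x) + b) / np.linalg.norm(w)

# Same toy hyperplane as before: x1 + x2 - 1 = 0
w = np.array([1.0, 1.0])
b = -1.0
x = np.array([2.0, 2.0])

print(geometric_margin(w, b, x, +1))  # ≈ +2.12, correctly classified positive
print(geometric_margin(w, b, x, -1))  # ≈ -2.12, same point with the wrong label
```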