And so a couple of easy facts. One is that if the norm of W is equal to one, then the functional margin is equal to the geometric margin, and you can see that quite easily. And more generally, the geometric margin is just equal to the functional margin divided by the norm of W, okay? Let’s see, okay. And so one final definition: so far I’ve defined the geometric margin with respect to a single training example, and so, as before, I’ll define the geometric margin with respect to an entire training set as gamma equals min over I of gamma I, all right?
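In symbols – reconstructing from the definitions above rather than from the board, using the course’s usual notation for the functional margin – these facts read:

    \hat{\gamma}^{(i)} = y^{(i)}\bigl(w^T x^{(i)} + b\bigr), \qquad
    \gamma^{(i)} = \frac{\hat{\gamma}^{(i)}}{\|w\|}, \qquad
    \gamma = \min_i \gamma^{(i)}

so that setting \|w\| = 1 gives \gamma^{(i)} = \hat{\gamma}^{(i)}.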
And so the maximum margin classifier, which is a precursor to the support vector machine, is the learning algorithm that chooses the parameters W and B so as to maximize the geometric margin, and so let me just write that down. The maximum margin classifier poses the following optimization problem: it says choose gamma, W, and B so as to maximize the geometric margin, subject to the constraint that YI times (W transpose XI plus B) is at least gamma – well, this is just one way to write it. There are several ways to write this, and one of the things we’ll do next time is actually – I’m trying to figure out if I can do this in five minutes. I’m guessing this could be difficult.
Well, so the maximum margin classifier poses a maximization problem over the parameters gamma, W, and B, and for now, it turns out that the geometric margin doesn’t change depending on the norm of W, right? Because in the definition of the geometric margin, notice that we’re dividing by the norm of W anyway. So you can actually set the norm of W to be anything you want: you can multiply W and B by any constant, and it doesn’t change the geometric margin. This will actually be important, and we’ll come back to it later. Notice that you can take the parameters W and B and impose any normalization constraint on them, or you can scale W and B by any factor – replace them by ten W and ten B, whatever – and it does not change the geometric margin, okay?
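As a minimal sketch – not from the lecture, with toy data and variable names that are assumptions – this scale invariance can be checked numerically:

    import numpy as np

    # One toy training example (x, y) and arbitrary parameters (w, b).
    x = np.array([2.0, 1.0])
    y = 1.0
    w = np.array([0.5, -1.0])
    b = 0.25

    def geometric_margin(w, b, x, y):
        # Geometric margin: y * ((w / ||w||)^T x + b / ||w||).
        return y * (w.dot(x) + b) / np.linalg.norm(w)

    print(geometric_margin(w, b, x, y))            # some value gamma
    print(geometric_margin(10 * w, 10 * b, x, y))  # identical: the scaling of (w, b) cancels

Because the norm of W appears in the denominator, any rescaling of W and B cancels out of the geometric margin.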
And so in this first formulation, I’m just gonna impose a constraint and say that the norm of W is equal to one, so the functional and geometric margins will be the same, and then we’ll say maximize the geometric margin – you maximize gamma subject to the constraint that every training example must have geometric margin at least gamma. And this is the geometric margin, because when the norm of W is equal to one, the functional and geometric margins are identical, okay?
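Written out – again a reconstruction consistent with the definitions above, not text from the board – this first formulation is:

    \max_{\gamma, w, b} \ \gamma
    \quad \text{s.t.} \quad y^{(i)}\bigl(w^T x^{(i)} + b\bigr) \ge \gamma, \ \ i = 1, \dots, m,
    \qquad \|w\| = 1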
So this is the maximum margin classifier, and it turns out that if you do this, it’ll run, you know, maybe comparably to logistic regression. But it turns out that as we develop this algorithm further, there will be a clever way to change this algorithm to let it work in infinite-dimensional feature spaces and come up with very efficient non-linear classifiers. So there’s a ways to go before we turn this into a support vector machine, but this is the first step. So are there questions about this? Yeah.
Student: [Off mic].
Instructor (Andrew Ng): For now, let’s just say you’re given a fixed training set, and the scaling of the training set is not something you get to play with, right? So everything I’ve said is for a fixed training set: you can’t change the X’s, and you can’t change the Y’s. Are there other questions?
Okay. So all right. Next week we will take this, and we’ll talk about optimization algorithms and work our way towards turning this into one of the most effective off-the-shelf learning algorithms. And just a final reminder again: the next discussion session will be on Matlab and Octave, so show up for that if you want to see a tutorial. Okay. See you guys in the next class.
[End of Audio]
Duration: 74 minutes