Student: For the second statement, where we're saying the data of the functional margin is divided [inaudible].
Instructor (Andrew Ng) :Oh, I see, yes.
Student: [Inaudible] is that [inaudible]?
Instructor (Andrew Ng) :So let's see, this is the functional margin, right? This is not the geometric margin.
Student: Yeah.
Instructor (Andrew Ng) :So, oh, I want to divide by the norm of w in my optimization objective.
Student: I'm just wondering how come you end up dividing also under the second stage [inaudible] the functional margin. Why are you dividing there by the norm of w?
Instructor (Andrew Ng) :Let's see. I'm not sure I get the question. Let me try saying this again. So here's my goal. My – I want [inaudible]. So let's see, the parameters of this optimization problem are gamma hat, w, and b. So the convex optimization software solves this problem for some set of parameters gamma hat, w, and b. And I'm imposing the constraint that whatever values it comes up with, y(i) (w^T x(i) + b) must be greater than or equal to gamma hat. And so this means that the functional margin of every example had better be greater than or equal to gamma hat. So that's a constraint relating the functional margin to gamma hat.
But what I care about is not really maximizing the functional margin. What I really care about, in other words my optimization objective, is maximizing gamma hat divided by the norm of w, which is the geometric margin.
So in other words, my optimization [inaudible] is I want to maximize the functional margin divided by the norm of w, subject to the constraint that every example must have functional margin at least gamma hat. Does that make sense now?
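Written out, the problem described above can be stated roughly as follows. This is a reconstruction from the spoken description, not a copy of the board; the superscript indexing and the symbol m for the number of training examples are assumed notation.

```latex
\begin{aligned}
\max_{\hat{\gamma},\, w,\, b} \quad & \frac{\hat{\gamma}}{\lVert w \rVert} \\
\text{s.t.} \quad & y^{(i)} \left( w^{T} x^{(i)} + b \right) \ge \hat{\gamma}, \qquad i = 1, \dots, m
\end{aligned}
```

Here the objective is the geometric margin, and each constraint says that the functional margin of example i is at least gamma hat.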
Student: [Inaudible] when you said to maximize gamma, or gamma hat, with respect to gamma, w, and with respect to gamma hat, so that [inaudible] gamma hat are no longer [inaudible]?
Instructor (Andrew Ng) :So this is how I write down an optimization problem in order to solve for the geometric margin. So it turns out that the question is this: is gamma hat a function of w and b?
And it turns out that in my previous mathematical definition, it was. But the way I'm going to pose this as an optimization problem is I'm going to ask the convex optimization solver, and this [inaudible] software, suppose you have software for solving convex optimization problems, then I'm going to pretend that these are independent variables and ask my convex optimization software to find me values for gamma hat, w, and b that make this value as big as possible, subject to this constraint.
And it'll turn out that when it does that, obviously, it will choose gamma hat to be as big as possible, because the optimization objective is this: you're trying to maximize gamma hat.
So for a fixed value of w and b, my software would choose to make gamma hat as big as possible. Well, but how big can we make gamma hat? Well, it's limited by these constraints. They say that every training example must have functional margin greater than or equal to gamma hat. And so the biggest you can make gamma hat is the value of the smallest functional margin. And so when you solve this optimization problem, the value of gamma hat you get out will be, indeed, the minimum of the functional margins of your training set.
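The argument above can be checked numerically with a minimal sketch. Everything in it is assumed for illustration and is not from the lecture: the toy data, the values of w and b, and the helper names. For a fixed w and b it computes the functional margins, takes their minimum as the largest feasible gamma hat, and shows that anything larger violates a constraint.

```python
import numpy as np

# Hypothetical toy training set: four 2-D points with labels in {-1, +1}.
X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -1.0], [-2.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])


def functional_margins(w, b, X, y):
    """Functional margin y_i * (w . x_i + b) of every training example."""
    return y * (X @ w + b)


def is_feasible(gamma_hat, w, b, X, y):
    """True if every example has functional margin >= gamma_hat."""
    return bool(np.all(functional_margins(w, b, X, y) >= gamma_hat))


# Fixed (w, b), chosen arbitrarily for the illustration.
w = np.array([1.0, 1.0])
b = 0.0

margins = functional_margins(w, b, X, y)

# The largest gamma_hat satisfying all constraints is the smallest margin.
gamma_hat = margins.min()

# Dividing by the norm of w gives the geometric margin (the objective).
geometric_margin = gamma_hat / np.linalg.norm(w)

# Rescaling (w, b) changes the functional margin but not the geometric one.
scaled_gamma_hat = functional_margins(10 * w, 10 * b, X, y).min()
scaled_geometric = scaled_gamma_hat / np.linalg.norm(10 * w)

print("functional margins:", margins)
print("largest feasible gamma_hat:", gamma_hat)
print("geometric margin:", geometric_margin)
print("feasible just above gamma_hat?",
      is_feasible(gamma_hat + 1e-6, w, b, X, y))
```

The rescaling lines illustrate why the objective divides by the norm of w: multiplying w and b by 10 makes the functional margin 10 times larger while the geometric margin stays the same.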