All right. So let's take those results and apply them to the optimal margin optimization problem that we had previously. I was going to say one word about this KKT complementary condition, which is that at your solution you must have that alpha star i times gi of w star is equal to 0.
So the product of two numbers is equal to 0; that means that at least one of them must be equal to 0. For the product of two things to be equal to 0, that's just saying either alpha i or gi of w star is equal to 0.
So what that implies is that the Karush-Kuhn-Tucker condition – most people just say KKT, but I wanted to show you the right spelling of their names – so the KKT complementary condition implies that if alpha star i is not 0, that necessarily implies that gi of w star is equal to 0. Now, all the KKT condition guarantees is that at least one of them is 0; it may actually be the case that both alpha star i and gi of w star are equal to 0. But in practice, when you solve this optimization problem, you find that to a large part, alpha star i is not equal to 0 if and only if gi of w star is equal to 0.
This is not strictly true, because it's possible that both of these may be 0. But in practice, when we solve problems like these, for the most part exactly one of them will be non-zero.
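Written out compactly, the condition just described and the implication drawn from it look like this (same alpha star i and gi of w star as above):

```latex
% Complementary slackness at the optimum (w^*, \alpha^*):
\alpha_i^{*}\, g_i(w^{*}) = 0 \quad \text{for all } i,
% and since a product is zero only if one of its factors is zero,
\alpha_i^{*} \neq 0 \;\Rightarrow\; g_i(w^{*}) = 0,
\qquad
g_i(w^{*}) \neq 0 \;\Rightarrow\; \alpha_i^{*} = 0 .
```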
And also, when gi of w star is equal to 0, we say that gi is an active constraint, because our constraint was that gi of w must be less than or equal to 0. So if it holds with equality at the solution, we say that it's an active constraint.
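As a quick illustration of this active-constraint picture (not from the lecture itself), here's a small Python sketch: it fits a linear SVM with a very large C to approximate the hard-margin problem on a made-up toy dataset, then checks that the points with nonzero alpha i – the support vectors – are exactly the ones whose constraint yi(w transpose xi + b) >= 1 is active, i.e. holds with (approximate) equality. The dataset, the C value, and the use of scikit-learn are all illustrative assumptions.

```python
# Sketch: complementary slackness / active constraints on a toy dataset.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1e6)   # very large C ~ hard-margin SVM
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
margins = y * (X @ w + b)           # y_i (w^T x_i + b) for every point

# Support vectors (alpha_i > 0) should sit on the margin: y_i (w^T x_i + b) ~= 1.
print("support vector margins:", margins[clf.support_])
# All other points (alpha_i = 0) satisfy the constraint strictly (> 1).
print("smallest non-support-vector margin:", np.delete(margins, clf.support_).min())
```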
Once we talk about [inaudible], we'll come back and extend this idea a little bit more. I'll return to this board in a second, but let's go back and work out the primal and the dual optimization problems for our optimal margin classifier – for the optimization problem that we worked on just now.
As a point of notation: in what I've been writing down so far in deriving the KKT conditions, my Lagrange multipliers were alpha i and beta i. It turns out that when this is applied to the SVM, we only have one set of Lagrange multipliers, alpha i, because the SVM problem has only inequality constraints and no equality constraints.
And also, as I was working out the KKT conditions, I used w to denote the parameters of my primal optimization problem – I wanted to minimize f of w, so in my very first optimization problem the problem was over the parameters w.
In my SVM problem, I'm actually going to have two sets of parameters, w and b. So just keep that slight notation change in mind.
So the problem we worked out previously was: we want to minimize one half the norm of w squared – we just add the one half there by convention, because it makes the math work out a little nicer – subject to the constraints that yi times (w transpose xi plus b) must be greater than or equal to 1.
And so let me just take this constraint and rewrite it as gi of w, b. Again, previously I had gi of w, but now I have parameters w and b. So gi of w, b is defined as 1 minus yi times (w transpose xi plus b), and the constraint is that this must be less than or equal to 0.
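For completeness, here is a sketch (again, not part of the lecture) of this primal optimal margin problem written out with cvxpy as a generic convex solver. The function name and the assumption that (X, y) is a linearly separable dataset – for example the toy data from the sketch above – are made up for illustration. The dual values the solver returns for the constraint play the role of the alpha i's, so you can check the complementary slackness condition from above directly.

```python
# Sketch: primal optimal margin problem
#   minimize (1/2) ||w||^2  subject to  g_i(w, b) = 1 - y_i (w^T x_i + b) <= 0.
import cvxpy as cp
import numpy as np

def optimal_margin_classifier(X, y):
    n, d = X.shape
    w = cp.Variable(d)
    b = cp.Variable()
    # 1 - y_i (w^T x_i + b) <= 0, i.e. y_i (w^T x_i + b) >= 1
    constraints = [cp.multiply(y, X @ w + b) >= 1]
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)), constraints)
    prob.solve()
    alphas = constraints[0].dual_value   # Lagrange multipliers alpha_i
    return w.value, b.value, alphas

# Complementary slackness check: alpha_i is (essentially) nonzero only for
# the points where y_i (w^T x_i + b) is (essentially) equal to 1.
```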