And again, using my vector notation, I'll write this as g of w less than or equal to 0, and h of w equal to 0. So in this case, we now have inequality constraints as well as equality constraints.
I then have a Lagrangian, or actually it's still called a generalized Lagrangian, which is now a function of my original optimization parameters w, as well as two sets of Lagrange multipliers, alpha and beta. And so this will be f of w, plus the sum over i of alpha i times g i of w, plus the sum over i of beta i times h i of w.
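As a minimal sketch, the generalized Lagrangian can be written in Python like this (the toy objective and constraints below are hypothetical, purely for illustration):

```python
import numpy as np

def lagrangian(f, g, h, w, alpha, beta):
    """Generalized Lagrangian:
    L(w, alpha, beta) = f(w) + sum_i alpha_i * g_i(w) + sum_i beta_i * h_i(w),
    where f is the objective, g returns the vector of inequality constraints
    (g_i(w) <= 0), and h returns the vector of equality constraints (h_i(w) = 0)."""
    return f(w) + np.dot(alpha, g(w)) + np.dot(beta, h(w))

# Hypothetical toy problem: minimize w^2 subject to w >= 1,
# written as g_1(w) = 1 - w <= 0, with no equality constraints.
f = lambda w: w ** 2
g = lambda w: np.array([1.0 - w])
h = lambda w: np.array([])

# f(2) = 4 and alpha_1 * g_1(2) = 0.5 * (-1) = -0.5, so L = 3.5
L = lagrangian(f, g, h, 2.0, np.array([0.5]), np.array([]))
```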
Now, here's the cool part. I'm going to define theta subscript P of w to be equal to the max over alpha and beta, subject to the constraints that the alpha i's are greater than or equal to 0, of the Lagrangian.
And so I want you to consider the optimization problem min over w of max over alpha, beta, such that the alpha i's are greater than or equal to 0, of the Lagrangian. And that's just equal to min over w of theta P of w.
And just to give this a name, the subscript P here stands for primal. And that refers to this entire thing: the optimization problem I've just written down is called the primal problem. This is the original optimization problem that we're solving. And later on, I'll derive another version of this, but that's what P stands for. This is the primal problem.
Now, I want you to consider theta P again. And in particular, I want to consider the problem of minimizing, as a function of w, this quantity theta P. So let's look at what theta P of w is. Let's pick a value of w and ask: what is theta P of w? Notice that if g i of w is greater than 0, that is, if w violates one of your primal problem's constraints, then theta P of w will be infinity. Why is that?
Let's see why in a second. Suppose I pick a value of w that violates one of these constraints, so g i of w is positive. Then, well, theta P maximizes this function of alpha and beta, the Lagrangian. So if one of these g i of w's is positive, then by setting the corresponding alpha i to plus infinity, I can make this arbitrarily large. And so if w violates one of my primal problem's constraints in one of the g i's, then the max over alpha of this Lagrangian will be plus infinity.
And in a similar way, if h i of w is not equal to 0, then theta P of w will also be infinity, for a very similar reason. Because if h i of w is not equal to 0 for some value of i, then in my Lagrangian I have a beta i times h i of w term. And so by setting beta i to be plus infinity or minus infinity, depending on the sign of h i, I can make this plus infinity as well.
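To see the blow-up concretely, here is a hypothetical one-constraint example in Python: for a w that violates g 1 of w less than or equal to 0, growing alpha 1 grows the Lagrangian without bound.

```python
import numpy as np

# Hypothetical toy problem: minimize w^2 subject to w >= 1,
# i.e. g_1(w) = 1 - w <= 0.
f = lambda w: w ** 2
g = lambda w: np.array([1.0 - w])

w = 0.0  # infeasible: g_1(0) = 1 > 0, the constraint is violated

# The Lagrangian f(w) + alpha_1 * g_1(w) grows with alpha_1,
# since g_1(w) > 0; here each value is 0 + alpha * 1.
values = [f(w) + alpha * g(w)[0] for alpha in [1.0, 10.0, 100.0, 1000.0]]
# values == [1.0, 10.0, 100.0, 1000.0], unbounded as alpha -> infinity
```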
And otherwise, theta P of w is just equal to f of w. It turns out that if I have a value of w that satisfies all of the g i and h i constraints, then the max over alpha and beta of the Lagrangian will actually be attained by setting all the Lagrange multiplier terms to 0, and so theta P is just left with f of w.
Thus, theta P of w is equal to f of w if the constraints, the g i and h i constraints, are satisfied, and is equal to plus infinity otherwise.
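Putting the two cases together, theta P can be written in this closed form directly. A sketch in Python, again using a hypothetical toy problem:

```python
import numpy as np

def theta_p(f, g, h, w, tol=1e-9):
    """theta_P(w) = f(w) if w is feasible (all g_i(w) <= 0 and all h_i(w) = 0),
    and +infinity otherwise."""
    feasible = np.all(g(w) <= tol) and np.all(np.abs(h(w)) <= tol)
    return f(w) if feasible else np.inf

# Hypothetical toy problem: minimize w^2 subject to w >= 1.
f = lambda w: w ** 2
g = lambda w: np.array([1.0 - w])  # g_1(w) = 1 - w <= 0
h = lambda w: np.array([])         # no equality constraints

print(theta_p(f, g, h, 2.0))  # feasible, so theta_P = f(2) = 4.0
print(theta_p(f, g, h, 0.0))  # infeasible, so theta_P = inf
```

Minimizing theta P over w then recovers the original constrained problem: infeasible points are ruled out by the infinite value, and feasible points are scored by f.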