And so let's work out what the dual problem is. And to do that, I need to figure out what theta D of alpha is, and again, there are no betas here, so theta D of alpha is the min with respect to w, b of L of w, b, alpha. So the dual problem is to maximize theta D as a function of alpha. So let's work out what theta D is, and then that'll give us our dual problem.
So then to work out what this is, what do you need to do? We need to take the Lagrangian and minimize it as a function of w and b. So what is this?
How do you minimize the Lagrangian? So in order to minimize the Lagrangian as a function of w and b, we do the usual thing. We take the derivatives of the Lagrangian with respect to w and b, and we set them to 0. That's how we minimize the Lagrangian with respect to w and b.
So take the derivative with respect to w of the Lagrangian. And I'll just write down the answer. You know how to do calculus like this. So I wanna minimize this function of w, so I take the derivative and set it to 0. And I get that. And then so this implies that w must be that.
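Written out, the step on the board probably looks like the following. This is a sketch that assumes the Lagrangian defined earlier in the lecture has the standard optimal margin classifier form; that definition is not restated in this part of the transcript, so the first line is an assumption.

```latex
\[
\mathcal{L}(w, b, \alpha)
  = \frac{1}{2} w^T w - \sum_{i=1}^{m} \alpha_i \left[ y_i \left( w^T x_i + b \right) - 1 \right]
\]
\[
\nabla_w \mathcal{L}(w, b, \alpha) = w - \sum_{i=1}^{m} \alpha_i y_i x_i = 0
\quad \Longrightarrow \quad
w = \sum_{i=1}^{m} \alpha_i y_i x_i
\]
```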
And so w, therefore, is actually a linear combination of your input feature vectors xi. This is a sum of the xi's, which are the examples in your training set, with weights given by the alpha i's times the labels yi. And this will be useful later.
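To make the "linear combination" point concrete, here is a minimal numerical sketch. It assumes the standard Lagrangian 1/2 w transpose w minus the sum over i of alpha i times (yi (w transpose xi + b) minus 1), and it uses made-up random data; the names `lagrangian`, `numerical_grad_w`, and `w_star` are hypothetical, not from the lecture. It checks by finite differences that the gradient with respect to w vanishes at w equal to the sum of alpha i yi xi.

```python
import numpy as np

def lagrangian(w, b, alpha, y, X):
    """Assumed SVM Lagrangian: 1/2 w^T w - sum_i alpha_i [y_i (w^T x_i + b) - 1]."""
    margins = y * (X @ w + b)
    return 0.5 * w @ w - np.sum(alpha * (margins - 1.0))

def numerical_grad_w(w, b, alpha, y, X, eps=1e-5):
    """Central-difference estimate of the gradient of the Lagrangian in w."""
    grad = np.zeros_like(w)
    for k in range(w.size):
        step = np.zeros_like(w)
        step[k] = eps
        grad[k] = (lagrangian(w + step, b, alpha, y, X)
                   - lagrangian(w - step, b, alpha, y, X)) / (2 * eps)
    return grad

# Made-up toy data: 5 examples with 3 features each, labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = np.array([1.0, 1.0, -1.0, -1.0, 1.0])
alpha = rng.uniform(0.0, 1.0, size=5)  # arbitrary nonnegative multipliers
b = 0.3                                # arbitrary; the w-gradient does not depend on b

# w as a linear combination of the training examples.
w_star = (alpha * y) @ X

grad_at_w_star = numerical_grad_w(w_star, b, alpha, y, X)
print(np.allclose(grad_at_w_star, 0.0, atol=1e-6))
```

The check holds for any choice of alpha and b, since the stationarity condition in w only ties w to the alpha i's, the labels, and the examples.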
The other equation we have is, here, the partial derivative of the Lagrangian with respect to b is equal to minus the sum from i equals 1 to m of yi alpha i. And so I'll just set that equal to 0. And so these are my two constraints. And so [inaudible].
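In symbols, and again assuming the standard form of the Lagrangian from earlier in the lecture, the condition on b would read:

```latex
\[
\frac{\partial}{\partial b} \mathcal{L}(w, b, \alpha)
  = - \sum_{i=1}^{m} \alpha_i y_i = 0
\]
```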
So what I'm going to do is I'm actually going to take these two constraints. I'm gonna take what I've worked out to be the value for w, and I'll plug it back in there to figure out what the Lagrangian really is when I minimize with respect to w. [Inaudible] and I'll deal with b in a second.
So let's see. So my Lagrangian is 1/2 w transpose w, minus the constraint terms. So this first term, w transpose w, this becomes the sum from i equals one to m of alpha i, yi, xi, transpose, times that same sum again. This is just putting in the value for w that I worked out previously.
But since this is w transpose w, when I expand out this quadratic function, and when I plug in w over there as well, I find that I have that. Oh, and I'm using these angle brackets to denote the inner product, so this thing here just means the inner product, xi transpose xj. And the first and second terms are actually the same except for their coefficients, so together they give minus one half of that sum. So this simplifies to be equal to that.
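For reference, here is a sketch of what the expansion likely looks like. It assumes the standard Lagrangian form, with w replaced by the sum of alpha i, yi, xi:

```latex
\[
\mathcal{L}(w, b, \alpha)
  = \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} y_i y_j \alpha_i \alpha_j \langle x_i, x_j \rangle
  - \sum_{i=1}^{m} \sum_{j=1}^{m} y_i y_j \alpha_i \alpha_j \langle x_i, x_j \rangle
  - b \sum_{i=1}^{m} \alpha_i y_i
  + \sum_{i=1}^{m} \alpha_i
\]
\[
  = \sum_{i=1}^{m} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} y_i y_j \alpha_i \alpha_j \langle x_i, x_j \rangle
  - b \sum_{i=1}^{m} \alpha_i y_i
\]
```

The term multiplying b drops out once the sum of alpha i yi is set to 0, which is the condition obtained from the derivative with respect to b.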
So let me go ahead and call this W of alpha. My dual problem is, therefore, the following. I want to maximize W of alpha, which is that [inaudible]. And I realize the notation is somewhat unfortunate. I'm using capital W of alpha to denote that formula I wrote down earlier.
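As a sanity check on W of alpha, here is a minimal sketch on made-up data. It assumes the standard Lagrangian form and takes W of alpha to be the sum of the alpha i's minus one half of the double sum of yi yj alpha i alpha j times the inner product of xi and xj. The names `dual_objective`, `lagrangian`, and `b_any` are hypothetical. When the sum of alpha i yi is 0, the Lagrangian evaluated at w equal to the sum of alpha i yi xi should match W of alpha for any b.

```python
import numpy as np

def lagrangian(w, b, alpha, y, X):
    """Assumed SVM Lagrangian: 1/2 w^T w - sum_i alpha_i [y_i (w^T x_i + b) - 1]."""
    margins = y * (X @ w + b)
    return 0.5 * w @ w - np.sum(alpha * (margins - 1.0))

def dual_objective(alpha, y, X):
    """W(alpha) = sum_i alpha_i - 1/2 sum_ij y_i y_j alpha_i alpha_j <x_i, x_j>."""
    K = X @ X.T        # Gram matrix of inner products <x_i, x_j>
    v = alpha * y      # elementwise alpha_i * y_i
    return alpha.sum() - 0.5 * v @ K @ v

# Made-up toy data: 6 examples with 2 features, three per class.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

# Nonnegative alphas, rescaled on the negative class so that sum_i alpha_i y_i = 0.
alpha = rng.uniform(0.1, 1.0, size=6)
alpha[y < 0] *= alpha[y > 0].sum() / alpha[y < 0].sum()

w_star = (alpha * y) @ X   # minimizer over w for this alpha
b_any = 3.7                # arbitrary: b drops out when sum_i alpha_i y_i = 0

L_value = lagrangian(w_star, b_any, alpha, y, X)
W_value = dual_objective(alpha, y, X)
print(np.isclose(L_value, W_value))
```

Writing the double sum as a quadratic form in the Gram matrix keeps the code short; it also makes visible that W of alpha depends on the training examples only through their inner products.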
And then we also had our lowercase w. The original [inaudible] is the primal problem. Lowercase w transpose xi. So uppercase and lowercase w are totally different things. Unfortunately, the notation is standard as well, as far as I know. So the dual problem is that, subject to the alpha i's being greater than or equal to 0, and we also have that the sum over i of yi alpha i is equal to 0.
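Put together, the dual problem described here would be written roughly as follows. This is a sketch assuming the standard form of W of alpha for the optimal margin classifier:

```latex
\[
\max_{\alpha} \; W(\alpha)
  = \sum_{i=1}^{m} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} y_i y_j \alpha_i \alpha_j \langle x_i, x_j \rangle
\]
\[
\text{s.t.} \quad \alpha_i \ge 0, \quad i = 1, \ldots, m,
\qquad
\sum_{i=1}^{m} \alpha_i y_i = 0
\]
```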