But again, if you see me write some symbol and you don't quite remember what it means, chances are there are others in this class who've forgotten too. So please raise your hand and ask if you're ever wondering what some symbol means. Any questions you have about any of this?
Yeah?
Student: The variable can be anything? [Inaudible]?
Instructor (Andrew Ng) :Say that again.
Student: [inaudible] zero theta one?
Instructor (Andrew Ng) :Right, so, well, let me – this was going to be next, but the theta, or the theta i's, are called the parameters. The thetas are called the parameters of our learning algorithm, and theta zero, theta one, theta two are just real numbers. And then it is the job of the learning algorithm to use the training set to choose, or to learn, appropriate parameters theta.
Okay, are there other questions?
Student: What does [inaudible]?
Instructor (Andrew Ng) :Oh, transpose. Oh yeah, sorry. When [inaudible] theta and theta transpose X, theta [inaudible].
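To make the notation in this exchange concrete, here is a minimal Python sketch of a linear hypothesis written as theta transpose X, that is, the inner product of the parameter vector with the feature vector. The function name `hypothesis`, the convention that the first feature is fixed at 1 so that theta zero acts as an intercept, and all the numbers are assumptions chosen for illustration, not values from the lecture.

```python
def hypothesis(theta, x):
    """Return theta^T x: the sum of theta[i] * x[i] over all i."""
    if len(theta) != len(x):
        raise ValueError("theta and x must have the same length")
    return sum(t * xi for t, xi in zip(theta, x))


# Made-up parameters theta0, theta1, theta2 (just real numbers).
theta = [50.0, 0.1, 20.0]

# Assumed convention: x0 = 1, followed by two made-up house features.
x = [1.0, 2000.0, 3.0]

prediction = hypothesis(theta, x)
print(prediction)
```

Here the parameters are typed in by hand; in the lecture's setting, the learning algorithm would choose them from the training set.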
Student: Is this like a [inaudible] hypothesis [inaudible], or would you have higher orders? Or would theta [inaudible]?
Instructor (Andrew Ng) :All great questions. The answer – so the question was, is this a typical hypothesis or can theta be a function of other variables and so on. And the answer is sort of yes. For now, just for this first learning algorithm we'll talk about using a linear hypothesis class. A little bit actually later this quarter, we'll talk about much more complicated hypothesis classes, and we'll actually talk about higher order functions as well, a little bit later today.
Okay, so for the learning problem then. How do we choose the parameters theta so that our hypothesis H will make accurate predictions about all the houses? All right, so one reasonable thing to do seems to be, well, we have a training set. So – and just on the training set, our hypothesis will make some prediction, predictions of the housing prices, of the prices of the houses in the training set.
So one thing we could do is just try to make the predictions of a learning algorithm accurate on a training set. So given some features, X, and some correct prices, Y, we might want to make the squared difference between the prediction of the algorithm and the actual price [inaudible].
So to choose parameters theta, let's say we want to minimize, over the parameters theta, the squared error between the predicted price and the actual price. And so I'm going to fill this in. We have M training examples. So the sum from I equals one through M, over my M training examples, of the price predicted on the Ith house in my training set minus the actual target variable, minus the actual price on the Ith training example.
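Written out, the objective described here can be sketched as follows. The superscript notation for the Ith training example and the symbol h with subscript theta for the hypothesis are assumed conventions; the transcript does not show what was written on the board.

```latex
\min_{\theta} \; \sum_{i=1}^{m} \left( h_{\theta}\!\left(x^{(i)}\right) - y^{(i)} \right)^{2}
```

Each term is the squared difference between the predicted price and the actual price of one house in the training set, and m is the number of training examples.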
And by convention, instead of minimizing this sum of the squared differences, I'm just going to put a one-half there, which will simplify some of the math we do later. Okay, and so let me go ahead and define J of theta to be equal to just the same, one-half the sum from I equals one through M, over the training examples, of the value predicted by my hypothesis minus the actual value, squared.
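As a rough illustration of this definition, the sketch below computes J of theta as one-half the sum of squared differences between predicted and actual values, assuming a linear hypothesis. The names `hypothesis` and `cost`, the convention that each feature vector starts with a 1, and the tiny data set are all hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
def hypothesis(theta, x):
    """Linear hypothesis: the inner product of theta and x."""
    return sum(t * xi for t, xi in zip(theta, x))


def cost(theta, xs, ys):
    """J(theta) = 1/2 * sum over examples of (prediction - actual)^2."""
    return 0.5 * sum((hypothesis(theta, x) - y) ** 2 for x, y in zip(xs, ys))


# Made-up training set: each x starts with the assumed x0 = 1.
xs = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
ys = [2.0, 4.0, 6.0]

print(cost([0.0, 2.0], xs, ys))  # 0.0, these parameters fit every example exactly
print(cost([0.0, 1.0], xs, ys))  # 7.0, i.e. 0.5 * (1 + 4 + 9)
```

Parameters that predict the training prices more accurately give a smaller value of J, which is why choosing theta to minimize J is a reasonable way to fit the training set.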