
So I’m going to assume that our training examples were drawn IID from some probability distribution, script D. Well, same thing for spam: if you’re trying to build a spam classifier, then this would be the distribution over what emails look like, together with whether they are spam or not. And in particular, to understand, or simplify — to understand the phenomena of bias and variance, I’m actually going to use a simplified model of machine learning. In particular, logistic regression fits the parameters theta of a model like this by maximizing the log likelihood. But in order to understand learning algorithms more deeply, I’m just going to assume a simplified model of machine learning, so let me just write that down. I’m going to define training error as — so this is the training error of a hypothesis h subscript theta. I’ll write this epsilon hat of h subscript theta. If I want to make the dependence on the training set explicit, I’ll write this with a subscript S there, where S is the training set. And I’ll define this as, let’s see, one over m times the sum from i equals one to m of the indicator that h subscript theta of x superscript i is not equal to y superscript i. Okay. I hope the notation is clear. This is a sum of indicator functions for whether your hypothesis misclassifies the i-th example.
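
To make that definition concrete, here is a minimal sketch in Python — not from the lecture itself, with illustrative names training_error, h, X, and y — of the training error epsilon hat of h just described.

    import numpy as np

    def training_error(h, X, y):
        """Training error (empirical risk): the fraction of the m training
        examples that the hypothesis h misclassifies,
        (1/m) * sum_i 1{h(x^(i)) != y^(i)}."""
        predictions = np.array([h(x) for x in X])  # h maps an input x to a label in {0, 1}
        return np.mean(predictions != y)           # average of the 0/1 indicators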

And so when you divide by m, this is just the fraction of training examples in your training set that your hypothesis misclassifies, and that’s defined as the training error. Training error is also called the empirical risk. The simplified model of machine learning I’m going to talk about is called empirical risk minimization. In particular, I’m going to assume that the way my learning algorithm works is that it will choose the parameters theta that minimize my training error. Okay? And it will be this learning algorithm that we’ll prove properties about. And it turns out that you can think of this as the most basic learning algorithm: the algorithm that minimizes your training error.
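
As a rough illustration of what empirical risk minimization means, here is a sketch — with made-up names and data, not from the lecture — that restricts the search to a small finite set of candidate parameter vectors and simply picks the theta whose linear classifier has the lowest training error. Exact minimization over all theta is the hard problem discussed next.

    import itertools
    import numpy as np

    def training_error(h, X, y):
        # Fraction of training examples the hypothesis h misclassifies.
        return np.mean(np.array([h(x) for x in X]) != y)

    def erm(X, y, candidate_thetas):
        """Empirical risk minimization over a finite candidate set: return the
        theta whose linear classifier h_theta(x) = 1{theta^T x >= 0} has the
        smallest training error on the training set (X, y)."""
        best_theta, best_err = None, float("inf")
        for theta in candidate_thetas:
            h = lambda x, t=theta: int(np.dot(t, x) >= 0)
            err = training_error(h, X, y)
            if err < best_err:
                best_theta, best_err = theta, err
        return best_theta, best_err

    # Tiny illustrative run: four 2-D examples and a coarse grid of parameter vectors.
    X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
    y = np.array([1, 1, 0, 0])
    grid = [np.array(t) for t in itertools.product([-1.0, 0.0, 1.0], repeat=2)]
    theta_hat, err_hat = erm(X, y, grid)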

It turns out that logistic regression and support vector machines can be formally viewed as approximations to this. If you actually want to do this directly, this is a non-convex optimization problem — it’s actually NP-hard to solve this optimization problem exactly. And logistic regression and support vector machines can both be viewed as approximations to this non-convex optimization problem, obtained by finding a convex approximation to it. So think of this as similar to what algorithms like logistic regression are doing. Now let me take that definition of empirical risk minimization and rewrite it in a different, equivalent way. For the results I want to prove today, it turns out that it will be useful to think of our learning algorithm not as choosing a set of parameters, but as choosing a function. So let me say what I mean by that. Let me define the hypothesis class, script H, as the class of all hypotheses — in other words, the class of all linear classifiers — that your learning algorithm is choosing from. Okay? So h subscript theta is a specific linear classifier, and each of these is a function mapping from the input domain X to the set {0, 1}. Each of these is a function, and as you vary the parameters theta, you get different functions. And so let me define the hypothesis class script H to be the class of all functions that, say, logistic regression can choose from. Okay.
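
To see, in a rough sketch with illustrative names (again not from the lecture), why logistic regression can be viewed as a convex approximation to this non-convex problem, compare the 0-1 term that empirical risk minimization sums over the examples with the log-likelihood term that logistic regression sums; only the latter is convex in theta.

    import numpy as np

    def zero_one_term(theta, x, y):
        """Per-example term in the ERM objective: 1{h_theta(x) != y}, with
        h_theta(x) = 1{theta^T x >= 0}.  Piecewise constant, hence non-convex, in theta."""
        return float(int(np.dot(theta, x) >= 0) != y)

    def logistic_term(theta, x, y):
        """Per-example negative log likelihood of logistic regression,
        -[y log g(theta^T x) + (1 - y) log(1 - g(theta^T x))], with g the sigmoid.
        Convex in theta, so it serves as a tractable surrogate for the 0-1 term."""
        g = 1.0 / (1.0 + np.exp(-np.dot(theta, x)))
        return -(y * np.log(g) + (1 - y) * np.log(1.0 - g))

    # Both parameter vectors misclassify this example, so the 0-1 term is 1 for each;
    # the convex surrogate still distinguishes "barely wrong" from "badly wrong".
    x, y = np.array([1.0, 2.0]), 1
    for theta in (np.array([-0.1, -0.1]), np.array([-2.0, -2.0])):
        print(zero_one_term(theta, x, y), logistic_term(theta, x, y))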

