MachineLearning-Lecture14
Instructor (Andrew Ng): All right. Good morning. Just a couple of quick announcements before I get started. One is you should have seen Ziko’s e-mail yesterday already. Several of you had asked me about – let you know [inaudible], so I wrote those up. We posted them online yesterday. The syllabus for the midterm is everything up to and including last Wednesday’s lecture, so I guess [inaudible] is on the syllabus. You can take a look at the notes if you want. And also a practice midterm has been posted on the course website, so you can take a look at that, too. The midterm will be in Terman auditorium tomorrow at 6:00 p.m. Directions were sort of included – or links to directions were included in Ziko’s e-mail. And we’ll actually start at 6:00 p.m. sharp tomorrow, so do come a little bit before 6:00 p.m. to make sure you’re seated by 6:00 p.m., as we’ll hand out the midterms a few minutes before 6:00 p.m. and we’ll start the midterm at 6:00 p.m. Okay?
Are there any questions about midterms? Any logistical things? Are you guys excited? Are you looking forward to the midterm? All right. Okay. So welcome back, and what I want to do today is wrap up our discussion on factor analysis, and in particular what I want to do is step through parts of the derivation of EM for factor analysis, because again there are a few steps in the EM derivation that are particularly tricky, and there are specific mistakes that people often make when deriving EM algorithms for models like factor analysis. So I wanted to show you how to do those steps right so you can apply the same ideas to other problems as well. And then in the second half or so of this lecture, I’ll talk about principal component analysis, which is a very powerful algorithm for dimensionality reduction. We’ll see later what that means.
So just a recap, in a previous lecture I described a few properties of Gaussian distributions. One was that if you have a random variable – a vector-valued random variable X that can be partitioned into two portions, X1 and X2, and if X is Gaussian with mean mu [inaudible] and covariance sigma, where mu is itself a partitioned vector and sigma is sort of a partitioned matrix that can be written like that. So I’m just writing sigma in terms of the four sub-blocks. Then you can look at the distribution of X and ask what is the marginal distribution of, say, X1. And the answer we said last time was that the marginal distribution of X1 is Gaussian with mean mu one and covariance sigma one one, where sigma one one is the upper left block of that covariance matrix sigma. So this one is no surprise.
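The board contents are not captured in the transcript, so the following is a reconstruction in standard notation rather than a copy of what was written. The block names (\mu_1, \mu_2, \Sigma_{11}, \Sigma_{12}, \Sigma_{21}, \Sigma_{22}) are assumed labels for the two sub-vectors of the mean and the four sub-blocks of the covariance described above.

```latex
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad
\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix},
\qquad x \sim \mathcal{N}(\mu, \Sigma)

% Marginal of the first block: read off the matching sub-blocks.
x_1 \sim \mathcal{N}(\mu_1, \Sigma_{11})
```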
And I also wrote down the formula for computing conditional distributions, such as what is P of X1 given X2, and last time I wrote down that the distribution of X1 given X2 would also be Gaussian, with parameters that I wrote as mu of one given two and sigma of one given two, where mu of one given two is – let’s see [inaudible] this formula. Okay? So with these formulas you’ll be able to take a pair of jointly Gaussian random variables – X1 and X2 here are both vectors – and compute the marginal and conditional distributions, so P of X1 or P of X1 given X2. So when I come back and derive the E step – actually, I’ll come back and use the marginal formula in a second, and then when I come back and derive the E step in the EM algorithm for factor analysis, I’ll actually be using these two formulas again.
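The conditional formula itself is inaudible in the transcript. The standard result for a partitioned Gaussian, which is presumably what was on the board, is mu of one given two = mu_1 + Sigma_12 Sigma_22^{-1} (x_2 - mu_2) and sigma of one given two = Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21. Below is a minimal NumPy sketch of both the marginal and the conditional computation. The function names, the argument `k` (the dimension of X1), and the example numbers are illustrative assumptions, not anything from the lecture.

```python
import numpy as np


def gaussian_marginal(mu, sigma, k):
    """Marginal of the first k coordinates, X1 ~ N(mu_1, Sigma_11)."""
    return mu[:k], sigma[:k, :k]


def gaussian_conditional(mu, sigma, k, x2):
    """Parameters of X1 | X2 = x2 for a jointly Gaussian (X1, X2).

    k is the dimension of X1 (hypothetical argument name); the
    remaining coordinates of mu and sigma belong to X2.
    """
    mu1, mu2 = mu[:k], mu[k:]
    s11 = sigma[:k, :k]
    s12 = sigma[:k, k:]
    s21 = sigma[k:, :k]
    s22 = sigma[k:, k:]
    # Use linear solves rather than forming Sigma_22^{-1} explicitly.
    mu_cond = mu1 + s12 @ np.linalg.solve(s22, x2 - mu2)
    sigma_cond = s11 - s12 @ np.linalg.solve(s22, s21)
    return mu_cond, sigma_cond


# Made-up two-dimensional example: X1 and X2 are both scalars here.
mu = np.array([0.0, 1.0])
sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

mu_marg, sigma_marg = gaussian_marginal(mu, sigma, 1)
mu_cond, sigma_cond = gaussian_conditional(mu, sigma, 1, np.array([3.0]))
print(mu_marg, sigma_marg)   # [0.] [[2.]]
print(mu_cond, sigma_cond)   # [1.] [[1.5]]
```

In the example, observing X2 = 3 (two units above its mean) shifts the mean of X1 from 0 to 1 and shrinks its variance from 2 to 1.5. The conditional covariance does not depend on the observed value of X2.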