Consider a pair $\{X, Y\}$ with a joint distribution. A value $X(\omega)$ is observed, and we wish to estimate the corresponding value $Y(\omega)$. The best that can be hoped for is some estimate based on an average of the errors, or on the average of some function of the errors. The most common measure of error is the mean (expectation) of the square of the error. This has two important properties: it treats positive and negative errors alike, and it weights large errors more heavily than smaller ones. In general, we seek a rule (function) $r$ such that the estimate is $r(X(\omega))$. That is, we seek a function $r$ such that the expectation of the square of $Y - r(X)$ is a minimum. The problem of determining such a function is known as the regression problem. Linear regression: we seek the best straight line function (the regression line of $Y$ on $X$) of the form $u = r(t) = at + b$, such that the mean square of $Y - r(X)$ is a minimum. Matlab approximation procedures are compared with analytic results. More general linear regression is considered.

Linear regression

Suppose that a pair $\{X, Y\}$ of random variables has a joint distribution. A value $X(\omega)$ is observed. It is desired to estimate the corresponding value $Y(\omega)$. Obviously there is no rule for determining $Y(\omega)$ unless $Y$ is a function of $X$. The best that can be hoped for is some estimate based on an average of the errors, or on the average of some function of the errors.

Suppose $X(\omega)$ is observed, and by some rule an estimate $\hat{Y}(\omega)$ is returned. The error of the estimate is $Y(\omega) - \hat{Y}(\omega)$. The most common measure of error is the mean of the square of the error

$$E[(Y - \hat{Y})^2]$$

The choice of the mean square has two important properties: it treats positive and negative errors alike, and it weights large errors more heavily than smaller ones. In general, we seek a rule (function) $r$ such that the estimate $\hat{Y}(\omega)$ is $r(X(\omega))$. That is, we seek a function $r$ such that

$$E[(Y - r(X))^2] \text{ is a minimum.}$$
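To make the criterion concrete, the following minimal sketch evaluates the mean squared error of two candidate rules on a small discrete joint distribution. The distribution, the two rules, and all function names are assumptions chosen only for illustration; they do not come from the text.

```python
# Hypothetical joint distribution (illustration only):
# each entry is (x, y, P(X = x, Y = y)); the probabilities sum to 1.
joint = [
    (0, 0, 0.25),
    (0, 1, 0.25),
    (1, 1, 0.25),
    (1, 3, 0.25),
]


def mean_squared_error(rule, joint):
    """Return E[(Y - rule(X))^2] for a discrete joint distribution."""
    return sum(p * (y - rule(x)) ** 2 for x, y, p in joint)


def constant_rule(x):
    """Ignore the observation and always return E[Y] = 1.25."""
    return 1.25


def line_rule(x):
    """An arbitrary candidate straight line, u = 1.5 t + 0.5."""
    return 1.5 * x + 0.5


mse_constant = mean_squared_error(constant_rule, joint)
mse_line = mean_squared_error(line_rule, joint)

print("MSE of constant rule:", mse_constant)
print("MSE of line rule:", mse_line)
```

For this particular distribution the rule that uses the observed value of X has the smaller mean squared error, which is the sense in which one rule is "better" than another here.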

The problem of determining such a function is known as the regression problem. In the unit on Regression, we show that this problem is solved by the conditional expectation of $Y$, given $X$. At this point, we seek an important partial solution.

The regression line of Y on X

We seek the best straight line function for minimizing the mean squared error. That is, we seek a function $r$ of the form $u = r(t) = at + b$. The problem is to determine the coefficients $a, b$ such that

$$E[(Y - aX - b)^2] \text{ is a minimum.}$$

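We write the error in a special form, then square and take the expectation. Set $\beta = b - (\mu_Y - a\mu_X)$. Then

$$\text{Error} = Y - aX - b = (Y - \mu_Y) - a(X - \mu_X) + \mu_Y - a\mu_X - b = (Y - \mu_Y) - a(X - \mu_X) - \beta$$

$$\text{Error squared} = (Y - \mu_Y)^2 + a^2 (X - \mu_X)^2 + \beta^2 - 2\beta(Y - \mu_Y) + 2a\beta(X - \mu_X) - 2a(Y - \mu_Y)(X - \mu_X)$$

Since $E[Y - \mu_Y] = E[X - \mu_X] = 0$, the two terms that are linear in the centered variables vanish in expectation, leaving

$$E[(Y - aX - b)^2] = \sigma_Y^2 + a^2 \sigma_X^2 + \beta^2 - 2a\,\mathrm{Cov}[X, Y]$$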

The term $\beta^2$ is smallest when $\beta = 0$, and standard procedures for determining a minimum with respect to $a$ show that the minimum occurs for

$$a = \frac{\mathrm{Cov}[X, Y]}{\mathrm{Var}[X]} \qquad b = \mu_Y - a\mu_X$$
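The minimization step can be written out as follows. This is a sketch that assumes $\mathrm{Var}[X] > 0$ and writes $M(a, \beta)$ for the mean squared error, with $\beta = b - (\mu_Y - a\mu_X)$; the symbol $M$ is introduced here only for convenience. The last line, the value of the minimum, is a standard consequence that is not stated in the text above.

```latex
\begin{align*}
M(a, \beta) &= \sigma_Y^2 + a^2 \sigma_X^2 + \beta^2 - 2a\,\mathrm{Cov}[X, Y],
  \qquad \beta = b - (\mu_Y - a\mu_X) \\
\beta^2 \ge 0 &\implies \text{choose } \beta = 0, \text{ i.e. } b = \mu_Y - a\mu_X \\
\frac{\partial M}{\partial a} &= 2a\sigma_X^2 - 2\,\mathrm{Cov}[X, Y] = 0
  \implies a = \frac{\mathrm{Cov}[X, Y]}{\sigma_X^2} \\
\frac{\partial^2 M}{\partial a^2} &= 2\sigma_X^2 > 0 \quad \text{(so this is a minimum)} \\
M_{\min} &= \sigma_Y^2 - \frac{\mathrm{Cov}^2[X, Y]}{\sigma_X^2} = \sigma_Y^2 (1 - \rho^2)
\end{align*}
```

Because $b$ can always be chosen to make $\beta = 0$ for any value of $a$, the two coefficients can be handled one at a time in this way.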

Thus the optimum line, called the regression line of $Y$ on $X$, is

$$u = \frac{\mathrm{Cov}[X, Y]}{\mathrm{Var}[X]}(t - \mu_X) + \mu_Y = \rho\,\frac{\sigma_Y}{\sigma_X}(t - \mu_X) + \mu_Y = \alpha(t)$$

The second form is commonly used to define the regression line. For certain theoretical purposes, this is the preferred form. But for calculation, the first form is usually the more convenient. Only the covariance (which requires both means) and the variance of $X$ are needed. There is no need to determine $\mathrm{Var}[Y]$ or $\rho$.
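The calculation described above can be sketched in a few lines. The example below computes the regression line from the first form (covariance over variance) and then, only as a check, recomputes the slope from the second form. The joint distribution and all function and variable names are assumptions made for illustration.

```python
import math

# Hypothetical joint distribution (illustration only):
# each entry is (x, y, P(X = x, Y = y)).
joint = [
    (0, 0, 0.25),
    (0, 1, 0.25),
    (1, 1, 0.25),
    (1, 3, 0.25),
]


def expectation(g, joint):
    """Return E[g(X, Y)] for a discrete joint distribution."""
    return sum(p * g(x, y) for x, y, p in joint)


def regression_line(joint):
    """Return (a, b) for the regression line u = a t + b of Y on X.

    Uses the first form: only the two means, Cov[X, Y], and Var[X].
    Assumes Var[X] > 0.
    """
    mu_x = expectation(lambda x, y: x, joint)
    mu_y = expectation(lambda x, y: y, joint)
    var_x = expectation(lambda x, y: x * x, joint) - mu_x ** 2
    cov_xy = expectation(lambda x, y: x * y, joint) - mu_x * mu_y
    a = cov_xy / var_x
    b = mu_y - a * mu_x
    return a, b


a, b = regression_line(joint)
print("slope a =", a, " intercept b =", b)

# Check against the second form, a = rho * sigma_Y / sigma_X.
# This needs Var[Y] and rho, which the first form avoids.
mu_x = expectation(lambda x, y: x, joint)
mu_y = expectation(lambda x, y: y, joint)
sigma_x = math.sqrt(expectation(lambda x, y: x * x, joint) - mu_x ** 2)
sigma_y = math.sqrt(expectation(lambda x, y: y * y, joint) - mu_y ** 2)
cov_xy = expectation(lambda x, y: x * y, joint) - mu_x * mu_y
rho = cov_xy / (sigma_x * sigma_y)
a_second_form = rho * sigma_y / sigma_x
print("slope from second form =", a_second_form)
```

The second half of the sketch needs two square roots and a correlation coefficient to arrive at the same slope, which illustrates why the first form is usually the more convenient one for calculation.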

Source:  OpenStax, Applied probability. OpenStax CNX. Aug 31, 2009 Download for free at http://cnx.org/content/col10708/1.6