11.6 Facts about the correlation coefficient for linear regression

$H_{o}$ : $ρ$ = 0
$H_{a}$ : $ρ$ ≠ 0
$α$ = 0.05
The p-value is 0.026 (from LinRegTTest on your calculator or from computer software)
The p-value, 0.026, is less than the significance level of $α$ = 0.05
Decision: Reject the Null Hypothesis $H_{o}$
Conclusion: There is sufficient evidence to conclude that there is a significant linear relationship between $x$ and $y$ because the correlation coefficient is significantly different from 0.
Because $r$ is significant and the scatter plot shows a linear trend, the regression line can be used to predict final exam scores.
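The p-value quoted above can be reproduced approximately without a calculator. The sketch below assumes the standard test statistic $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$ with $df = n - 2$, and approximates the two-tailed area under the Student's t density with Simpson's rule. The function names are illustrative only; this is not the algorithm LinRegTTest uses internally.

```python
import math

def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p_value(t, df, steps=2000):
    """Two-tailed p-value: 1 minus the area between -|t| and +|t|.

    The area from 0 to |t| is approximated with Simpson's rule
    (steps must be even) and doubled by symmetry.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3
    return 1 - 2 * half_area

# Third exam/final exam example: r = 0.6631 with n = 11 data points.
r, n = 0.6631, 11
t = t_statistic(r, n)
p = two_tailed_p_value(t, n - 2)
print("t =", round(t, 2), " p-value =", round(p, 3))
print("Reject H0" if p < 0.05 else "Do not reject H0")
```

Comparing `p` with $α$ = 0.05 mirrors the decision step: a p-value below $α$ leads to rejecting $H_{o}$.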

Method 2: using a table of critical values to make a decision

The 95% Critical Values of the Sample Correlation Coefficient Table at the end of this chapter (before the Summary) may be used to give you a good idea of whether the computed value of $r$ is significant. Compare $r$ to the appropriate critical value in the table. If $r$ is not between the positive and negative critical values, then the correlation coefficient is significant. If $r$ is significant, then you may want to use the line for prediction.

Suppose you computed $r = 0.801$ using $n = 10$ data points. $df = n - 2 = 10 - 2 = 8$. The critical values associated with $df = 8$ are -0.632 and +0.632. If $r <$ negative critical value or $r >$ positive critical value, then $r$ is significant. Since $r = 0.801$ and $0.801 > 0.632$, $r$ is significant and the line may be used for prediction. Viewing this example on a number line may help.

Horizontal number line with values of -1, -0.632, 0, 0.632, 0.801, and 1. A dashed line above values -0.632, 0, and 0.632 marks the values that are not significant. $r$ is not significant between -0.632 and +0.632. $r = 0.801 > +0.632$. Therefore, $r$ is significant.

Suppose you computed $r = -0.624$ with 14 data points. $df = 14 - 2 = 12$. The critical values are -0.532 and 0.532. Since $-0.624 < -0.532$, $r$ is significant and the line may be used for prediction.

Horizontal number line with values of -0.624, -0.532, and 0.532. $r = -0.624 < -0.532$. Therefore, $r$ is significant.

Suppose you computed $r = 0.776$ and $n = 6$. $df = 6 - 2 = 4$. The critical values are -0.811 and 0.811. Since $-0.811 < 0.776 < 0.811$, $r$ is not significant and the line should not be used for prediction.

Horizontal number line with values of -0.811, 0.776, and 0.811. $-0.811 < r = 0.776 < 0.811$. Therefore, $r$ is not significant.
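The decision rule used in the three examples above amounts to checking whether $|r|$ exceeds the positive critical value for $df = n - 2$. The following is a minimal sketch of that rule. The dictionary holds only the critical values quoted in this section (it is not the full table), and the names `CRITICAL_VALUES_95` and `is_significant` are made up for illustration.

```python
# 95% critical values quoted in this section, keyed by df = n - 2.
# This is only an excerpt; the full table is at the end of the chapter.
CRITICAL_VALUES_95 = {4: 0.811, 7: 0.666, 8: 0.632, 9: 0.602, 12: 0.532, 17: 0.456}

def is_significant(r, n, table=CRITICAL_VALUES_95):
    """Return True if r lies outside the interval (-critical, +critical)."""
    df = n - 2
    critical = table[df]  # raises KeyError if df is not in the excerpt
    return r < -critical or r > critical

# The three number-line examples: (r, n)
for r, n in [(0.801, 10), (-0.624, 14), (0.776, 6)]:
    verdict = "significant" if is_significant(r, n) else "not significant"
    print(f"r = {r}, n = {n}, df = {n - 2}: {verdict}")
```

The loop reports the first two values of $r$ as significant and the third as not significant, matching the number lines.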

Third exam vs final exam example: critical value method

Consider the third exam/final exam example.
The line of best fit is $\hat{y} = -173.51 + 4.83x$ with $r = 0.6631$, and there are $n = 11$ data points.
Can the regression line be used for prediction? Given a third exam score ( $x$ value), can we use the line to predict the final exam score (predicted $y$ value)?

$H_{o}$ : $ρ$ = 0
$H_{a}$ : $ρ$ ≠ 0
$α$ = 0.05
Use the "95% Critical Value" table for $r$ with $df = n - 2 = 11 - 2 = 9$
The critical values are -0.602 and +0.602
Since $0.6631 > 0.602$ , $r$ is significant.
Decision: Reject $H_{o}$
Conclusion: There is sufficient evidence to conclude that there is a significant linear relationship between $x$ and $y$ because the correlation coefficient is significantly different from 0.
Because $r$ is significant and the scatter plot shows a linear trend, the regression line can be used to predict final exam scores.

Additional practice examples using critical values

Suppose you computed the following correlation coefficients. Using the table at the end of the chapter, determine if $r$ is significant and the line of best fit associated with each $r$ can be used to predict a $y$ value. If it helps, draw a number line.

$r = -0.567$ and the sample size, $n$, is 19. The $df = n - 2 = 17$. The critical value is -0.456. $-0.567 < -0.456$ so $r$ is significant.
$r = 0.708$ and the sample size, $n$ , is 9. The $df = n - 2 = 7$ . The critical value is 0.666. $0.708 > 0.666$ so $r$ is significant.
$r = 0.134$ and the sample size, $n$ , is 14. The $df = 14 - 2 = 12$ . The critical value is 0.532. 0.134 is between -0.532 and 0.532 so $r$ is not significant.
$r = 0$ and the sample size, $n$, is 5. No matter what the $df$ is, $r = 0$ is between the two critical values so $r$ is not significant.
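The critical values used in these examples are consistent with the t-test on the correlation coefficient. Assuming the usual test statistic $t = \frac{r\sqrt{df}}{\sqrt{1-r^2}}$, solving for $r$ gives $r_{crit} = \frac{t_{crit}}{\sqrt{t_{crit}^2 + df}}$. The sketch below checks this against the values quoted in this section. The t critical values (two-tailed, $α$ = 0.05) are taken from a standard t table and are an assumption here, since they do not appear in this section. The function name is illustrative.

```python
import math

# Two-tailed 5% critical t values from a standard t table, keyed by df.
# Assumption: these are not given in this section.
T_CRITICAL = {4: 2.776, 7: 2.365, 8: 2.306, 9: 2.262, 12: 2.179, 17: 2.110}

def critical_r(df, t_table=T_CRITICAL):
    """Convert a critical t value into the matching critical value of r."""
    t = t_table[df]
    return t / math.sqrt(t * t + df)

for df in sorted(T_CRITICAL):
    print(f"df = {df}: critical r is about {critical_r(df):.3f}")
```

Rounded to three decimal places, the results agree with the critical values used in the examples above, such as 0.602 for $df = 9$.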

Assumptions in testing the significance of the correlation coefficient

Testing the significance of the correlation coefficient requires that certain assumptions about the data are satisfied. The premise of this test is that the data are a sample of observed points taken from a larger population. We have not examined the entire population because it is not possible or feasible to do so. We are examining the sample to draw a conclusion about whether the linear relationship that we see between $x$ and $y$ in the sample data provides strong enough evidence so that we can conclude that there is a linear relationship between $x$ and $y$ in the population.

The regression line equation that we calculate from the sample data gives the best fit line for our particular sample. We want to use this best fit line for the sample as an estimate of the best fit line for the population. Examining the scatterplot and testing the significance of the correlation coefficient helps us determine if it is appropriate to do this.

The assumptions underlying the test of significance are:

1. There is a linear relationship in the population that models the average value of $y$ for varying values of $x$. In other words, the expected value of $y$ for each particular $x$ value lies on a straight line in the population. (We do not know the equation of the line for the population. Our regression line from the sample is our best estimate of this line in the population.)
2. The $y$ values for any particular $x$ value are normally distributed about the line. This implies that there are more $y$ values scattered closer to the line than are scattered farther away. Assumption (1) above implies that these normal distributions are centered on the line: the means of these normal distributions of $y$ values lie on the line.
3. The standard deviations of the population $y$ values about the line are equal for each value of $x$. In other words, each of these normal distributions of $y$ values has the same shape and spread about the line.
4. The residual errors are mutually independent (no pattern).

A downward sloping regression line is shown with the $y$ values normally distributed about the line with equal standard deviations for each $x$ value. For each $x$ value, the mean of the $y$ values lies on the regression line. More $y$ values lie near the line than are scattered farther away from the line.
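To make the assumptions concrete, the sketch below simulates a population that satisfies them: for each $x$, the $y$ values are drawn independently from a normal distribution centered on a line, with the same standard deviation at every $x$. The line from the third exam/final exam example is treated here as if it were the population line, and the common standard deviation of 16 is a made-up value chosen only for illustration.

```python
import random
import statistics

def simulate_y_values(x, intercept, slope, sigma, count, rng):
    """Draw `count` independent y values for one x: normal, centered on the line."""
    mean_y = intercept + slope * x
    return [rng.gauss(mean_y, sigma) for _ in range(count)]

rng = random.Random(42)            # fixed seed so the run is repeatable
intercept, slope = -173.51, 4.83   # sample line, treated as the population line (assumption)
sigma = 16.0                       # hypothetical common standard deviation

for x in (65, 70, 75):
    ys = simulate_y_values(x, intercept, slope, sigma, 5000, rng)
    line_value = intercept + slope * x
    print(
        f"x = {x}: line = {line_value:.2f}, "
        f"mean of y = {statistics.mean(ys):.1f}, "
        f"sd of y = {statistics.stdev(ys):.1f}"
    )
```

At each $x$, the simulated mean should fall close to the line and the simulated standard deviation close to the common value, which is the pattern the figure describes.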

**With contributions from Roberta Bloom

Source: OpenStax, Collaborative statistics-parzen remix. OpenStax CNX. Jul 15, 2009 Download for free at http://legacy.cnx.org/content/col10732/1.2
