Linear Regression and Correlation: Homework is part of the Collaborative Statistics collection (col10522) by Barbara Illowsky and Susan Dean.
For each situation below, state the independent variable and the dependent variable.
a. A study is done to determine if elderly drivers are involved in more motor vehicle fatalities than all other drivers. The number of fatalities per 100,000 drivers is compared to the age of drivers.
b. A study is done to determine if the weekly grocery bill changes based on the number of family members.
c. Insurance companies base life insurance premiums partially on the age of the applicant.
d. Utility bills vary according to power consumption.
e. A study is done to determine if a higher education reduces the crime rate in a population.
a. Independent: Age; Dependent: Fatalities
d. Independent: Power Consumption; Dependent: Utility bill
For any prediction questions, the answers are calculated using the least squares (best fit) line equation cited in the solution.
a. For each age group, pick the midpoint of the interval for the x value. (For the 75+ group, use 80.)
b. Using “ages” as the independent variable and “Number of driver deaths per 100,000” as the dependent variable, make a scatter plot of the data.
c. Calculate the least squares (best-fit) line. Put the equation in the form of: ŷ = a + bx
d. Find the correlation coefficient. Is it significant?
e. Pick two ages and find the estimated fatality rates.
f. Use the two points in (e) to plot the least squares line on your graph from (b).
g. Based on the above data, is there a linear relationship between age of a driver and driver fatality rate?
h. What is the slope of the least squares (best-fit) line? Interpret the slope.
The average number of people in a family that received welfare for various years is given below. (Source: House Ways and Means Committee, Health and Human Services Department)
Year    Welfare family size
1969    4.0
1973    3.6
1975    3.2
1979    3.0
1983    3.0
1988    3.0
1991    2.9
a. Using “year” as the independent variable and “welfare family size” as the dependent variable, make a scatter plot of the data.
b. Calculate the least squares line. Put the equation in the form of: ŷ = a + bx
c. Find the correlation coefficient. Is it significant?
d. Pick two years between 1969 and 1991 and find the estimated welfare family sizes.
e. Use the two points in (d) to plot the least squares line on your graph from (a).
f. Based on the above data, is there a linear relationship between the year and the average number of people in a welfare family?
g. Using the least squares line, estimate the welfare family sizes for 1960 and 1995. Does the least squares line give an accurate estimate for those years? Explain why or why not.
h. Are there any outliers in the above data?
i. What is the estimated average welfare family size for 1986? Does the least squares line give an accurate estimate for that year? Explain why or why not.
j. What is the slope of the least squares (best-fit) line? Interpret the slope.
c. -0.8533, Yes
g. No
h. No
i. 2.93, Yes
j. slope = -0.0432. As the year increases by one, the welfare family size tends to decrease by 0.0432 people.
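The answers to the welfare family size problem can be checked with a short calculation. The following is a minimal Python sketch, not the textbook's own method: the names `fit_line`, `predict`, and `R_CRITICAL` are hypothetical, and the 5% significance level is an assumption, since the problem does not state one. The sketch fits ŷ = a + bx from the table, computes the correlation coefficient, and estimates the family size for 1986. With the slope and intercept rounded to four decimal places the 1986 estimate comes out at about 2.93, matching the answer given; with unrounded coefficients it is about 2.97.

```python
from math import sqrt

# Data from the welfare family size table.
years = [1969, 1973, 1975, 1979, 1983, 1988, 1991]
sizes = [4.0, 3.6, 3.2, 3.0, 3.0, 3.0, 2.9]


def fit_line(x, y):
    """Return (intercept a, slope b, correlation r) for y_hat = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r


def predict(year, intercept, slope):
    """Estimated welfare family size for a given year."""
    return intercept + slope * year


a, b, r = fit_line(years, sizes)

# Assumption: two-tailed test at the 5% level. With n = 7 points there are
# n - 2 = 5 degrees of freedom, for which the critical value of r is 0.754.
R_CRITICAL = 0.754
significant = abs(r) > R_CRITICAL

# 1986 lies inside the observed range of years (interpolation).
est_1986_full = predict(1986, a, b)
est_1986_rounded = predict(1986, round(a, 4), round(b, 4))

# 1960 and 1995 lie outside the observed range (extrapolation), so these
# estimates should not be trusted.
est_1960 = predict(1960, a, b)
est_1995 = predict(1995, a, b)

print(f"y_hat = {a:.4f} + ({b:.4f})x")
print(f"r = {r:.4f}, significant at the 5% level: {significant}")
print(f"1986 estimate, unrounded coefficients: {est_1986_full:.2f}")
print(f"1986 estimate, rounded coefficients:   {est_1986_rounded:.2f}")
print(f"1960 and 1995 (extrapolated): {est_1960:.2f}, {est_1995:.2f}")
```

The 1960 and 1995 values are computed only to show what the line returns outside 1969 to 1991; the fitted line describes the observed years and says nothing reliable about years beyond them.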