Table of Contents

Chapter 8 - Linear Regression


Thursday, 22 September 2022
1-minute read
173 words

Line of Best Fit

\[ \hat{y} = a + bx \]

Points on the line of best fit are predicted y-values, which is why $\hat{y}$ notation is used.

The value of $r^2$ is the percent of the data's variation in the y-value that is accounted for by the variation of the x-value.

Example

Explanatory variable: miles traveled in a car

Response variable: gallons of gasoline used

R-squared value: 0.87

This means 87% of the change of gallons can be attributed to the change in miles traveled in a car.

Residuals

Residuals show how much individual data points differ from their corresponding predicted values.

Their values can either be negative or positive.

\[ r = y - y_0 \]

If the residual plot is randomly scattered, then the data is linear. Otherwise, a pattern similar to higher degree functions may indicate a curve in the data.

92.2% of the change in calories can be attributed to its linear relationship with the change in the grams of fat. 8% can be attributed to unknown factors.