Chapter 7 - Scatterplots, Association, and Correlation


Tuesday, 20 September 2022

Scatterplot

A scatterplot displays a set of ordered pairs. It is a common way of displaying two-variable data.

Scatterplots can show different directions and varying degrees of linearity.

  • Random scatter - no pattern can be determined in the data
  • Positive relationship - Y tends to increase as X increases (positive slope)
  • Negative relationship - Y tends to decrease as X increases (negative slope)

Important Ideas

  • Association

    • linear
    • non-linear
  • Correlation - only use correlation if there is a linear association

    • positive
    • negative
    • constant
  • Strength - how "strong" or tight the linear correlation is

    • strong
    • moderate to strong
    • moderate
    • weak to moderate
    • weak
  • Unusual Features

    • outlier point
    • influential point
    • high-leverage point

Using your calculator to create scatterplots

Enter the data into the calculator's data lists (customarily, X goes in $L_1$ and Y goes in $L_2$).

We refer to X as the explanatory variable and Y as the response variable.

Correlation

  • Only applies to quantitative variables
  • Does your scatterplot look "approximately linear"?
  • Report correlation values both with and without outliers to see if they make a big difference
  • $r$ = correlation coefficient

    • $-1 \le r \le 1$
    • An r-value of $\pm 1$ means the points fall exactly on a line
    • An r-value of $0$ means there is no linear association, as in random scatter (a curved pattern can also give $r = 0$)
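
As a rough illustration of the points above, the sketch below computes $r$ as the sum of the products of the z-scores divided by $n - 1$, then compares the value with and without a suspected outlier. The data values and the function name `correlation` are made up for this example and do not come from the notes.

```python
from math import sqrt


def correlation(xs, ys):
    """Correlation coefficient r: sum of z-score products divided by (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (n - 1 in the denominator)
    sx = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sy = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    z_products = (((x - mean_x) / sx) * ((y - mean_y) / sy)
                  for x, y in zip(xs, ys))
    return sum(z_products) / (n - 1)


# Hypothetical data: the last point sits far below the linear trend.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 2.0]

r_with = correlation(x, y)                # all six points
r_without = correlation(x[:-1], y[:-1])   # suspected outlier removed

print(f"r with the outlier:    {r_with:.3f}")
print(f"r without the outlier: {r_without:.3f}")

# A perfectly curved pattern can still have r close to 0,
# which is why r should only be used for linear associations.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [v ** 2 for v in curve_x]
print(f"r for y = x^2:         {correlation(curve_x, curve_y):.3f}")
```

With this made-up data, dropping the one unusual point changes $r$ considerably, which is the reason for reporting both values.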

Interpreting Computer Output Example

Variable    Coefficient
Intercept   5773.27
Wins        517.609

For each additional win, the predicted attendance increases by about 517.609 people.

If the team won 0 games, the predicted attendance would be about 5773.27 people.
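
The two interpretations can be checked with a short sketch, assuming the output describes a linear model of the form predicted attendance = intercept + slope × wins. The function name `predicted_attendance` is made up for this example; only the two coefficients come from the output above.

```python
# Coefficients taken from the computer output above.
INTERCEPT = 5773.27   # predicted attendance at 0 wins
SLOPE = 517.609       # change in predicted attendance per additional win


def predicted_attendance(wins):
    """Predicted attendance under the assumed linear model."""
    return INTERCEPT + SLOPE * wins


# Intercept: the prediction when wins = 0.
print(predicted_attendance(0))

# Slope: the difference between predictions one win apart.
print(predicted_attendance(11) - predicted_attendance(10))
```

The intercept is the prediction at 0 wins, and the slope is the gap between any two predictions that are one win apart.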