Content code
m1377
Slug (identifier)
linear-correlation-coefficient
Grades
Secondary IV
Topic
Mathematics
Tags
linear correlation
correlation coefficient
correlation
linear regression
regression line
coefficient
scatter plot
double-entry table
coefficient calculation
linear correlation coefficient
correlation coefficient formula
Content
Contenu
Corps

One use of a scatter plot is to predict future results. To quantify how reliable these predictions are, we calculate the linear correlation coefficient.

Links
Content
Corps

The linear correlation coefficient, generally denoted by |r|, quantifies the strength of the linear relationship between the two variables of a distribution. It can be estimated from a graph or calculated with a mathematical formula.

The correlation coefficient always takes a value in the interval |[-1, 1]|.

Corps

The linear correlation coefficient of a distribution gives an idea of how the scatter plot looks, and vice versa. First, the sign of the coefficient, positive or negative, indicates the direction of the slope of the regression line. To help interpret its magnitude, here are three scatter plots that illustrate the reference values |-1|, |0| and |1|.

Columns number
3 columns
Format
33% / 33% / 33%
First column
Image
This graph shows a scatter plot whose linear correlation is perfect and negative.
Second column
Image
This graph shows a scatter plot with zero linear correlation.
Third column
Image
This graph shows a scatter plot whose linear correlation is perfect and positive.
Corps

In other words, the closer the value of the linear correlation coefficient is to |1| or |-1|, the stronger the linear relationship between the two variables.

Conversely, the closer the value is to |0|, the weaker the linear relationship between the two variables.

Title (level 2)
Qualitative Assessment According to a Scatter Plot
Title slug (identifier)
qualitative-assessment-scatter-plot
Contenu
Corps

To obtain a numerical value for |r|, estimate it from a graph or calculate it with a formula. However, to simply compare the strength of the linear correlation in one scatter plot with that in another, it is enough to look at how closely the points are aligned.

Content
Columns number
2 columns
Format
50% / 50%
First column
Image
This graph shows a scatter plot with a strong and positive linear correlation.
Second column
Image
This graph shows a scatter plot with a moderate and positive linear correlation.
Corps

Looking closely at these graphs, the points are more dispersed in the second scatter plot. Thus, the linear correlation coefficient is lower in this plot than in the first.

The difference between correlation coefficients can be seen clearly in the following scatter plots.

Corps

Negative Linear Correlations

Columns number
3 columns
Format
33% / 33% / 33%
First column
Image
This graph shows a scatter plot with a strong and negative linear correlation.
Second column
Image
This graph shows a scatter plot with a negative and moderate linear correlation.
Third column
Image
This graph shows a scatter plot with a negative and weak linear correlation.
Corps

Positive Linear Correlations

Columns number
3 columns
Format
33% / 33% / 33%
First column
Image
This graph shows a scatter plot with a strong and positive linear correlation.
Second column
Image
This graph shows a scatter plot with a moderate and positive linear correlation.
Third column
Image
This graph shows a scatter plot with a weak and positive linear correlation.
Corps

As the correlation coefficient gets closer to |0|, the points of the scatter plot become increasingly dispersed. As long as a trend remains visible, it is still possible to find the direction of the scatter plot (positive or negative). When the points are so widely dispersed that it becomes impossible to determine their direction, the linear correlation coefficient is zero.

Title (level 2)
Qualitative Assessment Using a Double Entry (Two-Variable) Table
Title slug (identifier)
qualitative-assessment-table
Contenu
Corps

To simplify the visual representation of the collected data, the data is sometimes grouped into classes and placed in a double entry (two-variable) table. 

Content
Corps

To go from a scatter plot to a double entry (two-variable) table, segment the scatter plot in order to clearly define each of the classes.

Columns number
2 columns
Format
50% / 50%
First column
Corps

So, this scatter plot...

Image
This image represents a scatter plot whose correlation is positive and strong.
Second column
Corps

... becomes the following double entry (two-variable) table.

Image
This image shows a double-entry table with a strong and positive correlation, since the data is clustered near the diagonal.
Corps

Once this table is obtained, it is possible to assess the correlation of the data qualitatively.

Content
Columns number
2 columns
Format
50% / 50%
First column
Corps

According to the previous double entry (two-variable) table, the correlation is strong and positive.

It is positive because, as the values of |x| increase, the values of |y| also increase.

It is strong because the data is grouped near the diagonal of the double-entry table.

Second column
Image
This image shows a double-entry (two-variable) table with a strong and positive correlation, because the data is clustered near the diagonal.
Corps

Note: if the data clusters around the other diagonal, i.e., the diagonal that starts at the bottom left and ends at the top right, then the correlation will be negative. 
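As a rough illustration of the grouping described in this section, the following Python sketch counts how many points fall into each class of a double entry table. The function name `two_way_table`, the data and the class boundaries are all invented for this example. Rows list the |y| classes in increasing order from top to bottom, which is assumed to match the table layout referred to in the note above.

```python
from collections import Counter


def two_way_table(points, x_edges, y_edges):
    """Count the points that fall in each (y class, x class) cell.

    Classes are half-open intervals [edge_i, edge_{i+1}).
    Returns one row per y class; each row holds one count per x class.
    """
    def class_index(value, edges):
        for i in range(len(edges) - 1):
            if edges[i] <= value < edges[i + 1]:
                return i
        raise ValueError(f"{value} is outside the classes")

    counts = Counter(
        (class_index(y, y_edges), class_index(x, x_edges)) for x, y in points
    )
    return [
        [counts[(row, col)] for col in range(len(x_edges) - 1)]
        for row in range(len(y_edges) - 1)
    ]


# Hypothetical data, invented for illustration only.
points = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5), (7, 8), (8, 7), (9, 9)]
edges = [0, 3, 6, 10]  # three classes: [0, 3), [3, 6), [6, 10)

table = two_way_table(points, edges, edges)
for row in table:
    print(row)
# [2, 0, 0]
# [0, 2, 1]
# [0, 1, 3]
```

In this made-up example, most counts sit on the diagonal running from the top left to the bottom right, which is the pattern described above for a strong and positive correlation.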

Title (level 2)
Calculating the Linear Correlation Coefficient
Title slug (identifier)
calculating-linear-correlation-coefficient
Contenu
Links
Corps

By determining more precisely the value of the linear correlation coefficient, it is easier to quantify the correlation between two variables.

Content
Corps

||r\approx\pm\left(1-\dfrac{w}{L}\right)||

where

|L\!:| the length of the rectangle outlining the scatter plot
|w\!:| the width of the rectangle outlining the scatter plot

As for the sign of |r|, it is determined according to the direction of the scatter plot.

Corps

In general, this formula gives a fairly representative estimate of the linear correlation coefficient. However, more sophisticated tools can calculate its value accurately.

Generally, the following values will be used to qualify the linear correlation.

Value of |r|            Strength of the linear relationship
Close to |0|            None
Near |\pm\, 0{.}50|     Weak
Near |\pm\, 0{.}75|     Moderate
Near |\pm\, 0{.}87|     Strong
Near |\pm\, 1|          Very strong
|\pm\, 1|               Perfect
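The table above can be turned into a small lookup. The Python sketch below is only one possible reading: it assumes that "near" means "closest reference value", a rule the table itself does not state, and the function name `describe_strength` is hypothetical.

```python
def describe_strength(r):
    """Return the qualifier whose reference value is closest to |r|."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie in the interval [-1, 1]")
    magnitude = abs(r)
    if magnitude == 1:
        return "Perfect"
    # Reference values taken from the table; "Near 1" is represented by 1.
    references = [
        (0.0, "None"),
        (0.50, "Weak"),
        (0.75, "Moderate"),
        (0.87, "Strong"),
        (1.0, "Very strong"),
    ]
    # Assumption: "near" is read as "nearest reference value".
    return min(references, key=lambda ref: abs(ref[0] - magnitude))[1]


print(describe_strength(-0.9))   # Strong
print(describe_strength(0.1))    # None
print(describe_strength(0.97))   # Very strong
```

The sign of |r| is ignored here because the qualifier only describes strength; the direction (positive or negative) is read separately from the sign.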

Title (level 3)
Calculating the Linear Correlation Coefficient From a Graph
Title slug (identifier)
calculating-the-coefficient-graph
Corps

To associate a numerical value with the correlation coefficient, follow these 3 steps.

Content
Corps
  1. Draw the scatter plot.

  2. Draw a rectangle and measure its length and width.

  3. Calculate the correlation coefficient using the formula.

Content
Columns number
2 columns
Format
50% / 50%
First column
Corps
  1. Draw the scatter plot

    By placing each of the points in a Cartesian plane, the following scatter plot is obtained.

Second column
Image
This graph shows a scatter plot with a moderate and positive linear correlation.
Columns number
2 columns
Format
50% / 50%
First column
Corps
  2. Draw a rectangle and measure its length and width

    The rectangle must contain every point and be as small as possible. When drawing the rectangle, use a set square, then measure its sides.

    Since there are no outliers (abnormal data), the following rectangle is obtained.

Second column
Image
The graph shows a scatter plot with a positive correlation, outlined by a rectangle.
Columns number
2 columns
Format
50% / 50%
First column
Corps
  3. Calculate the correlation coefficient using the formula

Second column
Corps

|r \approx \pm \left(1 - \dfrac{2{.}4}{6{.}2} \right)|
|r \approx \pm 0{.}61|
|r \approx 0{.}61|, since the direction of the scatter plot is positive.
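The same arithmetic can be checked with a few lines of Python. This is a minimal sketch of the rectangle formula |r \approx \pm\left(1 - \dfrac{w}{L}\right)|; the function name `estimate_r` and its `positive` parameter are hypothetical, and the measurements are the ones used in the example above.

```python
def estimate_r(length, width, positive=True):
    """Estimate r as +/-(1 - w / L) from the rectangle outlining the scatter plot."""
    if length <= 0 or width < 0 or width > length:
        raise ValueError("expected length > 0 and 0 <= width <= length")
    magnitude = 1 - width / length
    # The sign is not computed: it is read from the direction of the scatter plot.
    return magnitude if positive else -magnitude


r = estimate_r(length=6.2, width=2.4, positive=True)
print(round(r, 2))  # 0.61
```

The sign is passed in rather than calculated because the formula only gives the magnitude; the direction still has to be read from the graph.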

Title (level 3)
Calculating the Linear Correlation Coefficient Using Technological Tools
Title slug (identifier)
calculating-the-coefficient-algebraic-method
Corps

With graphing calculators or software such as spreadsheets, a much more precise correlation coefficient can be obtained. Just enter all the data in a table of values, select the correct function, and let the software do the calculations.

Content
Corps

The formula for precisely calculating the linear correlation coefficient |r| is the following. ||r=\dfrac{\sum\left(x-\overline{x}\right)\left(y-\overline{y}\right)}{\sqrt{\sum\left(x-\overline{x}\right)^{2}}\sqrt{\sum\left(y-\overline{y}\right)^{2}}}||

where

|x\!:| a value in the first distribution
|\overline{x}\!:| the mean of the first distribution
|y\!:| a value in the second distribution
|\overline{y}\!:| the mean of the second distribution
|\sum\!:| symbol that signifies the sum of...
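To show how the formula above is applied term by term, here is a minimal Python sketch. The function name `correlation_coefficient` and the data are invented for illustration; a calculator or spreadsheet may organize the calculation differently.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Compute r from the sums of deviations about the two means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of the products of the deviations.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: product of the square roots of the summed squared deviations.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / (spread_x * spread_y)


# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation_coefficient(xs, ys)
print(round(r, 2))  # 0.77
```

With these values, the means are 3 and 4, the numerator is 6 and the two sums of squares are 10 and 6, so |r = \dfrac{6}{\sqrt{10}\sqrt{6}} \approx 0{.}77|. In Python 3.10 and later, `statistics.correlation` from the standard library returns the same coefficient.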

Title (level 2)
See Also
Title slug (identifier)
see-also
Contenu
Links
Remove audio playback
No
Printable tool
Off