A double entry table, sometimes called a correlation table or a correlation matrix, represents a 2-variable distribution and makes it possible to examine a possible correlation between these 2 variables. This type of table can be constructed for any type of variable, whether qualitative or quantitative.
The steps to follow to construct a double entry table are as follows:
- Identify the independent variable and the dependent variable, if applicable.
- Separate the data into classes, if needed.
- Place the data or classes in the table headings.
- Compile the pairs of data and record their frequency in the table.
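The steps above can be sketched in Python. This is only a minimal illustration, not a prescribed method: the function name `double_entry_table` and the sample pairs (pet, type of home) are hypothetical and serve only to show how pairs are tallied and totalled.

```python
from collections import Counter

def double_entry_table(pairs, row_labels, col_labels):
    """Return {row: {col: frequency}} with a Total row and a Total column.

    Each pair is (row value, column value). Hypothetical helper for illustration.
    """
    counts = Counter(pairs)  # frequency of each (row, column) pair
    table = {}
    for r in row_labels:
        table[r] = {c: counts[(r, c)] for c in col_labels}
        table[r]["Total"] = sum(table[r].values())
    table["Total"] = {
        c: sum(table[r][c] for r in row_labels) for c in col_labels + ["Total"]
    }
    return table

# Hypothetical survey: (pet, type of home) for 6 people.
pairs = [("cat", "house"), ("dog", "house"), ("cat", "apartment"),
         ("cat", "apartment"), ("dog", "house"), ("cat", "house")]
table = double_entry_table(pairs, ["cat", "dog"], ["house", "apartment"])
for label, row in table.items():
    print(label, row)
```

The row and column labels are passed explicitly so that a combination that never occurs still appears in the table with a frequency of 0.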
The 30 students in a group are asked to note the colour of their eyes and hair. Since these colours are fairly standard, each student is asked to select the appropriate colours from the following:
- Hair colours: blond, chestnut, brown, black, or red
- Eye colours: blue, green, or brown
We get the following results:
(blond, blue); (blond, blue); (blond, green); (blond, brown); (chestnut, blue);
(chestnut, blue); (chestnut, blue); (chestnut, blue); (chestnut, blue); (chestnut, green);
(chestnut, brown); (chestnut, brown); (chestnut, brown); (chestnut, brown); (chestnut, brown);
(brown, blue); (brown, blue); (brown, green); (brown, green); (brown, brown);
(brown, brown); (brown, brown); (brown, brown); (brown, brown); (brown, brown);
(black, blue); (black, green); (black, brown); (black, brown); (red, green)
Construct the double entry table associated with this situation.
- Identify the independent variable and the dependent variable, if applicable
There is no obvious link between hair colour and eye colour. So, there is no dependent or independent variable.
- Separate the data into classes, if needed
Since both variables are qualitative, no classes have to be created to group the data.
- Place the data or classes in the table headings
| Eye Colour / Hair Colour | Blond | Chestnut | Brown | Black | Red | Total |
|---|---|---|---|---|---|---|
| Blue | | | | | | |
| Green | | | | | | |
| Brown | | | | | | |
| Total | | | | | | |
Note: A Total row and column can be added to help interpret the survey data. Also, the 1st row and 1st column can be interchanged.
- Compile the pairs of data and record their frequency in the table
| Eye Colour / Hair Colour | Blond | Chestnut | Brown | Black | Red | Total |
|---|---|---|---|---|---|---|
| Blue | 2 | 5 | 2 | 1 | 0 | **10** |
| Green | 1 | 1 | 2 | 1 | 1 | **6** |
| Brown | 1 | 5 | 6 | 2 | 0 | **14** |
| Total | **4** | **11** | **10** | **4** | **1** | **30** |
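As a check, the totals can be recomputed from the cell frequencies. Here is a minimal Python sketch, assuming the frequencies are copied by hand from the completed table above; the variable names are illustrative only.

```python
# Cell frequencies copied from the completed table: rows are eye colours,
# columns are hair colours, in the same order as the table headings.
hair = ["Blond", "Chestnut", "Brown", "Black", "Red"]
freq = {
    "Blue":  [2, 5, 2, 1, 0],
    "Green": [1, 1, 2, 1, 1],
    "Brown": [1, 5, 6, 2, 0],
}

# Total column: sum of each row.
row_totals = {eye: sum(cells) for eye, cells in freq.items()}
# Total row: sum of each column.
col_totals = {h: sum(cells[i] for cells in freq.values()) for i, h in enumerate(hair)}
grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(grand_total)
```

The row totals and the column totals must both add up to the number of students surveyed, which is a quick way to catch a pair that was missed or counted twice.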
Each day in May, the maximum temperature of the day in degrees Celsius and the amount of rainfall in millimetres are measured. Here are the results:
| Date | Temperature (°C) | Amount of rainfall (mm) | Date | Temperature (°C) | Amount of rainfall (mm) |
|---|---|---|---|---|---|
| May 1 | 10 | 0 | May 17 | 7 | 3 |
| May 2 | 12 | 1 | May 18 | 6 | 19 |
| May 3 | 16 | 0 | May 19 | 14 | 14 |
| May 4 | 16 | 0 | May 20 | 15 | 0 |
| May 5 | 16 | 12 | May 21 | 15 | 0 |
| May 6 | 15 | 5 | May 22 | 13 | 0 |
| May 7 | 12 | 0 | May 23 | 8 | 2 |
| May 8 | 10 | 2 | May 24 | 15 | 2 |
| May 9 | 13 | 1 | May 25 | 17 | 1 |
| May 10 | 14 | 0 | May 26 | 20 | 0 |
| May 11 | 12 | 6 | May 27 | 22 | 0 |
| May 12 | 13 | 0 | May 28 | 21 | 0 |
| May 13 | 11 | 1 | May 29 | 14 | 0 |
| May 14 | 14 | 0 | May 30 | 7 | 15 |
| May 15 | 16 | 0 | May 31 | 8 | 0 |
| May 16 | 5 | 1 | | | |
Construct the double entry table associated with this situation.
- Identify the independent variable and the dependent variable, if applicable
There is no obvious link between temperature and the amount of rainfall in a day. So, there is no dependent or independent variable.
- Separate the data into classes, if needed
Since the 2 variables are quantitative, we can create classes for each. For the temperature, the data vary between |5| and |22\ ^\circ \text{C}.| So, we can decide to create the following 4 classes: |[5, 10[,| |[10, 15[,| |[15, 20[| and |[20, 25[.| For the amount of rainfall, the data vary from 0 to 19 mm, so we can decide to create the following 4 classes: |[0, 5[,| |[5, 10[,| |[10, 15[| and |[15, 20[.|
- Place the data or classes in the table headings
| Amount of rainfall (mm) / Temperature (°C) | [5, 10[ | [10, 15[ | [15, 20[ | [20, 25[ | Total |
|---|---|---|---|---|---|
| [0, 5[ | | | | | |
| [5, 10[ | | | | | |
| [10, 15[ | | | | | |
| [15, 20[ | | | | | |
| Total | | | | | |
- Compile the pairs of data and record their frequency in the table
| Amount of rainfall (mm) / Temperature (°C) | [5, 10[ | [10, 15[ | [15, 20[ | [20, 25[ | Total |
|---|---|---|---|---|---|
| [0, 5[ | 4 | 11 | 7 | 3 | **25** |
| [5, 10[ | 0 | 1 | 1 | 0 | **2** |
| [10, 15[ | 0 | 1 | 1 | 0 | **2** |
| [15, 20[ | 2 | 0 | 0 | 0 | **2** |
| Total | **6** | **13** | **9** | **3** | **31** |
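The grouping into classes and the tally for this example can be reproduced with a short Python sketch. The helper `class_of` and the variable names are assumptions made for illustration; the data are the 31 (temperature, rainfall) readings from the table of results, and each class is treated as closed on the left and open on the right, as in |[5, 10[.|

```python
from collections import Counter

# (maximum temperature in °C, rainfall in mm) for May 1 to May 31.
days = [
    (10, 0), (12, 1), (16, 0), (16, 0), (16, 12), (15, 5), (12, 0), (10, 2),
    (13, 1), (14, 0), (12, 6), (13, 0), (11, 1), (14, 0), (16, 0), (5, 1),
    (7, 3), (6, 19), (14, 14), (15, 0), (15, 0), (13, 0), (8, 2), (15, 2),
    (17, 1), (20, 0), (22, 0), (21, 0), (14, 0), (7, 15), (8, 0),
]

def class_of(value, start, width):
    """Return the class [low, high[ containing value, as a (low, high) tuple."""
    low = start + ((value - start) // width) * width
    return (low, low + width)

# Temperature classes start at 5 °C and rainfall classes at 0 mm; both have width 5.
counts = Counter((class_of(rain, 0, 5), class_of(temp, 5, 5)) for temp, rain in days)

temp_classes = [(5, 10), (10, 15), (15, 20), (20, 25)]
rain_classes = [(0, 5), (5, 10), (10, 15), (15, 20)]
for r in rain_classes:
    row = [counts[(r, t)] for t in temp_classes]
    print(r, row, "total:", sum(row))
```

Each printed row lists the frequencies for one rainfall class across the 4 temperature classes, followed by the row total, so the output can be compared line by line with the completed table.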
A group of 30 cyclists were asked to record the weather conditions and the number of kilometres they travelled in a day. The survey produced the following pairs of responses:
(windy, 50); (sunny, 120); (sunny, 148); (cloudy, 42); (rainy, 0);
(rainy, 25); (windy, 43); (sunny, 114); (cloudy, 54); (rainy, 34);
(windy, 61); (cloudy, 69); (cloudy, 59); (cloudy, 71); (rainy, 32);
(windy, 54); (sunny, 109); (windy, 74); (rainy, 42); (cloudy, 72);
(sunny, 87); (windy, 122); (cloudy, 83); (sunny, 86); (rainy, 69);
(windy, 43); (cloudy, 0); (cloudy, 98); (sunny, 56); (rainy, 86)
Construct the double entry table associated with this situation.
- Identify the independent variable and the dependent variable, if applicable
In this situation, we can assume that weather conditions have an impact on the number of kilometres travelled by cyclists. Therefore, the independent variable is weather conditions and the dependent variable is the number of kilometres travelled.
- Separate the data into classes, if needed
For the weather conditions, there are no classes to create. However, it is possible to divide the number of kilometres travelled into the following classes: |[0, 30[,| |[30, 60[,| |[60, 90[,| |[90, 120[| and |[120, 150[.|
- Place the data or classes in the table headings
Typically, the independent variable is placed in the 1st column and the dependent variable in the 1st row.
| Weather conditions / Number of Kilometres Travelled | [0, 30[ | [30, 60[ | [60, 90[ | [90, 120[ | [120, 150[ | Total |
|---|---|---|---|---|---|---|
| Rainy | | | | | | |
| Cloudy | | | | | | |
| Windy | | | | | | |
| Sunny | | | | | | |
| Total | | | | | | |
- Compile the pairs of data and record their frequency in the table
| Weather conditions / Number of Kilometres Travelled | [0, 30[ | [30, 60[ | [60, 90[ | [90, 120[ | [120, 150[ | Total |
|---|---|---|---|---|---|---|
| Rainy | 2 | 3 | 2 | 0 | 0 | **7** |
| Cloudy | 1 | 3 | 4 | 1 | 0 | **9** |
| Windy | 0 | 4 | 2 | 0 | 1 | **7** |
| Sunny | 0 | 1 | 2 | 2 | 2 | **7** |
| Total | **3** | **11** | **10** | **3** | **3** | **30** |
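When one variable is qualitative and the other quantitative, only the quantitative one needs to be grouped. The sketch below, with illustrative variable names, tallies the 30 survey responses; it assumes classes of width 30 starting at 0, so integer division of the distance by 30 gives the index of its class.

```python
from collections import Counter

# (weather conditions, kilometres travelled) for the 30 cyclists.
rides = [
    ("windy", 50), ("sunny", 120), ("sunny", 148), ("cloudy", 42), ("rainy", 0),
    ("rainy", 25), ("windy", 43), ("sunny", 114), ("cloudy", 54), ("rainy", 34),
    ("windy", 61), ("cloudy", 69), ("cloudy", 59), ("cloudy", 71), ("rainy", 32),
    ("windy", 54), ("sunny", 109), ("windy", 74), ("rainy", 42), ("cloudy", 72),
    ("sunny", 87), ("windy", 122), ("cloudy", 83), ("sunny", 86), ("rainy", 69),
    ("windy", 43), ("cloudy", 0), ("cloudy", 98), ("sunny", 56), ("rainy", 86),
]

WIDTH = 30  # classes [0, 30[, [30, 60[, [60, 90[, [90, 120[, [120, 150[
# km // WIDTH is 0 for [0, 30[, 1 for [30, 60[, and so on up to 4 for [120, 150[.
counts = Counter((weather, km // WIDTH) for weather, km in rides)

for weather in ["rainy", "cloudy", "windy", "sunny"]:
    row = [counts[(weather, k)] for k in range(5)]
    print(f"{weather:<7}", row, "total:", sum(row))
```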
With these tables, it is possible to calculate conditional probabilities and relative frequencies. We can also use them to estimate the correlation between the 2 variables when the classes of both variables are ordered: the stronger the correlation, the more the data will cluster around one of the double entry table's diagonals.
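As an illustration of these calculations, here is a minimal Python sketch based on the frequencies of the cyclists' table. The variable names are illustrative; a relative frequency divides a cell by the grand total, while a conditional probability divides it by the total of the row or column that is taken as given.

```python
from fractions import Fraction

# Frequencies from the cyclists' table; columns are the distance classes
# [0, 30[, [30, 60[, [60, 90[, [90, 120[, [120, 150[ in that order.
table = {
    "Rainy":  [2, 3, 2, 0, 0],
    "Cloudy": [1, 3, 4, 1, 0],
    "Windy":  [0, 4, 2, 0, 1],
    "Sunny":  [0, 1, 2, 2, 2],
}
n = sum(sum(row) for row in table.values())  # number of cyclists surveyed

# Relative frequency: share of all cyclists who rode [60, 90[ km on a cloudy day.
rel_freq = Fraction(table["Cloudy"][2], n)

# Conditional probability given a row: P([90, 120[ km | sunny).
p_given_sunny = Fraction(table["Sunny"][3], sum(table["Sunny"]))

# Conditional probability given a column: P(rainy | [0, 30[ km).
col_total = sum(row[0] for row in table.values())
p_rainy_given_short = Fraction(table["Rainy"][0], col_total)

print(rel_freq, p_given_sunny, p_rainy_given_short)
```

Using `Fraction` keeps the results exact, which makes it easy to see which total was used as the denominator in each case.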