To analyze the dispersion of the data in a distribution, the data can be separated into |4| equal subgroups called quarters. These quarters are separated by quartiles.
- In a data distribution arranged in ascending order, the quarters correspond to the |4| subgroups of the distribution that each contain the same amount of data.
- In a data distribution placed in ascending order, the quartiles are the |3| values that separate the distribution into |4| equal quarters.
- The 1st quartile, denoted |\boldsymbol{Q_1},| is the value that separates the first quarter from the rest of the distribution.
- The 2nd quartile, denoted |\boldsymbol{Q_2},| is the value that separates the distribution into |2| equal parts. In other words, it is the median.
- The 3rd quartile, denoted |\boldsymbol{Q_3},| is the value that separates the last quarter from the rest of the distribution.
Each quarter contains about |25\%| of the data in the distribution. This means that about |25\%| of the data is less than the 1st quartile, about |50\%| of the data is less than the 2nd quartile, and about |75\%| of the data is less than the 3rd quartile.
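The percentages above can be checked on a small case. Here is a minimal sketch in Python, assuming a hypothetical data set of |8| values (not taken from this lesson) whose quartiles fall between data values. The helper name `share_below` is made up for this illustration.

```python
# Hypothetical data set: 8 values already placed in ascending order.
data = [1, 2, 3, 4, 5, 6, 7, 8]

# Quartiles obtained by separating the data into 4 quarters of 2 values each.
q1, q2, q3 = 2.5, 4.5, 6.5

def share_below(values, threshold):
    """Return the fraction of values that are strictly less than threshold."""
    return sum(1 for v in values if v < threshold) / len(values)

shares = [share_below(data, q) for q in (q1, q2, q3)]
print(shares)  # [0.25, 0.5, 0.75]
```

With real data, ties and quartiles that coincide with data values can make these fractions only approximately |25\%,| |50\%| and |75\%.|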
Here is how to determine the quartiles of a data distribution.
- Place the data in ascending order.
- Separate the data distribution into |4| equal quarters.
- Determine the value of the quartiles.
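The steps above can be sketched in Python. This is only one possible way to write them, under the assumption that the median is found first and, when the number of data values is odd, is left out of both halves. The function name `quartiles` is chosen for this illustration; `median` comes from Python's standard `statistics` module.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) for a data set with at least 2 values.

    Assumption: when the number of values is odd, the median itself
    belongs to neither half.
    """
    ordered = sorted(data)        # Step 1: place the data in ascending order
    n = len(ordered)
    half = n // 2                 # number of values on each side of the median
    lower = ordered[:half]        # values before the median position
    upper = ordered[n - half:]    # values after the median position
    # Steps 2 and 3: the median splits the data in two, then each half is
    # split again by its own median.
    return median(lower), median(ordered), median(upper)

print(quartiles([4, 3, 1, 2]))     # (1.5, 2.5, 3.5)
print(quartiles([5, 1, 4, 2, 3]))  # (1.5, 3, 4.5)
```

Statistical software may use other conventions (for example, interpolation between positions), so results from a library function can differ slightly from this approach.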
The quartiles are not necessarily values that are actually part of the distribution. Whether they are depends on the total number of data values. There are |2| different cases when the number of data values is even and |2| different cases when it is odd.
Even Number of Data
||\begin{alignat}{20}&&&\!\!\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2}}}\\1\ &\color{#3b87cd}{\Big\vert}\ 2\ &&\color{#ec0000}{\Big\vert}\ 3\ &&\color{#7cca51}{\Big\vert}\ 4\\&\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1}}}&&&&\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3}}}\end{alignat}||
In this case, the |3| quartiles are not data values in the distribution.
||\begin{alignat}{20}&&&\!\!\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2}}}\\1\ &\color{#3b87cd}{\boxed{\boldsymbol{2}}}\ 3\ &&\color{#ec0000}{\Big\vert}\ 4\ &&\color{#7cca51}{\boxed{\boldsymbol{5}}}\ 6\\&\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1}}}&&&&\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3}}}\end{alignat}||
In this case, |Q_2| is not a data value in the distribution, but |Q_1| and |Q_3| are data values in the distribution.
Odd Number of Data
||\begin{alignat}{20}&&&\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2}}}\\1\ &\color{#3b87cd}{\Big\vert}\ 2\ &&\color{#ec0000}{\boxed{\boldsymbol{3}}}\ 4\ &&\color{#7cca51}{\Big\vert}\ 5\\&\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1}}}&&&&\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3}}}\end{alignat}||
In this case, |Q_2| is a data value in the distribution, while |Q_1| and |Q_3| are not data values in the distribution.
||\begin{alignat}{20}&&&\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2}}}\\1\ &\color{#3b87cd}{\boxed{\boldsymbol{2}}}\ 3\ &&\color{#ec0000}{\boxed{\boldsymbol{4}}}\ 5\ &&\color{#7cca51}{\boxed{\boldsymbol{6}}}\ 7\\&\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1}}}&&&&\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3}}}\end{alignat}||
In this case, the |3| quartiles are data values in the distribution.
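The |4| cases illustrated above can be told apart from the number of data values alone. Here is a small sketch, assuming the same way of separating the data as in the diagrams and at least |4| data values. The function name `quartile_case` is hypothetical.

```python
def quartile_case(n):
    """Tell which quartiles are data values for n ordered data values.

    Returns (q2_is_value, q1_q3_are_values). Assumes n >= 4 and that the
    median is left out of both halves when n is odd.
    """
    half = n // 2                     # number of values on each side of the median
    q2_is_value = n % 2 == 1          # odd total: the median is the centre value
    q1_q3_are_values = half % 2 == 1  # odd halves: Q1 and Q3 are centre values too
    return q2_is_value, q1_q3_are_values

for n in (4, 6, 5, 7):
    print(n, quartile_case(n))
# 4 (False, False)
# 6 (False, True)
# 5 (True, False)
# 7 (True, True)
```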
It is important to first determine the value of the median |(Q_2)| before determining the values of |Q_1| and |Q_3.|
Here is an example where the total number of data is odd.
Determine the value of the |3| quartiles in the following distribution:
|4,| |9,| |2,| |5,| |10,| |2,| |7,| |6,| |9,| |1,| |3,| |5,| |6|
- Place the data in ascending order
|1,| |2,| |2,| |3,| |4,| |5,| |5,| |6,| |6,| |7,| |9,| |9,| |10|
- Separate the data distribution into |\boldsymbol{4}| equal quarters
This distribution has an odd number of data values |(13).| Therefore, |Q_2| is the centre data value of the distribution, which separates it into |2| subgroups of |6| data values. |Q_1| and |Q_3| are thus located between data values, creating |4| quarters containing |3| data values each.
||\begin{alignat}{20}&&&\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2}}}\\1,2,2\ &\color{#3b87cd}{\Big\vert}\ 3,4,5\ &&\color{#ec0000}{\boxed{\boldsymbol{5}}}\ 6,6,7\ &&\color{#7cca51}{\Big\vert}\ 9,9,10\\&\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1}}}&&&&\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3}}}\end{alignat}||
- Determine the value of the quartiles
We start by determining the value of the median |(Q_2),| which corresponds to the 7th data value.
||\boldsymbol{\color{#ec0000}{Q_2}}=\boldsymbol{\color{#ec0000}{5}}||
Next, we calculate the 1st quartile |(Q_1),| which corresponds to the mean (average) of the 3rd and 4th data values.
||\begin{alignat}{20}1,2,\boldsymbol{2}\ &\color{#3b87cd}{\Big\vert}\ \boldsymbol{3},4,5\ \color{#ec0000}{\boxed{\boldsymbol{5}}}\ 6,6,7\ \color{#7cca51}{\Big\vert}\ 9,9,10\\&\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1}}}\end{alignat}||
||\boldsymbol{\color{#3b87cd}{Q_1}}=\dfrac{2+3}{2}=\boldsymbol{\color{#3b87cd}{2.5}}||
Finally, we calculate the 3rd quartile |(Q_3),| which corresponds to the mean of the 10th and 11th data values.
||\begin{alignat}{20}1,2,2\ \color{#3b87cd}{\Big\vert}\ 3,4,5\ \color{#ec0000}{\boxed{\boldsymbol{5}}}\ 6,6,\boldsymbol{7}\ &&&&&\color{#7cca51}{\Big\vert}\ \boldsymbol{9},9,10\\&&&&&\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3}}}\end{alignat}||
||\boldsymbol{\color{#7cca51}{Q_3}}=\dfrac{7+9}{2}=\boldsymbol{\color{#7cca51}{8}}||
Answer: The 1st quartile of the distribution is |2.5,| the 2nd quartile is |5,| and the 3rd quartile is |8.|
||\begin{alignat}{20}&&&\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#ec0000}{\underbrace{\phantom{|}Q_2=5\phantom{|}}}}\\1,2,2\ &\color{#3b87cd}{\Big\vert}\ 3,4,5\ &&\color{#ec0000}{\boxed{\boldsymbol{5}}}\ 6,6,7\ &&\color{#7cca51}{\Big\vert}\ 9,9,10\\&\!\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1=2.5}}}&&&&\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{\phantom{|}Q_3=8\phantom{|}}}}\end{alignat}||
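The same calculation can be reproduced by position in Python. This sketch simply follows the positions named in the example (7th value, 3rd and 4th values, 10th and 11th values). The variable names are chosen for this illustration.

```python
data = [4, 9, 2, 5, 10, 2, 7, 6, 9, 1, 3, 5, 6]

ordered = sorted(data)  # ascending order, 13 values
print(ordered)          # [1, 2, 2, 3, 4, 5, 5, 6, 6, 7, 9, 9, 10]

# Python lists start at index 0, so the k-th data value is ordered[k - 1].
q2 = ordered[7 - 1]                           # 7th data value
q1 = (ordered[3 - 1] + ordered[4 - 1]) / 2    # mean of the 3rd and 4th values
q3 = (ordered[10 - 1] + ordered[11 - 1]) / 2  # mean of the 10th and 11th values

print(q1, q2, q3)  # 2.5 5 8.0
```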
Here is an example where the total number of data is even.
Determine the value of the |3| quartiles in the following distribution:
|60,| |32,| |87,| |98,| |56,| |75,| |35,| |68,| |86,| |90,| |75,| |59,| |61,| |84,| |64,| |48|
- Place the data in ascending order
|32,| |35,| |48,| |56,| |59,| |60,| |61,| |64,| |68,| |75,| |75,| |84,| |86,| |87,| |90,| |98|
- Separate the data distribution into |\boldsymbol{4}| equal quarters
This distribution has an even number of data values |(16).| Therefore, |Q_2| is located between the |2| data values at the centre of the distribution and separates it into |2| subgroups of |8| data values. |Q_1| and |Q_3| are therefore also located between data values, so as to create |4| quarters containing |4| data values each.
||\begin{alignat}{20}&&&\!\!\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2}}}\\32,35,48,56\ &\color{#3b87cd}{\Big\vert}\ 59,60,61,64\ &&\color{#ec0000}{\Big\vert}\ 68,75,75,84\ &&\color{#7cca51}{\Big\vert}\ 86,87,90,98\\&\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1}}}&&&&\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3}}}\end{alignat}||
- Determine the value of the quartiles
First, we determine the value of the median |(Q_2),| which is the mean (average) of the 8th and 9th data values.
||\begin{alignat}{20}&\!\!\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2}}}\\32,35,48,56\ \color{#3b87cd}{\Big\vert}\ 59,60,61,\boldsymbol{64}\ &\color{#ec0000}{\Big\vert}\ \boldsymbol{68},75,75,84\ \color{#7cca51}{\Big\vert}\ 86,87,90,98\\\phantom{\boldsymbol{\overbrace{Q_1}}}\end{alignat}||
||\boldsymbol{\color{#ec0000}{Q_2}}=\dfrac{64+68}{2}=\boldsymbol{\color{#ec0000}{66}}||
Next, we calculate the 1st quartile |(Q_1),| which corresponds to the mean of the 4th and 5th data values.
||\begin{alignat}{20}32,35,48,\boldsymbol{56}\ &\color{#3b87cd}{\Big\vert}\ \boldsymbol{59},60,61,64\ \color{#ec0000}{\Big\vert}\ 68,75,75,84\ \color{#7cca51}{\Big\vert}\ 86,87,90,98\\&\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1}}}\end{alignat}||
||\boldsymbol{\color{#3b87cd}{Q_1}}=\dfrac{56+59}{2}=\boldsymbol{\color{#3b87cd}{57.5}}||
Finally, we calculate the 3rd quartile |(Q_3),| which corresponds to the mean of the 12th and 13th data values.
||\begin{alignat}{20}32,35,48,56\ \color{#3b87cd}{\Big\vert}\ 59,60,61,64\ \color{#ec0000}{\Big\vert}\ 68,75,75,\boldsymbol{84}\ &&&&&\color{#7cca51}{\Big\vert}\ \boldsymbol{86},87,90,98\\&&&&&\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3}}}\end{alignat}||
||\boldsymbol{\color{#7cca51}{Q_3}}=\dfrac{84+86}{2}=\boldsymbol{\color{#7cca51}{85}}||
Answer: The 1st quartile of the distribution is |57.5,| the 2nd quartile is |66,| and the 3rd quartile is |85.|
||\begin{alignat}{20}&&&\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2=66}}}\\32,35,48,56\ &\color{#3b87cd}{\Big\vert}\ 59,60,61,64\ &&\color{#ec0000}{\Big\vert}\ 68,75,75,84\ &&\color{#7cca51}{\Big\vert}\ 86,87,90,98\\&\!\!\!\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1=57.5}}}&&&&\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3=85}}}\end{alignat}||
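This example can also be sketched in Python by cutting the ordered data into |4| quarters and averaging the values on each side of a cut. This shortcut is an assumption that only holds when the number of data values is a multiple of |4,| as it is here. The variable names are chosen for this illustration.

```python
data = [60, 32, 87, 98, 56, 75, 35, 68, 86, 90, 75, 59, 61, 84, 64, 48]

ordered = sorted(data)    # ascending order, 16 values
size = len(ordered) // 4  # 4 data values per quarter

# Cut the ordered data into 4 consecutive quarters of equal size.
quarters = [ordered[i * size:(i + 1) * size] for i in range(4)]
print(quarters[0])  # [32, 35, 48, 56]

# Each quartile is the mean of the last value of one quarter
# and the first value of the next quarter.
q1, q2, q3 = [(quarters[i][-1] + quarters[i + 1][0]) / 2 for i in range(3)]
print(q1, q2, q3)  # 57.5 66.0 85.0
```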
It is possible to convert quartiles into percentiles and vice versa.
After determining the value of the quartiles, it is possible to analyze the dispersion of the data in a distribution. To do so, the interquartile range can be used.
The interquartile range, denoted |\boldsymbol{IR},| corresponds to the range between the 1st quartile |(Q_1)| and the 3rd quartile |(Q_3).|
The interquartile range represents the dispersion of the quarters on either side of the median |(Q_2).| In other words, it gives an idea of how concentrated the data at the centre of the distribution is. In a box and whisker plot, the interquartile range is the width of the box on the diagram.
To find the value of this range, use the following calculation:
||IR=Q_3-Q_1||
where
|IR:| interquartile range
|Q_1:| value of 1st quartile
|Q_3:| value of 3rd quartile
Determine the interquartile range of the following distribution.
||\begin{alignat}{20}&&&\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2=66}}}\\32,35,48,56\ &\color{#3b87cd}{\Big\vert}\ 59,60,61,64\ &&\color{#ec0000}{\Big\vert}\ 68,75,75,84\ &&\color{#7cca51}{\Big\vert}\ 86,87,90,98\\&\!\!\!\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1=57.5}}}&&&&\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3=85}}}\end{alignat}||
According to the formula, we calculate the following:
||\begin{align}IR&=Q_3-Q_1\\&=85-57.5\\&=27.5\end{align}||
In other words, the |50\%| of the data located around the median |(Q_2)| is concentrated within an interval of |27.5| units.
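Here is a short Python check of this result, using the quartile values found in the example. It computes |IR| and then counts how much of the data lies between |Q_1| and |Q_3.| The variable names are chosen for this illustration.

```python
ordered = [32, 35, 48, 56, 59, 60, 61, 64, 68, 75, 75, 84, 86, 87, 90, 98]
q1, q3 = 57.5, 85  # quartiles found in the example

ir = q3 - q1       # interquartile range: IR = Q3 - Q1
print(ir)          # 27.5

# Data values located between the 1st and 3rd quartiles.
inside = [v for v in ordered if q1 <= v <= q3]
print(len(inside) / len(ordered))  # 0.5
```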
Having determined the value of the quartiles, the minimum and the maximum, it is possible to analyze the dispersion of the data in each of the quarters of a distribution. This can be done using the quarter range.
The quarter range, denoted |QR,| corresponds to the range between the values at the ends of one quarter of a data distribution.
The quarter range gives an idea of the dispersion of each |25\%| slice of the data in the distribution.
To find the value of the quarter range, calculate as follows:
||\begin{align}QR_1&=Q_1-x_\text{min}\\QR_2&=Q_2-Q_1\\QR_3&=Q_3-Q_2\\QR_4&=x_\text{max}-Q_3\end{align}||
where
|QR:| quarter range
|x_\text{min}:| minimum value in the distribution
|Q_1:| value of the 1st quartile
|Q_2:| value of the 2nd quartile
|Q_3:| value of the 3rd quartile
|x_\text{max}:| maximum value in the distribution
The distribution should be analyzed to detect if there are any outliers before determining |x_\text{min}| and |x_\text{max}.|
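This lesson does not state how to detect outliers. As an illustration only, the sketch below assumes one common convention, which flags any value farther than |1.5| times the interquartile range from |Q_1| or |Q_3.| The names `low_fence`, `high_fence` and `kept` are made up for this example, and the data are those of the even-number example above.

```python
ordered = [32, 35, 48, 56, 59, 60, 61, 64, 68, 75, 75, 84, 86, 87, 90, 98]
q1, q3 = 57.5, 85
ir = q3 - q1  # 27.5

# Assumed convention (not stated in this lesson): the 1.5 x IR rule.
low_fence = q1 - 1.5 * ir
high_fence = q3 + 1.5 * ir
print(low_fence, high_fence)  # 16.25 126.25

# Keep only the values inside the fences, then read off the extremes.
kept = [v for v in ordered if low_fence <= v <= high_fence]
x_min, x_max = kept[0], kept[-1]
print(x_min, x_max)  # 32 98
```

Under this assumption, no value is flagged in this distribution, so |x_\text{min}=32| and |x_\text{max}=98.|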
Determine the quarter range of each quarter of the following distribution.
||\begin{alignat}{20}&&&\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#ec0000}{\underbrace{Q_2=66}}}\\32,35,48,56\ &\color{#3b87cd}{\Big\vert}\ 59,60,61,64\ &&\color{#ec0000}{\Big\vert}\ 68,75,75,84\ &&\color{#7cca51}{\Big\vert}\ 86,87,90,98\\&\!\!\!\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#3b87cd}{\overbrace{Q_1=57.5}}}&&&&\!\!\!\!\!\!\!\!\!\!\!\boldsymbol{\color{#7cca51}{\overbrace{Q_3=85}}}\end{alignat}||
According to the formulas, the following calculations are carried out.
||\begin{align}QR_1&=Q_1-x_\text{min}\\&=57.5-32\\&=25.5\\\\QR_2&=Q_2-Q_1\\&=66-57.5\\&=8.5\\\\QR_3&=Q_3-Q_2\\&=85-66\\&=19\\\\QR_4&=x_\text{max}-Q_3\\&=98-85\\&=13\end{align}||
Therefore, it can be concluded that the 2nd quarter contains the most concentrated data |(QR_2=8.5)| and the 1st quarter contains the most dispersed data |(QR_1=25.5).|
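The |4| quarter ranges can be computed in one pass in Python by listing the boundaries of the quarters in order. This is a minimal sketch using the values from the example. The variable names are chosen for this illustration.

```python
x_min, q1, q2, q3, x_max = 32, 57.5, 66, 85, 98

# Boundaries of the 4 quarters, from the minimum to the maximum.
bounds = [x_min, q1, q2, q3, x_max]

# Each quarter range is the difference between two consecutive boundaries.
quarter_ranges = [bounds[i + 1] - bounds[i] for i in range(4)]
print(quarter_ranges)  # [25.5, 8.5, 19, 13]

# Quarter numbers start at 1, list indexes at 0.
most_concentrated = quarter_ranges.index(min(quarter_ranges)) + 1
most_dispersed = quarter_ranges.index(max(quarter_ranges)) + 1
print(most_concentrated, most_dispersed)  # 2 1
```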