The mean (Avg) is a measure of central tendency that represents the centre of equilibrium of a distribution.
Two types of means are covered here: the arithmetic mean and the weighted mean.
The arithmetic mean is the value that each data value in the distribution would have if the total was equally divided among them. It is calculated by adding all the data and dividing the sum by the number of data values in the distribution. It is calculated differently depending on how the data are presented.
|\text{Mean} = \dfrac{\text{Sum of all the data}}{\text{Number of data values}}|
To simplify notation, different symbols can be used.
- When referring to the mean of a sample, the symbol |\overline x| is used.
- When referring to the mean of a population, the Greek letter |\mu| is used.
The arithmetic mean is calculated in the same way in both cases.
The following is the number of goals scored by the Montreal Canadiens in their last |15| games:
|0,| |1,| |3,| |2,| |3,| |1,| |3,| |4,| |5,| |2,| |5,| |1,| |3,| |4,| |2|
What is the average number of goals scored by the Canadiens in their last 15 games?
||\begin{align} \text{Mean} &= \dfrac{\small0+1+3+2+3+1+3+4+5+2+5+1+3+4+2}{15} \\\\ &= \dfrac{39}{15} \\\\ &= 2.6\ \text{goals per game} \end{align}||
Answer: During the last |15| games, the Canadiens averaged |2.6| goals per game.
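The calculation above can also be sketched in a few lines of Python. The function name `arithmetic_mean` is only an illustrative choice, not a standard one.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


# Goals scored by the Canadiens in their last 15 games
goals = [0, 1, 3, 2, 3, 1, 3, 4, 5, 2, 5, 1, 3, 4, 2]
print(arithmetic_mean(goals))  # 2.6
```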
It is possible to find a missing data value if the mean of a distribution and all other data values are known. The definition of the mean can be used to solve this kind of problem.
For her Term 3 report card, Marie-Claude has set a goal of getting an |85\ \%| mean (average) in math. So far, she has earned the following results: |90\ \%,| |82\ \%,| and |81\ \%.|
If all the evaluations are equally weighted, what grade does Marie-Claude need to have on her last evaluation to reach her goal?
Since the mean is the value that each data value would have if the total were equally divided among them, reaching her goal is equivalent to Marie-Claude getting |85\ \%| on each of the |4| evaluations. Adding |85\ \%| together |4| times gives |85\ \% \times 4 = 340\ \%.|
She must therefore accumulate a total of |340\ \%| on her |4| evaluations to reach her goal. However, she has already received |3| grades: |90\ \%,| |82\ \%,| and |81\ \%.| So, after |3| evaluations, she has accumulated a total of |90\ \% + 82\ \% + 81\ \% = 253\ \%.| This means that the missing value can be found by calculating |340\ \% - 253\ \% = 87\ \%.|
Answer: Marie-Claude needs a grade of |87\ \%| to have an |85\ \%| average in Term 3.
It is also possible to solve this kind of problem using algebra. To do so, replace the unknown data value with |x| and calculate its value.
The mean of |5| data values is |35,| but only |4| of the |5| data values are known: |20,| |40,| |45| and |29.|
What is the missing data value?
Replace the missing data value with |x| and use the arithmetic mean formula.
||\begin{align} \text{Mean} &= \dfrac{\text{Sum of all data}}{\text{Number of data values}} \\ 35 &= \dfrac{20 + 40 + 45 + 29 + x}{5} \\ 35 &= \dfrac{134+x}{5} \end{align}||
Next, isolate |x.|
||\begin{align}35\boldsymbol{\color{#ec0000}{\times5}}&=\dfrac{134+x}{5}\boldsymbol{\color{#ec0000}{\times5}}\\175&=134+x\\175\boldsymbol{\color{#ec0000}{- 134}}&=134+x\boldsymbol{\color{#ec0000}{- 134}}\\41&=x\end{align}||
Answer: The missing data value is |41.|
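Both missing-value examples above follow the same reasoning: the required total is the mean multiplied by the number of data values, and the missing value is what remains once the known values are subtracted. A minimal sketch, assuming exactly one value is missing (the name `missing_value` is hypothetical):

```python
def missing_value(known_values, target_mean):
    """Return the single missing value that gives the target mean."""
    total_count = len(known_values) + 1         # known values plus the missing one
    required_total = target_mean * total_count  # mean x number of data values
    return required_total - sum(known_values)


print(missing_value([90, 82, 81], 85))      # 87 (Marie-Claude's last evaluation)
print(missing_value([20, 40, 45, 29], 35))  # 41
```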
Sometimes a distribution contains values that are repeated several times. In this case, it is useful to group them together in a condensed data table. In this situation, the mean (average) is calculated as follows:
|\text{Mean} = \dfrac{\text{Sum of all data values multiplied by their frequencies}}{\text{Total number of data}}|
The age of 30 players on a sports team is represented in the following table:
Age | |7| | |8| | |9| | |10| |
---|---|---|---|---|
Frequency | |13| | |9| | |6| | |2| |
What is the mean (average) age of the players on this team?
Note that the age of |7| repeats |13| times |(7 \times 13),| age |8| repeats |9| times |(8 \times 9),| age |9| is present |6| times |(9 \times 6)| and age |10| is present |2| times |(10 \times 2).|
||\begin{align}\text{Mean}&= \dfrac{7 \times 13 + 8 \times 9 + 9 \times 6 + 10 \times 2}{13+9+6+2}\\ &= \dfrac{91+72+54+20}{30}\\&=\dfrac{237}{30}\\&= 7.9\ \text{years old}\end{align}||
Answer: The average age of the players on the team is |7.9| years old.
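The same calculation can be sketched in Python by storing the condensed data table as a dictionary that maps each value to its frequency. The dictionary layout and the name `mean_from_frequencies` are assumptions made for this illustration.

```python
def mean_from_frequencies(table):
    """Mean of a condensed data table given as {value: frequency}."""
    total = sum(value * freq for value, freq in table.items())
    count = sum(table.values())  # total number of data values
    return total / count


# Age of the players -> number of players of that age
ages = {7: 13, 8: 9, 9: 6, 10: 2}
print(mean_from_frequencies(ages))  # 7.9
```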
When data are grouped into classes (intervals), only the middle of each class is used to estimate the mean of such a distribution.
|\text{Mean} \approx \dfrac{\text{Sum of the middle of each class multiplied by their frequencies}}{\text{Total Number of data}}|
The following is the duration (in minutes) of bus commutes for |337| students to get to school.
Duration (minutes) | Frequency |
---|---|
|[10,15[| | |44| |
|[15,20[| | |58| |
|[20,25[| | |70| |
|[25,30[| | |81| |
|[30,35[| | |54| |
|[35,40[| | |30| |
What is the mean (average) bus commute time for these students?
First, the middle of each interval must be found. The mean is then estimated using these middle values.
Duration (minutes) | Middle of the class/interval | Frequency |
---|---|---|
|[10,15[| | |\dfrac{10+15}{2}=12.5| | |44| |
|[15,20[| | |\dfrac{15+20}{2}=17.5| | |58| |
|[20,25[| | |\dfrac{20+25}{2}=22.5| | |70| |
|[25,30[| | |\dfrac{25+30}{2}=27.5| | |81| |
|[30,35[| | |\dfrac{30+35}{2}=32.5| | |54| |
|[35,40[| | |\dfrac{35+40}{2}=37.5| | |30| |
We note that the value |12.5| is present |44| times |(12.5 \times 44),| that |17.5| is present |58| times in the distribution |(17.5 \times 58)| and so on. We get the following equation:
||\begin{align}\text{Mean}&\approx\dfrac{\left(\begin{alignat}{30}&12.5\times44&&+17.5\times58&&+22.5\times70\\+\ &27.5\times81&&+32.5\times54&&+37.5\times30 \end{alignat}\right)}{44+58+70+81+54+30}\\ &\approx\dfrac{\left(\begin{alignat}{32}&\ \ \ 550&&+1\ 015&&+1\ 575\\+\ &2\ 227.5&&+1\ 755&&+1\ 125 \end{alignat}\right)}{337}\\&\approx\dfrac{8\ 247.5}{337}\\&\approx 24.47\ \text{minutes/student}\end{align}||
Answer: On average, each student’s bus commute lasts about |24.47| minutes.
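As a rough sketch of the estimate above, each class can be stored as a `(lower bound, upper bound, frequency)` triple. This layout and the name `grouped_mean` are assumptions made for the illustration, and the result remains an approximation because every value in a class is replaced by the middle of that class.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) triples."""
    # Each class contributes its middle value multiplied by its frequency
    total = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    count = sum(freq for _, _, freq in classes)
    return total / count


commutes = [
    (10, 15, 44),
    (15, 20, 58),
    (20, 25, 70),
    (25, 30, 81),
    (30, 35, 54),
    (35, 40, 30),
]
print(round(grouped_mean(commutes), 2))  # 24.47
```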
The mean has the advantage of taking into account all the data in a distribution. However, an outlier, that is, a value that deviates markedly from the other data in the distribution, can pull the mean away from most of the data. In this case, we can choose to calculate the mean without the outlier to get a measure that is more representative of the whole distribution.
Car traffic on Notre-Dame Street in Quebec City is observed between 12:00 and 13:00. |21| cars drove on the street on Monday, |34| cars on Tuesday, |46| cars on Wednesday, |19| cars on Thursday and |225| cars on Friday.
All the data are relatively close, except for the |225| value which is very far from the others. This is an outlier. Here are the means (averages) with and without the outlier.
With the outlier
||\begin{align}\text{Mean}&=\dfrac{21+34+46+19+225}{5}\\&=\dfrac{345}{5}\\&=69\ \text{cars/day}\end{align}||
Without the outlier
||\begin{align}\text{Mean}&=\dfrac{21+34+46+19}{4}\\&=\dfrac{120}{4}\\&=30\ \text{cars/day}\end{align}||
Note that the mean with the outlier is much higher than the mean without it. It does not accurately reflect all the data, since |4| of the |5| data values are significantly lower than this mean. The median, on the other hand, is not influenced by outliers, so it can be found and compared to both means. In this example, the median is |34,| which is much closer to |30| than to |69.| The mean calculated without the outlier is therefore more representative of the whole distribution.
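The comparison can be reproduced with Python's standard `statistics` module. In this sketch the outlier is removed by hand, since it was identified by inspection in the example. No automatic outlier detection is implied.

```python
from statistics import mean, median

# Cars counted from Monday to Friday
cars = [21, 34, 46, 19, 225]
without_outlier = [count for count in cars if count != 225]

mean_with = mean(cars)                # mean with the outlier
mean_without = mean(without_outlier)  # mean without the outlier
middle = median(cars)                 # median of all five values

print(mean_with, mean_without, middle)
```

The three printed values correspond to the |69,| |30| and |34| found above.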
The weighted mean is used when the data values are not all equally important. In this case, each value is given a weight (usually as a percentage).
The weighted mean is calculated as follows:
|\text{Weighted mean} = \text{Sum of the data values multiplied by their weight}|
When calculating a weighted mean, each data value is associated with a weight that gives it a particular importance in relation to the other data. The sum of all the weights, or all the coefficients, should be |100\ \%.|
The table below shows Alexander's results during his last exams, as well as their respective weight.
Test | Alexander’s results | Weighting |
---|---|---|
Test 1 | 82% | 20% |
Test 2 | 75% | 35% |
Test 3 | 86% | 45% |
What is Alexander's final grade?
It is helpful to convert the percentages into decimal numbers to facilitate the calculation.
||\begin{align}20\ \%&= 0.20\\35\ \%&=0.35\\45\ \%&=0.45\end{align}||
The weighted mean can now be found.
||\begin{align}\text{Weighted Mean}&= 82 \times 0.20 + 75 \times 0.35 + 86 \times 0.45\\&= 16.4 + 26.25 + 38.7\\&= 81.35\end{align}||
Answer: Alexander's final grade is |81.35\ \%.|
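A minimal Python sketch of this calculation follows, with the weights written as decimals. The name `weighted_mean` and the check that the weights add up to |1| (that is, |100\ \%|) are choices made for this illustration.

```python
def weighted_mean(results, weights):
    """Sum of each result multiplied by its weight (weights as decimals)."""
    if len(results) != len(weights):
        raise ValueError("each result needs exactly one weight")
    if abs(sum(weights) - 1) > 1e-9:
        raise ValueError("the weights should add up to 1 (100 %)")
    return sum(result * weight for result, weight in zip(results, weights))


# Alexander's three tests and their weights
final_grade = weighted_mean([82, 75, 86], [0.20, 0.35, 0.45])
print(round(final_grade, 2))  # 81.35
```

The result is rounded only to hide the tiny rounding errors that come from storing decimals such as |0.20| in binary.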
It is also possible to find a missing value using a weighted mean.
Despite all of Julian's good intentions, he is afraid of failing his history class. To fully understand his situation, he made the following table:
Evaluation | Result | Weighting |
---|---|---|
Evaluation 1 | |54\ \%| | |10\ \%| |
Evaluation 2 | |58\ \%| | |10\ \%| |
Evaluation 3 | |62\ \%| | |30\ \%| |
Evaluation 4 | |50\ \%| | |10\ \%| |
Evaluation 5 | ? | |40\ \%| |
What is the minimum result that Julian must have on his last evaluation to obtain a passing grade of |60\ \%| in his course?
We use |x| to represent the missing result in the weighted mean formula.
||\begin{align}60&=54\times0.10+58\times0.10+62\times0.30+50\times0.10+x\times0.40\\60&=5.4+5.8+18.6+5+0.4x\\60&=34.8+0.4x\\60\boldsymbol{\color{#ec0000}{-34.8}}&=34.8\boldsymbol{\color{#ec0000}{-34.8}}+0.4x\\25.2&=0.4x\\\color{#ec0000}{\dfrac{\color{black}{25.2}}{\boldsymbol{0.4}}}&=\color{#ec0000}{\dfrac{\color{black}{0.4x}}{\boldsymbol{0.4}}}\\63&=x\end{align}||
Answer: Julian must get a minimum of |63\ \%| on his last evaluation to pass history.
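The algebra above amounts to subtracting the weighted results already earned from the target, then dividing by the weight of the missing evaluation. A short sketch, where the name `missing_weighted_result` and the list of `(result, weight)` pairs are assumptions made for the illustration:

```python
def missing_weighted_result(known, last_weight, target):
    """Result needed on the last evaluation to reach the target mean.

    known: list of (result, weight) pairs, weights written as decimals.
    """
    earned = sum(result * weight for result, weight in known)
    return (target - earned) / last_weight


# Julian's first four evaluations
known = [(54, 0.10), (58, 0.10), (62, 0.30), (50, 0.10)]
needed = missing_weighted_result(known, 0.40, 60)
print(round(needed, 2))  # 63.0
```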
Finally, remember that a problem rarely specifies which type of mean to calculate, whether arithmetic, weighted or another type. It is up to the student to analyze the nature of the data and choose the most appropriate mean.