-
Statistics is the branch of mathematics concerned with collecting data and analyzing it to form hypotheses that can be used to predict events.
-
Statistical measures include both the data collected and the values calculated from that data.
The data collected in a statistical study form a distribution.
-
A one-variable distribution is a data set collected from a statistical study of a single characteristic.
-
A two-variable distribution is a set of pairs of data collected from a statistical study of two characteristics.
Two-variable distributions make it possible to study the relationship between the two characteristics and to establish possible correlations between them. Double-entry tables and scatter plots are used to represent this type of distribution.
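As a rough illustration, a two-variable distribution can be stored as a list of pairs, and the strength of a linear relationship can be estimated with a correlation coefficient. The sketch below assumes the Pearson coefficient, which is only one common way to measure correlation; the function name `pearson_r` and the data (hours studied and test scores) are invented for the example.

```python
from math import sqrt

def pearson_r(pairs):
    """Linear correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sum of products of deviations, then sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

# Hypothetical two-variable distribution: (hours studied, test score).
pairs = [(1, 52), (2, 60), (3, 61), (4, 70), (5, 74), (6, 83)]
r = pearson_r(pairs)
print(f"r = {r:.2f}")  # a value close to 1 suggests a strong positive linear correlation
```

A value of `r` near 1 or -1 points to a strong linear relationship, while a value near 0 points to a weak one or none at all.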
An outlier is a data value that is not consistent with the other data in the distribution.
An outlier can indicate an error in data collection or simply reflect a rare data value. Because it can skew the analysis, we may decide to exclude it so that the results better represent the rest of the data. The boundaries beyond which a data value is considered to be an outlier can be calculated using the concept of interquartile range.
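The text does not state the exact boundaries, so the following is a minimal sketch that assumes a widely used convention: a value is an outlier if it lies more than 1.5 times the interquartile range below the first quartile or above the third quartile. It also assumes quartiles are computed as the medians of the lower and upper halves of the sorted data (other quartile methods give slightly different results). The function names and the data are hypothetical.

```python
from statistics import median

def iqr_fences(values, k=1.5):
    """Return the (lower, upper) boundaries for outliers, assuming the k * IQR rule."""
    data = sorted(values)
    if len(data) < 4:
        raise ValueError("need at least 4 values to compute quartiles")
    half = len(data) // 2
    q1 = median(data[:half])    # median of the lower half
    q3 = median(data[-half:])   # median of the upper half
    iqr = q3 - q1               # interquartile range
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, k=1.5):
    """List the values that fall outside the boundaries."""
    low, high = iqr_fences(values, k)
    return [v for v in values if v < low or v > high]

# Hypothetical one-variable distribution.
data = [12, 14, 15, 15, 16, 18, 19, 21, 48]
print(iqr_fences(data))
print(find_outliers(data))
```

With this data, only the value 48 lies outside the boundaries, so it is the only value flagged as an outlier.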
In one neighbourhood, in 2022, all the houses for sale sold for between |\$225\ 000| and |\$450\ 000|, except for one exceptional house that sold for |\$1\ 375\ 000|. The price of this house is considered an outlier, and we could decide to exclude it from statistical calculations such as the mean.
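To see how much a single outlier can shift the mean, here is a short sketch. Only the prices \$225 000, \$450 000 and \$1 375 000 come from the example; the other sale prices are made up for illustration.

```python
from statistics import mean

# Hypothetical sale prices; only 225 000, 450 000 and 1 375 000 come from the example.
typical_prices = [225_000, 260_000, 310_000, 345_000, 390_000, 450_000]
outlier_price = 1_375_000

mean_with_outlier = mean(typical_prices + [outlier_price])
mean_without_outlier = mean(typical_prices)

print(f"Mean with the outlier:    {mean_with_outlier:,.0f}")
print(f"Mean without the outlier: {mean_without_outlier:,.0f}")
```

With these invented prices, including the outlier gives a mean above \$450 000, which is higher than every typical sale price, so it describes the neighbourhood poorly. Excluding it gives a mean of \$330 000.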