Measure of Dispersion
Let's consider another example using a group of students and their weight. Suppose we have the following weights (in kilograms) for a class of students: 50, 55, 60, 65, and 70.
Range
The range is calculated by subtracting the minimum weight from the maximum weight. In this case, the range is 70 - 50 = 20 kg, which tells us that the weights vary by 20 kilograms within the class.
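The range calculation can be sketched in a few lines of plain Python. This is a minimal illustration; the variable names `weights` and `weight_range` are assumptions chosen for this example.

```python
# Weights (kg) of the five students in the example
weights = [50, 55, 60, 65, 70]

# Range = largest value minus smallest value
weight_range = max(weights) - min(weights)
print(weight_range)  # 20
```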
Variance
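Variance measures the average squared deviation of each weight from the mean. It gives us an idea of how spread out the weights are. To calculate the variance, first find the mean: (50 + 55 + 60 + 65 + 70) / 5 = 60 kg. Next, square each weight's deviation from the mean, which gives 100, 25, 0, 25, and 100. Finally, average the squared deviations: 250 / 5 = 50 kg². Dividing by the number of students treats the class as the whole population.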
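The three steps (mean, squared deviations, average) can be sketched in plain Python. This is a minimal illustration that assumes the class is treated as a whole population, so the sum is divided by the number of students; the variable names are chosen for this example.

```python
weights = [50, 55, 60, 65, 70]

# Step 1: mean of the weights
mean = sum(weights) / len(weights)  # 60.0

# Step 2: squared deviation of each weight from the mean
squared_deviations = [(w - mean) ** 2 for w in weights]  # [100.0, 25.0, 0.0, 25.0, 100.0]

# Step 3: average the squared deviations (population variance, divide by n)
variance = sum(squared_deviations) / len(weights)
print(variance)  # 50.0
```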
Standard Deviation
The standard deviation is the square root of the variance. It represents the typical distance of each weight from the mean and, unlike the variance, is expressed in the same units as the data. In this case, the standard deviation is the square root of 50, which is approximately 7.07 kg.
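As a quick check, the same value can be obtained with Python's standard library. This sketch assumes the population formula (dividing by the number of students), which is what `statistics.pstdev` computes; the variable names are illustrative.

```python
import math
import statistics

weights = [50, 55, 60, 65, 70]

# Population variance and its square root
variance = statistics.pvariance(weights)
std_dev = math.sqrt(variance)

# statistics.pstdev computes the population standard deviation directly
std_dev_direct = statistics.pstdev(weights)

print(round(std_dev, 2))         # 7.07
print(round(std_dev_direct, 2))  # 7.07
```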
Overall, these measures of dispersion help us understand how the weights of students are spread out within the class. A larger range, variance, or standard deviation indicates greater variability or dispersion of the weights, while a smaller value suggests that the weights are closer together.
Measure of dispersion using Python
Python provides various libraries, such as NumPy and Pandas, that make it easy to calculate measures of dispersion. Here's an example using the NumPy library to calculate the range, variance, and standard deviation of the weight dataset.
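A minimal sketch of such a snippet, assuming NumPy is installed and imported as `np`; the variable names are illustrative choices. Note that `np.var()` and `np.std()` use the population formula by default (`ddof=0`).

```python
import numpy as np

# Weights (kg) of the five students in the example
weights = np.array([50, 55, 60, 65, 70])

# Range: maximum minus minimum
weight_range = np.max(weights) - np.min(weights)

# Variance and standard deviation (population formulas, ddof=0 by default)
variance = np.var(weights)
std_dev = np.std(weights)

print("Range:", weight_range)
print("Variance:", variance)
print("Standard deviation:", round(float(std_dev), 2))
```

Passing `ddof=1` to `np.var()` or `np.std()` would instead give the sample versions, which divide by n - 1.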
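In the code snippet above, the np.max() and np.min() functions find the maximum and minimum values in the dataset, allowing us to calculate the range. The np.var() function calculates the variance, and the np.std() function calculates the standard deviation. By default, both divide by the number of values (the population formula), which matches the variance of 50 and the standard deviation of about 7.07 kg obtained earlier.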
👩🏾🎨 Practice: Measures of dispersion 🎯
Consider the following dataset representing the test scores of a group of students in a class:
[85, 90, 75, 80, 95, 70, 85, 88, 82, 78]
- Calculate the range of the test scores.
- Calculate the variance of the test scores.
- Calculate the standard deviation of the test scores.
- Explain in simple terms what each of these measures of dispersion tells us about the spread or variability of the test scores.
➡️ Next, we'll be looking at correlation analysis 🎯.