Exercise: One-variable Linear Regression

Here is a sample dataset that contains the mid-term and final exam scores of 10 students. The goal is to predict the final exam score based on the mid-term exam score.

Mid-term	Final
70	80
80	90
90	100
60	70
50	65
40	50
30	40
20	30
10	20
0	10

Questions:

What is the number of features in this dataset?
What is the number of data points (m) in this dataset?
Which of these w and b values will give the smallest cost function?
- w = 1 and b = 10
- w = 0.5 and b = 20
- w = 3 and b = 7
Write a simple Python code to calculate the cost function for each set of w and b values and try different values to find the smallest cost function.

Answer

There is only one feature (X) in this dataset which is the mid-term exam score. The final exam score is the label (Y) that we are trying to predict.
The number of data points (m) is 10. There are 10 students in this dataset.
w = 1 and b = 10 will give the smallest cost function. You can try that manually or use this code to calculate the cost function for each set of values:

import numpy as np

# Given dataset
mid_term_scores = np.array([70, 80, 90, 60, 50, 40, 30, 20, 10, 0])
final_scores = np.array([80, 90, 100, 70, 65, 50, 40, 30, 20, 10])

# # Calculate the mean squared error (MSE) for given values of w and b

def calculate_mse(w, b):
    predicted_final = w * mid_term_scores + b
    m = len(mid_term_scores)
    mse = (1/(2*m)) * np.sum((predicted_final - final_scores)**2)
    return predicted_final, mse

# Test cases
w_values = [1, 0.5, 3]
b_values = [10, 20, 7]

for w, b in zip(w_values, b_values):
    predicted_final, mse = calculate_mse(w, b)
    print(f"For w = {w} and b = {b}:")
    print("Predicted Final Scores:", predicted_final)
    print("Mean Squared Error (MSE):", mse)
    print()

Artificial Intelligence

Exercise: One-variable Linear Regression