Neural Networks with Numpy for Absolute Beginners — Part 2: Linear Regression

In this tutorial, you will learn in detail how to implement Linear Regression for prediction using Numpy and visualize how the algorithm learns epoch by epoch. In addition, you will explore two-layer Neural Networks.

By Suraj Donthi, Computer Vision Consultant & Course Instructor at DataCamp

In the previous tutorial, you got a very brief overview of a perceptron.

Neural Networks with Numpy for Absolute Beginners: Introduction

In this tutorial, you will dig deep into implementing a Linear Perceptron (Linear Regression) from which you’ll be able to predict the outcome of a problem!

This tutorial inevitably involves a bit more math, but there’s no need to worry, as I will explain it from the ground up. Keep in mind that all machine learning algorithms are essentially mathematical formulations that are ultimately implemented as code.

Before we start, remember that we used the threshold activation function to mimic the behavior of AND and NOR gates.

Here we will use another extremely simple activation function called the linear activation function (equivalent to not having any activation!).

Let us find out the wonders that this activation function can do!


Linear Activation Function


Let’s assume that there is only one input and bias to the perceptron as shown below:


Computation graph of Linear Regression

The resulting linear output (i.e., the sum) will be

$y = mx + b$, where $m$ is the weight on the input $x$ and $b$ is the bias. This is the equation of a straight line, as shown in the figure below.


Graph of Linear Equation

It must be noted here that when no activation function is used, we can say that the activation function is linear.
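As a minimal sketch of this idea, the snippet below computes the output of a one-input perceptron with a linear (identity) activation. The function names `linear_activation` and `linear_perceptron` and the values of `m` and `b` are illustrative assumptions, not part of the tutorial’s code.

```python
import numpy as np

def linear_activation(z):
    # Identity function: the output is the weighted sum itself
    return z

def linear_perceptron(x, m, b):
    # Weighted sum of a single input plus bias, passed through the identity
    return linear_activation(m * x + b)

x = np.array([0.0, 1.0, 2.0, 3.0])
m, b = 2.0, 1.0  # hypothetical weight and bias, chosen for illustration
y = linear_perceptron(x, m, b)
print(y)  # [1. 3. 5. 7.]
```

Because the activation returns its input unchanged, the outputs lie exactly on a straight line with slope `m` and intercept `b`.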

With more than one input, the sum becomes $y = m_1 x_1 + m_2 x_2 + \dots + m_n x_n + b$, which is a multivariate (multiple-variable) linear equation.

Let us see how this is used to predict the actual output y in the next section, Linear Regression.


Linear Regression


Fitting a linear equation to a given set of data in n-dimensional space is called Linear Regression. The GIF below shows an example of Linear Regression.


Linear Regression [Source Link]

In simple words, you try to find the values of m and b that best fit the set of points, as shown in the figure above. Once we have obtained the best possible fit, we can predict the y values for a given x.
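To make “best fit” concrete before building the training loop, here is a small sketch that uses Numpy’s closed-form least-squares fit, `np.polyfit`, on synthetic data. This is a different technique from the iterative learning this tutorial implements; it only illustrates what good values of m and b look like. The data and the names `true_m`, `true_b`, `m_fit` and `b_fit` are assumptions made for the example.

```python
import numpy as np

# Synthetic points scattered around a known line (hypothetical values)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
true_m, true_b = 3.0, 2.0
y = true_m * x + true_b + rng.normal(scale=1.0, size=x.shape)

# Degree-1 least-squares fit; coefficients come back highest degree first
m_fit, b_fit = np.polyfit(x, y, deg=1)

# With the fitted line in hand, predict y for a new x
x_new = 4.0
y_new = m_fit * x_new + b_fit
print(m_fit, b_fit, y_new)
```

The fitted values should land close to the slope and intercept used to generate the points, which is the outcome the training procedure aims for as well.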

A very popular example is the housing price prediction problem. In this problem, you are given features such as the area of the house and the number of rooms, and you must predict the price of the house from these values.

So, the big question is… How does the prediction algorithm work? How does it learn to predict?

Let’s learn this on the go!

Let’s start by importing the required packages.

# Numpy for efficient Matrix and mathematical operations.
import numpy as np

# Pandas for table and other related operations
import pandas as pd

# Matplotlib for visualizing graphs
import matplotlib.pyplot as plt
from matplotlib.pylab import rcParams

# Sklearn for creating a dataset
from sklearn.datasets import make_regression

# train_test_split for splitting the data into training and testing data
from sklearn.model_selection import train_test_split

%matplotlib inline

# Set parameters for plotting
params = {'axes.titlesize': 'xx-large',               # Set title size
          'axes.labelsize': 'x-large',                # Set label size
          'figure.figsize': (8, 6)                    # Set a figure size
          }

# Apply the plotting parameters
rcParams.update(params)

You’ll use the sklearn dataset generator for creating the dataset. You will also use the package for splitting the data into training and test data. If you are not aware of sklearn, it is a rich package with many machine learning algorithms. Although it provides pre-built functions for performing linear regression, you are going to build it from scratch in this tutorial.

For creating the dataset, you must first set a list of hyperparameters. While m and b are parameters, the number of samples, the number of input features, the number of neurons, the learning rate, the number of iterations/epochs for training, etc. are called hyperparameters. You shall learn about these hyperparameters as you implement the algorithm.

For now, you shall set the number of training samples, the number of input features, the learning rate and epochs. You shall understand learning rate and epochs in a short while.

# Sample size
M = 200

# No. of input features
n = 1

# Learning Rate
l_r = 0.05

# Number of iterations for updates
epochs = 51

Your first task would be to import or generate the data. In this tutorial, you’ll generate the dataset using sklearn's make_regression function.

For the purpose of learning, we shall keep the number of features minimal so that the data is easy to visualize. Hence, you will use only one feature.

X, y = make_regression(n_samples=M, n_features=n, n_informative=n, 
                             n_targets=1, random_state=42, noise=10)
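As an optional aside, `make_regression` accepts a `coef=True` argument that additionally returns the coefficient of the underlying linear model used to generate the data. The sketch below uses separate, illustrative variable names (`X_demo`, `y_demo`, `true_coef`, `true_slope`) so that it does not overwrite the tutorial’s `X` and `y`.

```python
import numpy as np
from sklearn.datasets import make_regression

# Same settings as the tutorial, plus coef=True to also get the
# ground-truth coefficient used by the generator
X_demo, y_demo, true_coef = make_regression(
    n_samples=200, n_features=1, n_informative=1,
    n_targets=1, random_state=42, noise=10, coef=True)

# With one feature and one target, the coefficient is a single number
true_slope = float(np.ravel(true_coef)[0])

print(X_demo.shape, y_demo.shape)
print('Ground-truth slope:', true_slope)
```

Knowing this slope gives a useful reference point later: a well-trained m should end up reasonably close to it, though the added noise means the match will not be exact.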

Now, it’s time to visualize what the data generator has cooked up!

def plot_graph(X, y):
    # Plot the original set of datapoints
    _ = plt.scatter(X, y, alpha=0.8)
    _ = plt.title('Plot of Datapoints generated')
    _ = plt.xlabel('x')
    _ = plt.ylabel('y')

plot_graph(X, y)

Let’s check the shape of the vectors for consistency.

print('Shape of vector X:', X.shape)
print('Shape of vector y:', y.shape)

Shape of vector X: (200, 1)
Shape of vector y: (200,)

We need to reset the size of y to (200, 1) so that we do not run into shape mismatches during vector operations.

# Function to reset the sizes 
def reset_sizes(*args):
    return tuple(arg.reshape((arg.shape[0], 1)) for arg in args)

# Reset the size from (200,) -> (200, 1)
X, y = reset_sizes(X, y)
print(y.shape)

(200, 1)
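To see why the reshape matters, consider this small sketch with placeholder arrays (the names `y_pred_demo` and `y_flat` are assumptions for illustration). Mixing a (200, 1) array with a (200,) array does not always raise an error; Numpy broadcasting can silently produce a (200, 200) result instead.

```python
import numpy as np

y_pred_demo = np.ones((200, 1))   # column vector, like a model's predictions
y_flat = np.zeros(200)            # flat vector, like y before reshaping

# Broadcasting (200, 1) against (200,) expands both to (200, 200)
diff_wrong = y_pred_demo - y_flat
print(diff_wrong.shape)  # (200, 200)

# After reshaping y to a column vector, the subtraction is element-wise
diff_right = y_pred_demo - y_flat.reshape(-1, 1)
print(diff_right.shape)  # (200, 1)
```

Keeping every vector as an explicit (n, 1) column avoids this kind of silent shape blow-up when computing errors later.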

Next, you will split the dataset into train and test sets, so that once the model is trained you can test its accuracy on a part of the dataset it has not seen.

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

In our case, the training set is 80% and the test set is 20%.

Let’s check the shape of the Train and Test datasets created.

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)

(160, 1) (160, 1)
(40, 1) (40, 1)

As you can see, the training set holds 160 points, which is 80% of the 200 data points, as expected.
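For intuition, a shuffled split can be sketched by hand with Numpy. The helper `simple_split` below is a hypothetical illustration of the idea and is not how sklearn implements `train_test_split` internally; the demo arrays are placeholders.

```python
import numpy as np

def simple_split(X, y, test_size=0.2, seed=0):
    # Shuffle the row indices, then carve off the first chunk as the test set
    rng = np.random.default_rng(seed)
    indices = rng.permutation(X.shape[0])
    n_test = int(round(test_size * X.shape[0]))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Placeholder data with the same shapes as in the tutorial
X_demo = np.arange(200, dtype=float).reshape(200, 1)
y_demo = 2.0 * X_demo

X_tr, X_te, y_tr, y_te = simple_split(X_demo, y_demo)
print(X_tr.shape, X_te.shape)  # (160, 1) (40, 1)
```

Shuffling before splitting matters because data is often stored in some order, and taking the last 20% as-is could give a test set that does not resemble the training set.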

So, what have we achieved till now?

We have done the initial data preprocessing and also explored the data through visualizing it. This is typically the first step while modeling any machine learning algorithm. We have also split the data for testing the accuracy of the model once it is trained.

What do we do next?

As shown in the Linear Regression GIF above, we first need to consider a random line and then fit it to the data through training.

Therefore, the next step is to generate a line with a random slope and a random intercept (bias). The goal is to achieve the best fit for the line.

# Function to generate parameters of the linear regression model, m & b.
def init_params():
    m = np.random.normal(scale=10)
    b = np.random.normal(scale=10)
    return m, b

# Call function to generate parameters
m, b = init_params()

Now, given m & b, we can plot the line generated.

Let’s update the function plot_graph to show the predicted line too.

def plot_graph(dataset, pred_line=None):
    X, y = dataset['X'], dataset['y']
    # Plot the set of datapoints
    _ = plt.scatter(X, y, alpha=0.8)                                
    if pred_line is not None:
        x_line, y_line = pred_line['x_line'], pred_line['y_line']
        # Plot the randomly generated line
        _ = plt.plot(x_line, y_line, linewidth=2, markersize=12, color='red', alpha=0.8)
        _ = plt.title('Random Line on set of Datapoints')
    else:
        _ = plt.title('Plot of Datapoints')
    _ = plt.xlabel('x')
    _ = plt.ylabel('y')

# Function to plot predicted line
def plot_pred_line(X, y, m, b):
    # Generate a set of datapoints on x for creating a line.
    x_line = np.linspace(np.min(X), np.max(X), 10)

    # Calculate the corresponding y with random values of m & b
    y_line = m * x_line + b
    dataset = {'X': X, 'y': y}
    pred_line = {'x_line': x_line, 'y_line':y_line}
    plot_graph(dataset, pred_line)

plot_pred_line(X_train, y_train, m, b)


Now that the line is generated, you’ll need to predict the values it produces for each given value of x. From these predictions, all that remains is to calculate their mean squared error against the actual values. Why?

How could we find the difference between the actual output and the predicted output?

The simplest way would be to just subtract one from the other. We have a random line that gives an output y_pred for every x that is given, but it’s surely not the actual output. Luckily, we have the actual output for every x too! So instead of taking the absolute difference (the absolute or L1 distance), we square the difference (the squared Euclidean or L2 distance) and take the mean over all the given points. This is called the Mean Squared Error.
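The two error measures described above can be sketched in a few lines. The function names `mean_absolute_error_sketch` and `mean_squared_error_sketch` and the toy values are assumptions for illustration; they are not the loss function defined later in this series.

```python
import numpy as np

def mean_absolute_error_sketch(y_true, y_hat):
    # Mean of the absolute differences (L1-style error)
    return np.mean(np.abs(y_true - y_hat))

def mean_squared_error_sketch(y_true, y_hat):
    # Mean of the squared differences (L2-style error)
    return np.mean((y_true - y_hat) ** 2)

# Toy column vectors: the differences are -0.5, 0.0 and 1.0
y_true = np.array([[1.0], [2.0], [3.0]])
y_hat = np.array([[1.5], [2.0], [2.0]])

mae = mean_absolute_error_sketch(y_true, y_hat)  # (0.5 + 0 + 1) / 3
mse = mean_squared_error_sketch(y_true, y_hat)   # (0.25 + 0 + 1) / 3
print(mae, mse)
```

Squaring makes every term non-negative, so positive and negative errors cannot cancel out, and it penalizes large errors more heavily than small ones.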

Let us now predict the values of y_pred from the parameters m & b given the datapoints X_train by defining a function forward_prop.

def forward_prop(X, m, b):
    y_pred = m * X + b
    return y_pred

y_pred = forward_prop(X_train, m, b)