Neural Networks with Numpy for Absolute Beginners — Part 2: Linear Regression
In this tutorial, you will learn to implement Linear Regression for prediction using Numpy in detail and also visualize how the algorithm learns epoch by epoch. In addition to this, you will explore two layer Neural Networks.
By Suraj Donthi, Computer Vision Consultant & Course Instructor at DataCamp
In the previous tutorial, you got a very brief overview of a perceptron.
In this tutorial, you will dig deep into implementing a Linear Perceptron (Linear Regression) from which you’ll be able to predict the outcome of a problem!
This tutorial will apparently include a bit more of math as it is inevitable, but there’s no need to worry as I will explain them ground up. Regardless of this, it must be realized that all machine learning algorithms are basically mathematical formulations which are finally implemented in the form of code.
Before we start off, remember that we had used the threshold activation function to mimic the function of AND and NOR Gates?!
Here we will use another extremely simple activation function called linear activation function (equivalent to not having any activation!).
Let us find out the wonders that this activation function can do!
Linear Activation Function
Let’s assume that there is only one input and bias to the perceptron as shown below:
The resulting linear output (i.e., the sum) will be
. This is the equation of a straight line, as shown in the below figure.
It must be noted here that when no activation function is used, we can say that the activation function is linear.
This is a multivariate(multiple variables) linear equation.
Let us see how this is utilized for predicting the actual output of y in the next section i.e., Linear Regression.
Fitting a linear equation on a given set of data in n-dimensional space is called Linear Regression. The below GIF image shows an example of Linear Regression.
In simple words, you try to find the best values of m and b that best fits the set of points as shown in the above figure. When we have obtained the best possible fit, we can predict the y values given x.
A very popular example is the housing price prediction problem. In this problem you are given a set of values like the area of the house and the number of rooms etc. as features and you must predict the price of the house given these values.
So, the big question is… How does the prediction algorithm work? How does it learn to predict?
Let’s learn this on the go!
Let’s start by importing the required packages.
You’ll use the
sklearn dataset generator for creating the dataset. You will also use the package for splitting the data into training and test data. If you are not aware of
sklearn, it is a rich package with many machine learning algorithms. Although, you get pre-built functions for performing linear regression, you are going to build it from scratch in this tutorial.
For creating the dataset, you must first set a list of hyperparameters — while m and b are parameters, the number of samples, the number of input features, the number of neurons, the learning rate, the number of iterations/epochs for training etc. are called hyperparameters. You shall learn about these hyperparameters as you implement the algorithm.
For now, you shall set the number of training samples, the number of input features, the learning rate and epochs. You shall understand learning rate and epochs in a short while.
Your first task would be to import or generate the data. In this tutorial, you’ll generate the dataset using
sklearn's make_regression function.
For purpose of learning, we shall keep the number of features minimal so that it is easy to visualize. Hence, you must choose only one feature.
Now, it’s time to visualize what the data generator has cooked up!
Let’s check the shape of the vectors for consistency.
We need reset the size of y to
(200, 1) so that we do not get errors during vector multiplications.
Next you will have to split the dataset into train and test sets, so that you can test the accuracy of the regression model using a part of the dataset once you have trained the model.
Now let’s split the data into train set and test set.
In our case, the training set is 80% and the test set is 20%.
Let’s check the shape of the Train and Test datasets created.
As you can see, 80% of the data i.e., 80% of 200 data points is 160 which is correct.
So, what have we achieved till now?
We have done the initial data preprocessing and also explored the data through visualizing it. This is typically the first step while modeling any machine learning algorithm. We have also split the data for testing the accuracy of the model once it is trained.
What do we do next?
Clearly as shown in the above Linear Regression GIF image, we need to consider a random line at first and then fit it on the data through training.
Therefore, the next step is to randomly generate a line with a random slope and an intercept(bias). The goal is to achieve the best fit for the line.
Now, given m & b, we can plot the line generated.
Let’s update the function plot_graph to show the predicted line too.
Since the line is now generated, you’ll need to predict the values it is producing for a given value of x. From this value, all there is to do is to calculate their mean squared error. Why?
How could we find the difference between the actual output and the predicted output?
The simplest way would be to just subtract these two differences. We have a random line that gives an output
y_pred for every x that is given, but it’s surely not the actual output. Luckily, we have the actual output of all x too! So what we do is instead of taking the difference directly (which is technically called absolute distance or L1 distance), we square it (called the Euclidean distance or L2 distance) and take the mean for all the given points & this is called Mean Squared Error.
Let us now predict the values of
y_pred from the parameters m & b given the datapoints
X_train by defining a function