Artificial Neural Network Implementation using NumPy and Image Classification
This tutorial builds an artificial neural network (ANN) in Python from scratch using NumPy and applies it to image classification on the Fruits360 dataset.
The next figure visualizes the target ANN structure. There is an input layer with 102 inputs, 2 hidden layers with 150 and 60 neurons, and an output layer with 4 outputs (one for each fruit class).
The input vector at any layer is multiplied (matrix multiplication) by the weights matrix connecting it to the next layer to produce an output vector. That output vector is in turn multiplied by the weights matrix connecting its layer to the next one. The process continues until the output layer is reached. A summary of the matrix multiplications is given in the next figure.
The input vector of size 1x102 is multiplied by the weights matrix of the first hidden layer, of size 102x150. Because this is matrix multiplication, the output has shape 1x150. That output is then used as the input to the second hidden layer, where it is multiplied by a weights matrix of size 150x60, giving a result of size 1x60. Finally, that result is multiplied by the weights between the second hidden layer and the output layer, of size 60x4, so the final result has size 1x4. Every element in the resulting vector corresponds to an output class. The input sample is labeled according to the class with the highest score.
The Python code for implementing these multiplications is listed below.
After reading the previously saved features and their output labels and filtering the features, the weights matrices of the layers are defined. They are given random values from -0.1 to 0.1. For example, the variable "input_HL1_weights" holds the weights matrix between the input layer and the first hidden layer. The size of this matrix is defined by the number of feature elements and the number of neurons in the hidden layer.
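A minimal sketch of this initialization, assuming the filtered features sit in a 2-D array with one row per sample. The array "data_inputs" and the names "HL1_HL2_weights" and "HL2_output_weights" are placeholders chosen for illustration (only "input_HL1_weights" is named in the text), and the random feature values merely stand in for the saved Fruits360 features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the filtered features: 8 samples x 102 feature elements (assumed).
data_inputs = rng.random((8, 102))

HL1_neurons = 150    # first hidden layer
HL2_neurons = 60     # second hidden layer
output_neurons = 4   # one output per fruit class

# Each matrix has shape (inputs to the layer, neurons in the layer),
# with values drawn uniformly from -0.1 to 0.1.
input_HL1_weights = rng.uniform(-0.1, 0.1, size=(data_inputs.shape[1], HL1_neurons))
HL1_HL2_weights = rng.uniform(-0.1, 0.1, size=(HL1_neurons, HL2_neurons))
HL2_output_weights = rng.uniform(-0.1, 0.1, size=(HL2_neurons, output_neurons))

print(input_HL1_weights.shape, HL1_HL2_weights.shape, HL2_output_weights.shape)
# (102, 150) (150, 60) (60, 4)
```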
After creating the weights matrices, the next step is to apply the matrix multiplications. For example, the variable "H1_outputs" holds the result of multiplying the feature vector of a given sample by the weights matrix between the input layer and the first hidden layer.
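The chain of multiplications for a single sample can be sketched as follows. The sample and the weights are random placeholders, and every variable name other than "H1_outputs" and "input_HL1_weights" is assumed here for illustration. No activation function is applied at this point.

```python
import numpy as np

rng = np.random.default_rng(1)

# One sample as a 1x102 row vector (random placeholder values).
sample = rng.random((1, 102))

input_HL1_weights = rng.uniform(-0.1, 0.1, size=(102, 150))
HL1_HL2_weights = rng.uniform(-0.1, 0.1, size=(150, 60))
HL2_output_weights = rng.uniform(-0.1, 0.1, size=(60, 4))

# (1x102) . (102x150) -> 1x150
H1_outputs = np.matmul(sample, input_HL1_weights)
# (1x150) . (150x60) -> 1x60
H2_outputs = np.matmul(H1_outputs, HL1_HL2_weights)
# (1x60) . (60x4) -> 1x4, one score per class
out_outputs = np.matmul(H2_outputs, HL2_output_weights)

print(H1_outputs.shape, H2_outputs.shape, out_outputs.shape)
# (1, 150) (1, 60) (1, 4)
```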
Usually, an activation function is applied to the outputs of each hidden layer to create a non-linear relationship between the inputs and the outputs. For example, the sigmoid activation function is applied to the outputs of the matrix multiplications.
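For reference, the sigmoid function is 1 / (1 + e^(-x)), which maps any real number into the range (0, 1). A small sketch, using a hand-picked vector as a stand-in for a hidden layer's outputs (the name "H1_activated" is assumed here):

```python
import numpy as np

def sigmoid(x):
    # Element-wise logistic function: 1 / (1 + e^(-x)).
    return 1.0 / (1.0 + np.exp(-x))

# Placeholder values standing in for a hidden layer's raw outputs.
H1_outputs = np.array([[-2.0, 0.0, 2.0]])
H1_activated = sigmoid(H1_outputs)

print(H1_activated.shape)  # (1, 3)
print(H1_activated[0, 1])  # 0.5
```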
After generating the output layer outputs, prediction takes place. The predicted class label is saved into the "predicted_label" variable. Such steps are repeated for each input sample. The complete code that works across all samples is given below.
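A hedged sketch of the per-sample loop, not the original listing: the features are random placeholders, the list name "weights" and the array "predictions" are assumptions, and sigmoid is applied at every layer, including the output layer. Since sigmoid is monotonic, that last step does not change which class has the highest score.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)

# Placeholder data: 5 samples x 102 features.
data_inputs = rng.random((5, 102))

# Weights for input->HL1, HL1->HL2, HL2->output (random placeholders).
weights = [rng.uniform(-0.1, 0.1, size=shape)
           for shape in [(102, 150), (150, 60), (60, 4)]]

predictions = np.zeros(data_inputs.shape[0], dtype=int)
for sample_idx, sample in enumerate(data_inputs):
    layer_output = sample
    # Propagate the sample through every layer in order.
    for layer_weights in weights:
        layer_output = sigmoid(np.matmul(layer_output, layer_weights))
    # The predicted class is the index of the highest output score.
    predicted_label = int(np.argmax(layer_output))
    predictions[sample_idx] = predicted_label

print(predictions.shape)  # (5,)
```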
The "weights" variable holds all the weights across the entire network. The network structure is specified dynamically from the size of each weights matrix. For example, if the size of the "input_HL1_weights" variable is 102x80, then we can deduce that the first hidden layer has 80 neurons.
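To illustrate how the structure can be read back from the weights, the sketch below assumes the weights are kept in a Python list ordered from input to output. The 80x60 and 60x4 shapes are hypothetical companions to the 102x80 example in the text.

```python
import numpy as np

# Zero-filled placeholders: only the shapes matter here.
weights = [np.zeros((102, 80)), np.zeros((80, 60)), np.zeros((60, 4))]

# Rows of the first matrix give the input size; the columns of each
# matrix give the number of neurons in the layer it feeds.
layer_sizes = [weights[0].shape[0]] + [w.shape[1] for w in weights]
print(layer_sizes)  # [102, 80, 60, 4]

# Consecutive matrices must agree: columns of one equal rows of the next.
compatible = all(a.shape[1] == b.shape[0] for a, b in zip(weights, weights[1:]))
print(compatible)  # True
```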
The "train_network" function is the core function, as it trains the network by looping through all samples. For each sample, the steps discussed in listing 3-6 are applied. It accepts the number of training iterations, the features, the output labels, the weights, the learning rate, and the activation function. There are two options for the activation function: ReLU or sigmoid. ReLU is a thresholding function that returns its input unchanged as long as the input is greater than zero. Otherwise, it returns zero.
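The ReLU description above translates directly into NumPy. This is a minimal sketch, and the function name "relu" is an assumption:

```python
import numpy as np

def relu(x):
    # Keep positive values unchanged; replace everything else with zero.
    return np.maximum(x, 0)

print(relu(np.array([-1.5, 0.0, 2.0])))  # [0. 0. 2.]
```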
If the network makes a wrong prediction for a given sample, the weights are updated using the "update_weights" function. No optimization algorithm is used to update the weights; they are simply updated according to the learning rate. As a result, the accuracy does not exceed 45%. To achieve better accuracy, an optimization algorithm should be used for updating the weights. For example, you can find the gradient descent technique in the ANN implementation of the scikit-learn library.
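The text does not spell out the update formula. One simple reading, assumed here purely for illustration, is to shrink every weight by a fraction equal to the learning rate. The function body and the example values below are a sketch under that assumption, not necessarily the original implementation.

```python
import numpy as np

def update_weights(weights, learning_rate):
    # Assumed rule: scale every weight by (1 - learning_rate).
    # The prediction error is not used, so this is not gradient descent.
    return weights - learning_rate * weights

w = np.array([[0.10, -0.05],
              [0.02, 0.00]])
new_w = update_weights(w, learning_rate=0.01)

print(np.allclose(new_w, 0.99 * w))  # True
```

Under this assumed rule the update carries no information about how wrong the prediction was, which is consistent with the low accuracy reported above and with the recommendation to use a proper optimization algorithm.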
In my book, you can find a guide for optimizing the ANN weights using the genetic algorithm (GA) optimization technique which increases the classification accuracy. You can read more about GA from the following resources I prepared:
Introduction to Optimization with Genetic Algorithm
Genetic Algorithm (GA) Optimization - Step-by-Step Example
Genetic Algorithm Implementation in Python
For contacting the author
- LinkedIn: https://www.linkedin.com/in/ahmedfgad
- Facebook: https://www.facebook.com/ahmed.f.gadd
- Twitter: https://twitter.com/ahmedfgad
- Towards Data Science: https://towardsdatascience.com/@ahmedfgad
- KDnuggets: https://www.kdnuggets.com/author/ahmed-gad
- E-mail: firstname.lastname@example.org
Original. Reposted with permission.