# Artificial Neural Network Implementation using NumPy and Image Classification

This tutorial builds an artificial neural network (ANN) from scratch in Python using NumPy and applies it to image classification on the Fruits360 dataset.

### ANN Implementation

The next figure visualizes the target ANN structure. There is an input layer with 102 inputs, 2 hidden layers with 150 and 60 neurons, and an output layer with 4 outputs (one for each fruit class).

The input vector at any layer is multiplied (matrix multiplication) by the weights matrix connecting it to the next layer to produce an output vector. That output vector is in turn multiplied by the weights matrix connecting its layer to the next one, and the process continues until the output layer is reached. A summary of the matrix multiplications is shown in the next figure.

The input vector of size 1x102 is multiplied by the weights matrix of the first hidden layer, which has size 102x150. Because this is matrix multiplication, the output has shape 1x150. That output is then used as the input to the second hidden layer, where it is multiplied by a weights matrix of size 150x60, giving a result of size 1x60. Finally, this result is multiplied by the weights between the second hidden layer and the output layer, of size 60x4, to produce a vector of size 1x4. Each element of the resulting vector corresponds to an output class, and the input sample is labeled with the class that has the highest score.
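The shape bookkeeping above can be checked with a short sketch. It uses random stand-in values rather than the real Fruits360 features, and the variable names (`sample`, `w_in_hl1`, and so on) are illustrative assumptions, not names from the tutorial code.

```python
import numpy

rng = numpy.random.default_rng(0)  # seed chosen only for reproducibility

sample = rng.random((1, 102))                   # one feature vector, 1x102
w_in_hl1 = rng.uniform(-0.1, 0.1, (102, 150))   # input -> hidden layer 1
w_hl1_hl2 = rng.uniform(-0.1, 0.1, (150, 60))   # hidden layer 1 -> hidden layer 2
w_hl2_out = rng.uniform(-0.1, 0.1, (60, 4))     # hidden layer 2 -> output layer

hl1 = numpy.matmul(sample, w_in_hl1)    # (1x102) @ (102x150) -> 1x150
hl2 = numpy.matmul(hl1, w_hl1_hl2)      # (1x150) @ (150x60)  -> 1x60
out = numpy.matmul(hl2, w_hl2_out)      # (1x60)  @ (60x4)    -> 1x4

print(hl1.shape, hl2.shape, out.shape)  # (1, 150) (1, 60) (1, 4)
print("Index of the highest score:", int(numpy.argmax(out)))
```

Activation functions are left out here on purpose, since only the shapes are being illustrated.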

The Python code implementing these multiplications is listed below.

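import numpy
import pickle

def sigmoid(inpt):
    return 1.0 / (1 + numpy.exp(-1 * inpt))

# Load the previously saved feature vectors (one row per sample).
with open("dataset_features.pkl", "rb") as f:
    data_inputs2 = pickle.load(f)

# Keep only the features whose standard deviation across samples exceeds 50.
features_STDs = numpy.std(a=data_inputs2, axis=0)
data_inputs = data_inputs2[:, features_STDs > 50]

# Load the class label of each sample.
with open("outputs.pkl", "rb") as f:
    data_outputs = pickle.load(f)

# Weights between the input layer and the first hidden layer.
HL1_neurons = 150
input_HL1_weights = numpy.random.uniform(low=-0.1, high=0.1,
                                         size=(data_inputs.shape[1], HL1_neurons))

# Weights between the first and second hidden layers.
HL2_neurons = 60
HL1_HL2_weights = numpy.random.uniform(low=-0.1, high=0.1,
                                       size=(HL1_neurons, HL2_neurons))

# Weights between the second hidden layer and the output layer.
output_neurons = 4
HL2_output_weights = numpy.random.uniform(low=-0.1, high=0.1,
                                          size=(HL2_neurons, output_neurons))

# Forward pass for the first sample; numpy.matmul takes its operands positionally.
H1_outputs = numpy.matmul(data_inputs[0, :], input_HL1_weights)
H1_outputs = sigmoid(H1_outputs)
H2_outputs = numpy.matmul(H1_outputs, HL1_HL2_weights)
H2_outputs = sigmoid(H2_outputs)
out_outputs = numpy.matmul(H2_outputs, HL2_output_weights)

# The predicted class is the index of the highest output score.
predicted_label = numpy.argmax(out_outputs)
print("Predicted class : ", predicted_label)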

After reading the previously saved features and their output labels and filtering the features, the weights matrices of the layers are defined. They are initialized with random values between -0.1 and 0.1. For example, the variable "input_HL1_weights" holds the weights matrix between the input layer and the first hidden layer. The size of this matrix is determined by the number of feature elements and the number of neurons in the first hidden layer.

After creating the weights matrices, the next step is to apply the matrix multiplications. For example, the variable "H1_outputs" holds the result of multiplying the feature vector of a given sample by the weights matrix between the input layer and the first hidden layer.

Usually, an activation function is applied to the outputs of each hidden layer to create a non-linear relationship between the inputs and the outputs. Here, the outputs of the matrix multiplications are passed through the sigmoid activation function.
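To see why the activation matters, the following sketch compares two stacked layers with and without a sigmoid in between. Without an activation, the two matrix multiplications collapse into a single linear map. The small layer sizes and variable names are assumptions made for illustration only.

```python
import numpy

def sigmoid(inpt):
    return 1.0 / (1.0 + numpy.exp(-inpt))

rng = numpy.random.default_rng(1)      # seed chosen only for reproducibility
x = rng.random(5)                      # stand-in feature vector
w1 = rng.uniform(-0.1, 0.1, (5, 3))    # first layer weights
w2 = rng.uniform(-0.1, 0.1, (3, 2))    # second layer weights

# Two layers without activation equal one layer with weights w1 @ w2.
stacked_linear = numpy.matmul(numpy.matmul(x, w1), w2)
single_linear = numpy.matmul(x, numpy.matmul(w1, w2))

# With a sigmoid in between, the result is no longer that single linear map.
with_activation = numpy.matmul(sigmoid(numpy.matmul(x, w1)), w2)

linear_collapses = numpy.allclose(stacked_linear, single_linear)
activation_differs = not numpy.allclose(with_activation, single_linear)
print(linear_collapses, activation_differs)
```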

After the outputs of the output layer are generated, prediction takes place. The predicted class label is saved into the "predicted_label" variable. These steps are repeated for each input sample. The complete code that works across all samples is given below.

import numpy
import pickle

def sigmoid(inpt):
    return 1.0 / (1 + numpy.exp(-1 * inpt))

def relu(inpt):
    # Return a new array so the caller's input is not modified in place.
    return numpy.maximum(inpt, 0)

def update_weights(weights, learning_rate):
    # No optimizer: every weight is only scaled down by the learning rate.
    new_weights = weights - learning_rate * weights
    return new_weights

def train_network(num_iterations, weights, data_inputs, data_outputs, learning_rate, activation="relu"):
    for iteration in range(num_iterations):
        print("Iteration ", iteration)
        for sample_idx in range(data_inputs.shape[0]):
            r1 = data_inputs[sample_idx, :]
            # Hidden layers: matrix multiplication followed by the activation.
            for idx in range(len(weights) - 1):
                curr_weights = weights[idx]
                r1 = numpy.matmul(r1, curr_weights)
                if activation == "relu":
                    r1 = relu(r1)
                elif activation == "sigmoid":
                    r1 = sigmoid(r1)
            # Output layer: matrix multiplication only.
            curr_weights = weights[-1]
            r1 = numpy.matmul(r1, curr_weights)
            predicted_label = numpy.argmax(r1)
            desired_label = data_outputs[sample_idx]
            # Update the weights only when the sample is misclassified.
            if predicted_label != desired_label:
                weights = update_weights(weights,
                                         learning_rate=learning_rate)
    return weights

def predict_outputs(weights, data_inputs, activation="relu"):
    predictions = numpy.zeros(shape=(data_inputs.shape[0]))
    for sample_idx in range(data_inputs.shape[0]):
        r1 = data_inputs[sample_idx, :]
        for curr_weights in weights:
            r1 = numpy.matmul(r1, curr_weights)
            if activation == "relu":
                r1 = relu(r1)
            elif activation == "sigmoid":
                r1 = sigmoid(r1)
        predicted_label = numpy.argmax(r1)
        predictions[sample_idx] = predicted_label
    return predictions

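# Load the previously saved feature vectors (one row per sample).
with open("dataset_features.pkl", "rb") as f:
    data_inputs2 = pickle.load(f)

# Keep only the features whose standard deviation across samples exceeds 50.
features_STDs = numpy.std(a=data_inputs2, axis=0)
data_inputs = data_inputs2[:, features_STDs > 50]

# Load the class label of each sample.
with open("outputs.pkl", "rb") as f:
    data_outputs = pickle.load(f)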

HL1_neurons = 150
input_HL1_weights = numpy.random.uniform(low=-0.1, high=0.1,
                                         size=(data_inputs.shape[1], HL1_neurons))

HL2_neurons = 60
HL1_HL2_weights = numpy.random.uniform(low=-0.1, high=0.1,
                                       size=(HL1_neurons, HL2_neurons))

output_neurons = 4
HL2_output_weights = numpy.random.uniform(low=-0.1, high=0.1,
                                          size=(HL2_neurons, output_neurons))

# The three matrices have different shapes, so they are stored in an object array.
weights = numpy.array([input_HL1_weights,
                       HL1_HL2_weights,
                       HL2_output_weights], dtype=object)

weights = train_network(num_iterations=10,
                        weights=weights,
                        data_inputs=data_inputs,
                        data_outputs=data_outputs,
                        learning_rate=0.01,
                        activation="relu")

# Count the samples whose predicted label differs from the desired label.
predictions = predict_outputs(weights, data_inputs)
num_false = numpy.where(predictions != data_outputs)[0]
print("num_false ", num_false.size)
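The per-sample loop used for prediction can also be written as one matrix multiplication per layer, because multiplying an N x F matrix of samples by an F x H weights matrix processes every row at once. The sketch below is one possible way to do this, not part of the original code: `predict_batch` is a hypothetical helper, the inputs are random stand-ins, and it applies ReLU to the hidden layers only, as the training loop does.

```python
import numpy

def relu(inpt):
    return numpy.maximum(inpt, 0)

def predict_batch(weights, data_inputs):
    """Hypothetical batched forward pass: ReLU on hidden layers, argmax on the output."""
    layer = data_inputs
    for curr_weights in weights[:-1]:
        layer = relu(numpy.matmul(layer, curr_weights))
    scores = numpy.matmul(layer, weights[-1])   # shape: (num_samples, num_classes)
    return numpy.argmax(scores, axis=1)         # one label per row

rng = numpy.random.default_rng(2)               # seed chosen only for reproducibility
demo_inputs = rng.random((8, 102)) * 255        # 8 stand-in samples with 102 features
demo_weights = [rng.uniform(-0.1, 0.1, shape)
                for shape in [(102, 150), (150, 60), (60, 4)]]

batch_predictions = predict_batch(demo_weights, demo_inputs)
print(batch_predictions.shape)  # (8,)
```

For large datasets this usually runs much faster than looping in Python, since the work is done inside NumPy.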

The "weights" variable holds all the weights of the entire network. The network structure is specified dynamically from the size of each weights matrix. For example, if the size of the "input_HL1_weights" variable is 102x80, then we can deduce that the first hidden layer has 80 neurons.
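As a small illustration of reading the structure off the weights, the sketch below recovers the number of neurons in every layer from the matrix shapes. The helper name `layer_sizes` and the zero-filled demo matrices are assumptions for this example.

```python
import numpy

def layer_sizes(weights):
    """Hypothetical helper: neuron count of every layer, input layer first."""
    sizes = [weights[0].shape[0]]   # rows of the first matrix = number of inputs
    for w in weights:
        sizes.append(w.shape[1])    # columns = neurons in the layer it feeds
    return sizes

demo_weights = [numpy.zeros((102, 80)),
                numpy.zeros((80, 60)),
                numpy.zeros((60, 4))]

print(layer_sizes(demo_weights))  # [102, 80, 60, 4]
```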

The "train_network" function is the core function, as it trains the network by looping through all samples. For each sample, the steps discussed in listing 3-6 are applied. It accepts the number of training iterations, the features, the output labels, the weights, the learning rate, and the activation function. There are two options for the activation function: ReLU or sigmoid. ReLU is a thresholding function that returns its input as long as the input is greater than zero; otherwise, it returns zero.
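The difference between the two activation options is easy to see on a handful of values. This is a minimal sketch with made-up inputs: ReLU zeroes the negatives and passes the rest through, while sigmoid squashes everything into the range between 0 and 1.

```python
import numpy

def relu(inpt):
    return numpy.maximum(inpt, 0)

def sigmoid(inpt):
    return 1.0 / (1.0 + numpy.exp(-inpt))

values = numpy.array([-2.0, -0.5, 0.0, 0.5, 2.0])

relu_out = relu(values)
sigmoid_out = sigmoid(values)

print(relu_out.tolist())   # [0.0, 0.0, 0.0, 0.5, 2.0]
print(sigmoid_out)         # every value lies strictly between 0 and 1
```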

If the network makes a false prediction for a given sample, the weights are updated using the "update_weights" function. No optimization algorithm is used to update the weights; they are simply updated according to the learning rate. The accuracy does not exceed 45%. To achieve better accuracy, an optimization algorithm should be used for updating the weights. For example, you can find the gradient descent technique in the ANN implementation of the scikit-learn library.
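One way to see the limits of this update rule, at least for ReLU without bias terms, is that subtracting `learning_rate * weights` multiplies every weight by the same positive factor. ReLU passes a positive scale factor straight through, so the output scores shrink but their ranking, and therefore the predicted label, should stay the same. The sketch below checks this on random stand-in data; `forward_scores` is a hypothetical helper, and the argument does not carry over exactly to sigmoid.

```python
import numpy

def relu(inpt):
    return numpy.maximum(inpt, 0)

def forward_scores(weights, sample):
    """Hypothetical helper: ReLU on hidden layers, raw scores at the output."""
    layer = sample
    for w in weights[:-1]:
        layer = relu(numpy.matmul(layer, w))
    return numpy.matmul(layer, weights[-1])

rng = numpy.random.default_rng(3)      # seed chosen only for reproducibility
samples = rng.random((20, 102))        # stand-in feature vectors
weights = [rng.uniform(-0.1, 0.1, shape)
           for shape in [(102, 150), (150, 60), (60, 4)]]

# Same rule as update_weights: every weight is scaled by (1 - learning_rate).
learning_rate = 0.01
shrunk = [w - learning_rate * w for w in weights]

labels_before = [int(numpy.argmax(forward_scores(weights, s))) for s in samples]
labels_after = [int(numpy.argmax(forward_scores(shrunk, s))) for s in samples]

print(labels_before == labels_after)
```

Under these assumptions, the update changes the magnitude of the scores but not which class wins, which is consistent with the need for a proper optimization algorithm.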

In my book, you can find a guide to optimizing the ANN weights using the genetic algorithm (GA) optimization technique, which increases the classification accuracy. I have also prepared separate resources for reading more about GA.

Original. Reposted with permission.
