TensorFlow: Building Feed-Forward Neural Networks Step-by-Step
This article walks through the steps required to build a simple feed-forward neural network in TensorFlow, explaining each step in detail.
In this article, two basic feed-forward neural networks (FFNNs) will be created using the TensorFlow deep learning library in Python. The reader should have a basic understanding of how neural networks work and of their underlying concepts in order to apply them programmatically.
Before building the neural network, it is worth outlining the steps involved.
The summarized steps are as follows:
- Reading the training data (inputs and outputs)
- Building and connecting the neural network layers (this includes preparing the weights, biases, and activation function of each layer)
- Building a loss function to assess the prediction error
- Creating a training loop for training the network and updating its parameters
- Applying some testing data to assess the network prediction accuracy
Here is the first classification problem that we will solve using a neural network.
It is a binary classification problem: classify colors as either red or blue based on the three RGB color channels. The two classes are linearly separable, so we don't need hidden layers. Only an input layer and an output layer are used, and the output layer has a single neuron with an activation function. The network architecture is shown in the following figure (Figure 1):
Here X0=1 is the bias input and W0 is its weight. W1, W2, and W3 are the weights for the three inputs R (Red), G (Green), and B (Blue).
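The computation performed by this single neuron can be sketched in plain NumPy. This is only an illustration under stated assumptions: the sigmoid activation, the weight values, and the two sample colors are hypothetical choices made for this example, not values taken from the article.

```python
import numpy

def sigmoid(z):
    # Assumed activation function; the article has not yet named one.
    return 1.0 / (1.0 + numpy.exp(-z))

def forward(rgb, weights):
    # Prepend X0 = 1 so that weights[0] (W0) acts as the bias.
    x = numpy.concatenate(([1.0], rgb))
    # Weighted sum W0*X0 + W1*R + W2*G + W3*B, passed through the activation.
    return sigmoid(numpy.dot(weights, x))

# Hypothetical weights W0..W3: reward the red channel, penalize the blue one.
weights = numpy.array([0.0, 0.01, 0.0, -0.01])

red_score = forward(numpy.array([255.0, 0.0, 0.0]), weights)
blue_score = forward(numpy.array([0.0, 0.0, 255.0]), weights)
print(red_score, blue_score)
```

With these made-up weights, a pure red input scores above 0.5 and a pure blue input scores below it, which is the kind of linear decision the network is expected to learn.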
Here is the complete code of the neural network that solves this problem; it is discussed in detail later. For easy reference, this code is called CodeSample1.
Reading the Training Data
In the previous code, the data is read in lines 4 and 5 using something called a placeholder. But what is a placeholder? Why not simply use a NumPy array to prepare the data? To answer these questions, we can explore a simpler example that reads some inputs and prints them to the console as follows:
The input is read into a NumPy array outside TensorFlow, as in line 4. But TensorFlow only works with Tensors, so we have to convert the NumPy array into a Tensor. The tensorflow.convert_to_tensor() operation does that conversion, as in line 9. To be able to print the contents of a Tensor, we must first create a Session using the tensorflow.Session() class, as in line 12. In line 15, the session is run in order to evaluate the Tensor training_inputs and get its values printed. Finally, the session is closed in line 18. The result of printing is as follows:
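The steps just described can be sketched as follows. The input values are hypothetical, and the line numbers cited in the text refer to the original listing rather than to this sketch. It also assumes the TensorFlow 1.x API, reached through tensorflow.compat.v1 so that it still runs on TensorFlow 2.x installations.

```python
import numpy
import tensorflow.compat.v1 as tensorflow

# Assumption: graph mode is needed for Session-based evaluation under TensorFlow 2.x.
tensorflow.disable_eager_execution()

# Hypothetical input data, prepared outside TensorFlow as a NumPy array.
numpy_input = numpy.array([[5.0, 2.0, 13.0],
                           [7.0, 9.0, 0.0]], dtype=numpy.float32)

# Convert the NumPy array into a Tensor.
training_inputs = tensorflow.convert_to_tensor(numpy_input)

# A Session is required to evaluate the Tensor and obtain its values.
sess = tensorflow.Session()
result = sess.run(training_inputs)
print(result)

# Release the resources held by the session.
sess.close()
```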
This example doesn't use placeholders. So, what is the use of a TensorFlow placeholder? Assume that we want to run the session with another input. To do that, we have to modify the numpy_input Python variable each time a new input is applied.
Editing the code every time the input changes is not a good approach. A better one is to create the Tensor once and supply its value at run time, without touching the code. This is the job of the TensorFlow placeholder.
A placeholder in TensorFlow is a way of accepting input data. It is created once in the code and can be given different values each time the Session is run. The following code modifies the previous code to use placeholders:
This code prints the same outputs as before, but it uses a placeholder, as in line 4. The placeholder is created by specifying the data type and the shape of the data it will accept. The shape can be specified to restrict the input data to a specific size. If no shape is specified, inputs of different shapes can be fed to the placeholder. The placeholder is given a value when running the Session, through the feed_dict argument of the run operation. feed_dict is a dictionary that maps each placeholder to the value it should take for that run.
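A minimal sketch of the placeholder version is shown here, assuming the TensorFlow 1.x API via tensorflow.compat.v1 and hypothetical input values. It is not the article's original listing, so the line numbers above do not apply to it.

```python
import numpy
import tensorflow.compat.v1 as tensorflow

# Assumption: graph mode is needed for placeholders and sessions under TensorFlow 2.x.
tensorflow.disable_eager_execution()

# The placeholder declares the data type and shape it accepts, but holds no data yet.
training_inputs = tensorflow.placeholder(dtype=tensorflow.float32, shape=(2, 3))

# Hypothetical input data.
numpy_input = numpy.array([[5.0, 2.0, 13.0],
                           [7.0, 9.0, 0.0]])

with tensorflow.Session() as sess:
    # feed_dict supplies the placeholder's value for this particular run.
    result = sess.run(training_inputs, feed_dict={training_inputs: numpy_input})

print(result)
```

The fed values are converted to the placeholder's declared data type, so the result comes back as float32.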
But assume there is a feature vector of 50 features and a dataset of 100 samples, and that we want to train a model twice with different numbers of samples, say 30 and 40. Here the training set has one fixed dimension (number of columns = number of features) and one variable dimension (number of rows = number of training samples). If we set the number of rows to 30, the input is restricted to shape (30, 50) and we won't be able to re-train the model with 40 samples. The same holds if we use 40 as the number of rows. The solution is to fix only the number of columns and leave the number of rows unspecified by setting it to None as follows:
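This can be sketched as below, assuming the TensorFlow 1.x API via tensorflow.compat.v1. The random dataset is a hypothetical stand-in for real data with 100 samples and 50 features.

```python
import numpy
import tensorflow.compat.v1 as tensorflow

# Assumption: graph mode is needed for placeholders and sessions under TensorFlow 2.x.
tensorflow.disable_eager_execution()

# Columns are fixed at 50 features; None leaves the number of rows open.
training_inputs = tensorflow.placeholder(dtype=tensorflow.float32, shape=(None, 50))

# Hypothetical dataset: 100 samples, 50 features each.
dataset = numpy.random.rand(100, 50)

with tensorflow.Session() as sess:
    # The same placeholder accepts 30 rows in one run and 40 rows in the next.
    first = sess.run(training_inputs, feed_dict={training_inputs: dataset[:30]})
    second = sess.run(training_inputs, feed_dict={training_inputs: dataset[:40]})

print(first.shape, second.shape)
```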
One benefit of using a placeholder is that its value is easy to change. You don't have to modify the program in order to use different inputs. It behaves like a variable in Java, C++, or Python, but it is distinct from a TensorFlow variable (tensorflow.Variable). We can run the session multiple times with different values for the placeholder:
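As a rough illustration, the sketch below runs one session several times and feeds a different value on each run. It assumes the TensorFlow 1.x API via tensorflow.compat.v1, and the input values are hypothetical.

```python
import numpy
import tensorflow.compat.v1 as tensorflow

# Assumption: graph mode is needed for placeholders and sessions under TensorFlow 2.x.
tensorflow.disable_eager_execution()

training_inputs = tensorflow.placeholder(dtype=tensorflow.float32, shape=(None, 3))

# Hypothetical inputs; each one is fed in a separate run of the same graph.
candidate_inputs = [
    numpy.array([[5.0, 2.0, 13.0]]),
    numpy.array([[7.0, 9.0, 0.0], [1.0, 1.0, 1.0]]),
    numpy.array([[0.0, 0.0, 255.0]]),
]

results = []
with tensorflow.Session() as sess:
    for value in candidate_inputs:
        # Only the fed value changes between runs; the placeholder stays the same.
        results.append(sess.run(training_inputs, feed_dict={training_inputs: value}))

for result in results:
    print(result)
```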
Doing the same with NumPy arrays alone would mean creating a new Python array in the code for each new input we want to run the program with.
This is why we use placeholders for feeding the data. Every input should have its own placeholder. In our neural network there are two inputs, the training inputs and the training outputs, so there should be two placeholders, one for each, as in lines 4 and 5 of CodeSample1.
Note that the number of rows of these placeholders is left unspecified, so that a variable number of training samples can be used with the code unchanged. However, the data fed to the input and output placeholders must have the same number of rows. For example, with four training samples and three RGB inputs per sample, training_inputs should have shape=(4, 3) and training_outputs should have shape=(4, 1).
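The pairing of the two placeholders can be sketched as follows, assuming the TensorFlow 1.x API via tensorflow.compat.v1. The sketch also assumes three input columns (one per RGB channel) and four hypothetical color samples labeled 1 for red and 0 for blue. The data names and values are illustrative and are not taken from CodeSample1.

```python
import numpy
import tensorflow.compat.v1 as tensorflow

# Assumption: graph mode is needed for placeholders and sessions under TensorFlow 2.x.
tensorflow.disable_eager_execution()

# Rows are left as None so any number of training samples can be fed.
training_inputs = tensorflow.placeholder(dtype=tensorflow.float32, shape=(None, 3))
training_outputs = tensorflow.placeholder(dtype=tensorflow.float32, shape=(None, 1))

# Hypothetical training data: four RGB colors and their labels (1 = red, 0 = blue).
inputs_data = numpy.array([[255, 0, 0],
                           [248, 80, 68],
                           [0, 0, 255],
                           [67, 15, 210]], dtype=numpy.float32)
outputs_data = numpy.array([[1], [1], [0], [0]], dtype=numpy.float32)

# Every input sample needs exactly one desired output.
assert inputs_data.shape[0] == outputs_data.shape[0]

with tensorflow.Session() as sess:
    fed_inputs, fed_outputs = sess.run(
        [training_inputs, training_outputs],
        feed_dict={training_inputs: inputs_data, training_outputs: outputs_data})

print(fed_inputs.shape, fed_outputs.shape)
```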