Introduction to PyTorch for Deep Learning
In this tutorial, you’ll get an introduction to deep learning using the PyTorch framework, and by its conclusion, you’ll be comfortable applying it to your deep learning models.
In this tutorial, you’ll get an introduction to deep learning using the PyTorch framework, and by its conclusion, you’ll be comfortable applying it to your deep learning models. Facebook launched PyTorch 1.0 early this year with integrations for Google Cloud, AWS, and Azure Machine Learning. In this tutorial, I assume that you’re already familiar with Scikit-learn, Pandas, NumPy, and SciPy. These packages are important prerequisites for this tutorial.
Plan of Attack
- What is deep learning?
- Introduction to PyTorch
- Why you’d prefer PyTorch to other Python Deep Learning Libraries
- PyTorch Tensors
- PyTorch Autograd
- PyTorch nn Module
- PyTorch optim Package
- Custom nn Modules in PyTorch
- Putting it all Together and Further Reading
What is Deep Learning?
Deep learning is a subfield of machine learning with algorithms inspired by the working of the human brain. These algorithms are referred to as artificial neural networks. Examples of these neural networks include Convolutional Neural Networks that are used for image classification, Artificial Neural Networks and Recurrent Neural Networks.
Introduction to PyTorch
- Tensor computation (like NumPy) with strong GPU acceleration
- Automatic differentiation for building and training neural networks
Why you might prefer PyTorch to other Python deep learning libraries
There are a few reason you might prefer PyTorch to other deep learning libraries:
- Unlike other libraries like TensorFlow where you have to first define an entire computational graph before you can run your model, PyTorch allows you to define your graph dynamically.
- PyTorch is also great for deep learning research and provides maximum flexibility and speed.
PyTorch Tensors are very similar to NumPy arrays with the addition that they can run on the GPU. This is important because it helps accelerate numerical computations, which can increase the speed of neural networks by 50 times or greater. In order to use PyTorch, you’ll need to head over tohttps://PyTorch.org/ and install PyTorch. If you’re using Conda, you can install PyTorch by running this simple command:
In order to define a PyTorch tensor, start by importing the
torch package. PyTorch allows you to define two types of tensors — a CPU and GPU tensor. For this tutorial, I’ll assume you’re running a CPU machine, but I’ll also show you how to define tensors in a GPU:
The default tensor type in PyTorch is a float tensor defined as
torch.FloatTensor. As an example, you’ll create a tensor from a Python list:
If you’re using a GPU-enabled machine, you’ll define the tensor as shown below:
You can also perform mathematical computations such as addition and subtraction using PyTorch tensors:
You can also define matrices and perform matrix operations. Let’s see how you’d define a matrix and transpose it:
PyTorch uses a technique called automatic differentiation that numerically evaluates the derivative of a function. Automatic differentiation computes backward passes in neural networks. In training neural networks weights are randomly initialized to numbers that are near zero but not zero. Backward pass is the process by which these weights are adjusted from right to left, and a forward pass is the inverse (left to right).
torch.autograd is the library that supports automatic differentiation in PyTorch. The central class of this package is
torch.Tensor. To track all operations on it, set
True. To compute all gradients, call
.backward(). The gradient for this tensor will be accumulated in the
If you want to detach a tensor from computation history, call the
.detach()function. This will also prevent future computations on the tensor from being tracked. Another way to prevent history tracking is by wrapping your code with
Function classes are interconnected to build an acyclic graph that encodes a complete history of the computation. The
.grad_fnattribute of the tensor references the
Function that created the tensor. To compute derivatives, call
.backward() on a
Tensor. If the
Tensor contains one element, you don’t have to specify any parameters for the
backward()function. If the
Tensor contains more than one element, specify a gradient that’s a tensor of matching shape.
As an example, you’ll create two tensors, one with
Trueand the other as
False. You’ll then use these two tensors to perform addition and sum operations. Thereafter, you’ll compute the gradient of one of the tensors.
b will return nothing since you didn’t set
True on it.
PyTorch nn Module
This is the module for building neural networks in PyTorch.
nn depends on
autograd to define models and differentiate them. Let’s start by defining the procedure for training a neural network:
- Define the neural network with some learnable parameters, referred to as weights.
- Iterate over a dataset of inputs.
- Process input through the network.
- Compare predicted results to actual values and measure the error.
- Propagate gradients back into the network’s parameters.
- Update the weights of the network using a simple update rule:
weight = weight — learning_rate * gradient
You’ll now use the
nn package to create a two-layer neural network:
Let’s explain some of the parameters used above:
Nis batch size. Batch size is the number of observations after which the weights will be updated.
D_inis the input dimension
His the hidden dimension
D_outis the output dimension
torch.randndefines a matrix of the specified dimensions
torch.nn.Sequentialinitializes a linear stack of layers
torch.nn.Linearapplies a linear transformation to the incoming data
torch.nn.ReLUapplies the rectified linear unit function element-wise
torch.nn.MSELosscreates a criterion that measures the mean squared error between n elements in the input x and target y
PyTorch optim Package
Next you’ll use the
optim package to define an optimizer that will update the weights for you. The
optim package abstracts the idea of an optimization algorithm and provides implementations of commonly used optimization algorithms such as AdaGrad, RMSProp and Adam. We’ll use the Adam optimizer, which is one of the more popular optimizers.
The first argument this optimizer takes is the tensors, which it should update. In the forward pass you’ll compute the predicted y by passing x to the model. After that, compute and print the loss. Before running the backward pass, zero all the gradients for the variables that will be updated using the optimizer. This is done because, by default, gradients are not overwritten when
.backward() is called. Thereafter, call the step function on the optimizer, and this updates its parameters. How you’d implement this is shown below,
Custom nn Modules in PyTorch
Sometimes you’ll need to build your own custom modules. In these cases you’ll subclass the
nn.Module. You’ll then need to define a
forward that will receive input tensors and produce output tensors. How to implement a two-layer network using
nn.Module is shown below. The model is very similar to the one above, but the difference is you’ll use
torch.nn.Module to create the neural network. The other difference is the use of
stochastic gradient descent optimizer instead of Adam. You can implement a custom nn module as shown below:
Putting it all Together and Further Reading
PyTorch allows you to implement different types of layers such as convolutional layers, recurrent layers, and linear layers, among others. You can learn more about PyTorch from its official documentation.
Bio: Derrick Mwiti is a data analyst, a writer, and a mentor. He is driven by delivering great results in every task, and is a mentor at Lapid Leaders Africa.
Original. Reposted with permission.
Top Stories Past 30 Days