# Introduction to PyTorch for Deep Learning

In this tutorial, you’ll get an introduction to deep learning using the PyTorch framework, and by its conclusion, you’ll be comfortable applying it to your deep learning models.

**By Derrick Mwiti, Data Analyst**

Facebook launched PyTorch 1.0 early this year with integrations for Google Cloud, AWS, and Azure Machine Learning. This tutorial assumes that you’re already familiar with Scikit-learn, Pandas, NumPy, and SciPy; these packages are important prerequisites.

### Plan of Attack

- What is deep learning?
- Introduction to PyTorch
- Why you might prefer PyTorch to other Python deep learning libraries
- PyTorch Tensors
- PyTorch Autograd
- PyTorch nn Module
- PyTorch optim Package
- Custom nn Modules in PyTorch
- Putting it all Together and Further Reading

### What is Deep Learning?

Deep learning is a subfield of machine learning whose algorithms are inspired by the workings of the human brain. These algorithms are referred to as artificial neural networks. Examples include feed-forward networks, convolutional neural networks (commonly used for image classification), and recurrent neural networks.

### Introduction to PyTorch

PyTorch is a Python machine learning package based on Torch, which is an open-source machine learning package based on the programming language Lua. PyTorch has two main features:

- Tensor computation (like NumPy) with strong GPU acceleration
- Automatic differentiation for building and training neural networks

### Why you might prefer PyTorch to other Python deep learning libraries

There are a few reasons you might prefer PyTorch to other deep learning libraries:

- Unlike libraries such as TensorFlow (in its original graph mode), where you first have to define an entire computational graph before you can run your model, PyTorch allows you to define your graph dynamically.
- PyTorch is also well suited to deep learning research and is designed for flexibility and speed.
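
To illustrate the first point, here is a minimal sketch of a dynamically defined graph. The function name `repeat_scale` and its loop condition are hypothetical, chosen only for illustration: ordinary Python control flow decides how many operations are recorded, and autograd differentiates through whichever path was actually taken.

```python
import torch

def repeat_scale(x):
    # Hypothetical example: keep doubling until the norm reaches 10.
    # The number of iterations depends on the data, so the graph is
    # built on the fly during this call.
    y = x
    steps = 0
    while y.norm() < 10:
        y = y * 2
        steps += 1
    return y, steps

x = torch.tensor([1.0, 1.0], requires_grad=True)
y, steps = repeat_scale(x)

# y is 2**steps * x, so the gradient of y.sum() w.r.t. x is 2**steps.
y.sum().backward()
print(steps)    # 3
print(x.grad)   # tensor([8., 8.])
```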

### PyTorch Tensors

**PyTorch Tensors** are very similar to NumPy arrays, with the addition that they can run on the GPU. This is important because it helps accelerate numerical computations, which can speed up neural networks by 50 times or more. To use PyTorch, head over to https://PyTorch.org/ and install it. If you’re using Conda, you can install PyTorch with a command such as `conda install pytorch torchvision -c pytorch`; the selector on the site gives the exact command for your platform.

In order to define a PyTorch tensor, start by importing the `torch` package. PyTorch allows you to define two types of tensors: a CPU tensor and a GPU tensor. For this tutorial, I’ll assume you’re running a CPU machine, but I’ll also show you how to define tensors on a GPU:
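
A minimal sketch of that first step: import the package and check whether a CUDA-capable GPU is visible. The variable name `device` is only a convention assumed here.

```python
import torch

# Report the installed version and whether a CUDA-capable GPU is visible.
print(torch.__version__)
print(torch.cuda.is_available())

# Fall back to the CPU when no GPU is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
```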

The default tensor type in PyTorch is a float tensor defined as `torch.FloatTensor`. As an example, you’ll create a tensor from a Python list:
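
For instance, a sketch along these lines builds a float tensor from a list (the values and the name `cpu_tensor` are arbitrary):

```python
import torch

# torch.tensor infers a 32-bit float tensor from Python floats.
# torch.FloatTensor([1.0, 2.0, 3.0]) is a type-specific constructor
# that yields the same values.
cpu_tensor = torch.tensor([1.0, 2.0, 3.0])

print(cpu_tensor)          # tensor([1., 2., 3.])
print(cpu_tensor.dtype)    # torch.float32
print(cpu_tensor.type())   # torch.FloatTensor
```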

If you’re using a GPU-enabled machine, you’ll define the tensor as shown below:
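
One way to do this is sketched here. The CPU fallback is an addition for illustration, so that the snippet also runs on machines without a GPU.

```python
import torch

# Pick the GPU when one is available; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create the tensor directly on the chosen device.
gpu_tensor = torch.tensor([1.0, 2.0, 3.0], device=device)
print(gpu_tensor.device)

# An existing CPU tensor can be moved with .to(device).
moved = torch.tensor([4.0, 5.0]).to(device)
print(moved.device)
```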

You can also perform mathematical computations such as addition and subtraction using PyTorch tensors:
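
A short sketch with two arbitrary tensors, `a` and `b`:

```python
import torch

a = torch.tensor([5.0, 3.0])
b = torch.tensor([1.0, 2.0])

# Element-wise arithmetic with the usual Python operators.
total = a + b         # same as torch.add(a, b)
difference = a - b    # same as torch.sub(a, b)

print(total)          # tensor([6., 5.])
print(difference)     # tensor([4., 1.])
```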

You can also define matrices and perform matrix operations. Let’s see how you’d define a matrix and transpose it:
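
As a sketch, with an arbitrary 2x3 matrix:

```python
import torch

# A 2x3 matrix built from nested Python lists.
matrix = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])
print(matrix.shape)        # torch.Size([2, 3])

# .t() transposes a 2-D tensor; the result here is 3x2.
transposed = matrix.t()
print(transposed.shape)    # torch.Size([3, 2])

# Multiplying the 2x3 matrix by its 3x2 transpose gives a 2x2 result.
product = matrix @ transposed
print(product)             # tensor([[14., 32.], [32., 77.]])
```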

### PyTorch Autograd

PyTorch uses a technique called **automatic differentiation**, which evaluates the derivative of a function by recording the operations performed and applying the chain rule to them. Automatic differentiation is what computes the backward pass in neural networks. When training a neural network, the weights are randomly initialized to numbers that are near zero but not zero. The **forward pass** computes the output from the input (left to right); the **backward pass** propagates gradients from the output back toward the input (right to left), and those gradients are used to adjust the weights.

`torch.autograd` is the library that supports automatic differentiation in PyTorch. The central class of this package is `torch.Tensor`. To track all operations on it, set `.requires_grad` to `True`. To compute all gradients, call `.backward()`. The gradient for this tensor will be accumulated in the `.grad` attribute.

If you want to detach a tensor from computation history, call the `.detach()` function. This will also prevent future computations on the tensor from being tracked. Another way to prevent history tracking is by wrapping your code in a `with torch.no_grad():` block.
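
Both approaches can be sketched as follows (the tensor values are arbitrary):

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# .detach() returns a tensor that shares data with x but is cut off
# from the computation history.
detached = x.detach()
print(detached.requires_grad)   # False

# Operations inside torch.no_grad() are not tracked, even though x
# itself requires gradients.
with torch.no_grad():
    y = x * 2
print(y.requires_grad)          # False

# Outside the block, tracking resumes.
z = x * 2
print(z.requires_grad)          # True
```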

The `Tensor` and `Function` classes are interconnected to build an acyclic graph that encodes a complete history of the computation. The `.grad_fn` attribute of the tensor references the `Function` that created the tensor. To compute derivatives, call `.backward()` on a `Tensor`. If the `Tensor` contains one element, you don’t have to specify any parameters for the `backward()` function. If the `Tensor` contains more than one element, specify a gradient that’s a tensor of matching shape.
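
A small sketch of both ideas, using an arbitrary three-element tensor: inspecting `.grad_fn`, and passing a gradient of matching shape to `backward()` when the output is not a scalar.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * x

# y was produced by an operation, so it carries a grad_fn; x was created
# directly by the user, so its grad_fn is None.
print(y.grad_fn is not None)   # True
print(x.grad_fn)               # None

# y has three elements, so backward() needs a gradient of the same shape.
y.backward(gradient=torch.ones_like(y))
print(x.grad)                  # tensor([2., 4., 6.]), i.e. dy/dx = 2x
```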

As an example, you’ll create two tensors, one with `requires_grad` as `True` and the other as `False`. You’ll then use these two tensors to perform addition and sum operations. Thereafter, you’ll compute the gradient of one of the tensors.
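
A sketch of that example follows. The values and the factor of 8 are assumptions chosen for illustration; the names `a` and `b` match the discussion that follows.

```python
import torch

a = torch.tensor([3.0, 2.0], requires_grad=True)
b = torch.tensor([4.0, 7.0])      # requires_grad defaults to False

ab_sum = a + b                    # addition
ab_res = (ab_sum * 8).sum()       # scale, then reduce to a single element
ab_res.backward()                 # one-element output, so no arguments needed

print(ab_res.item())              # 128.0
print(a.grad)                     # tensor([8., 8.])
print(b.grad)                     # None
```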

Calling `.grad` on `b` will return `None` since you didn’t set `requires_grad` to `True` on it.

### PyTorch nn Module

This is the module for building neural networks in PyTorch. `nn` depends on `autograd` to define models and differentiate them. Let’s start by defining the procedure for **training a neural network**:

- Define the neural network with some learnable parameters, referred to as weights.
- Iterate over a dataset of inputs.
- Process input through the network.
- Compare predicted results to actual values and measure the error.
- Propagate gradients back into the network’s parameters.
- Update the weights of the network using a simple update rule:

`weight = weight - learning_rate * gradient`
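
The steps above can be sketched on a toy problem. Everything in this example is an assumption made for illustration (a single weight, the target function y = 3x, the learning rate, and the iteration count); the point is only to show the update rule applied by hand.

```python
import torch

# Toy data: the target relationship is y = 3 * x.
inputs = torch.tensor([1.0, 2.0, 3.0, 4.0])
targets = 3.0 * inputs

# Step 1: one learnable parameter (the weight).
weight = torch.tensor(0.0, requires_grad=True)
learning_rate = 0.01

for _ in range(200):                                # step 2: iterate over the data
    predictions = weight * inputs                   # step 3: forward through the "network"
    loss = ((predictions - targets) ** 2).mean()    # step 4: measure the error
    loss.backward()                                 # step 5: propagate gradients back
    with torch.no_grad():                           # step 6: apply the update rule untracked
        weight -= learning_rate * weight.grad
    weight.grad.zero_()                             # clear the accumulated gradient

print(weight.item())   # close to 3.0
```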

You’ll now use the `nn` package to create a two-layer neural network:
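
A sketch of such a network, using the names explained below; the concrete sizes are assumptions chosen for illustration.

```python
import torch

# Batch size, input dimension, hidden dimension, output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Random input and target data.
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Two linear layers with a ReLU in between.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# Mean squared error between predictions and targets.
loss_fn = torch.nn.MSELoss()

y_pred = model(x)
loss = loss_fn(y_pred, y)
print(y_pred.shape)   # torch.Size([64, 10])
print(loss.item())
```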

Let’s explain some of the parameters used above:

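- `N` is the batch size, i.e. the number of observations after which the weights are updated.
- `D_in` is the input dimension.
- `H` is the hidden dimension.
- `D_out` is the output dimension.
- `torch.randn` creates a tensor of the specified dimensions filled with random numbers drawn from a standard normal distribution.
- `torch.nn.Sequential` initializes a linear stack of layers.
- `torch.nn.Linear` applies a linear transformation to the incoming data.
- `torch.nn.ReLU` applies the rectified linear unit function element-wise.
- `torch.nn.MSELoss` measures the mean squared error between the predictions and the targets.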

### PyTorch optim Package

Next you’ll use the `optim` package to **define an optimizer** that will update the weights for you. The `optim` package abstracts the idea of an optimization algorithm and provides implementations of commonly used optimization algorithms such as AdaGrad, RMSProp and Adam. We’ll use the Adam optimizer, which is one of the more popular optimizers.

The first argument this optimizer takes is the set of tensors it should update. In the forward pass, you’ll compute the predicted y by passing x to the model. After that, compute and print the loss. Before running the backward pass, zero all the gradients for the variables that the optimizer will update. This is done because, by default, gradients are accumulated rather than overwritten when `.backward()` is called. Then run the backward pass and call the `step` function on the optimizer, which updates the parameters. How you’d implement this is shown below:
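
A minimal sketch of that loop, assuming the same two-layer model as before. The learning rate, the number of iterations, and the fixed seed are assumptions for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so repeated runs behave the same

N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss()

# The first argument tells the optimizer which tensors to update.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

losses = []
for t in range(100):
    y_pred = model(x)            # forward pass
    loss = loss_fn(y_pred, y)    # compute the loss
    losses.append(loss.item())
    if t % 20 == 0:
        print(t, loss.item())

    optimizer.zero_grad()        # clear gradients left over from the last step
    loss.backward()              # backward pass
    optimizer.step()             # update the parameters
```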

### Custom nn Modules in PyTorch

Sometimes you’ll need to build your own custom modules. In these cases you’ll subclass `nn.Module` and define a `forward` method that receives input tensors and produces output tensors. The model is very similar to the one above, but you’ll use `torch.nn.Module` to create the neural network, and a stochastic gradient descent optimizer instead of Adam. You can implement a custom nn module as shown below:
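
The following is a sketch under stated assumptions: the class name `TwoLayerNet`, the layer sizes, the learning rate, and the iteration count are all chosen for illustration.

```python
import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super().__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        # clamp(min=0) has the same effect as applying ReLU.
        h_relu = self.linear1(x).clamp(min=0)
        return self.linear2(h_relu)

torch.manual_seed(0)  # fixed seed so repeated runs behave the same

N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = TwoLayerNet(D_in, H, D_out)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

losses = []
for t in range(100):
    y_pred = model(x)            # calls TwoLayerNet.forward
    loss = loss_fn(y_pred, y)
    losses.append(loss.item())

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(losses[0], losses[-1])
```

Defining `forward` is enough; calling `model(x)` invokes it, and autograd derives the backward pass from the operations it records.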

### Putting it all Together and Further Reading

PyTorch allows you to implement different types of layers such as convolutional layers, recurrent layers, and linear layers, among others. You can learn more about PyTorch from its official documentation.

**Bio: Derrick Mwiti** is a data analyst, a writer, and a mentor. He is driven by delivering great results in every task, and is a mentor at Lapid Leaders Africa.

Original. Reposted with permission.
