Simple Derivatives with PyTorch
PyTorch includes an automatic differentiation package, autograd, which does the heavy lifting for finding derivatives. This post explores simple derivatives using autograd, outside of neural networks.
Derivatives are simple with PyTorch. Like many other neural network libraries, PyTorch includes an automatic differentiation package, autograd, which does the heavy lifting. But derivatives seem especially simple with PyTorch.
One of the things I wish I had when first learning how derivatives and practical implementations of neural networks fit together was a set of concrete examples of using such neural network packages to find simple derivatives and perform calculations on them, separate from computation graphs in neural networks. PyTorch's architecture makes such pedagogical examples easy.
I can't really tell if this will be useful for people who want to see how PyTorch implements automatic differentiation, how to practically compute derivatives, or even what "finding the derivative" means, but let's give it a go anyways. There's a chance it isn't useful for any of these. :)
First we will need a function for which to find the derivative. Arbitrarily, let's use this:
We would do well to recall here that the derivative of a function can be interpreted as the slope of a tangent to the curve represented by our function, as well as the function's rate of change.
Before we use PyTorch to find the derivative of this function, let's work it out first by hand:
The above is the first order derivative of our original function.
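The formulas themselves do not appear in this text. As a worked sketch only, assume the function is the polynomial below (an assumption, picked because its derivative comes out to 233 at x = 2, the value this post arrives at). Applying the power rule term by term gives:

```latex
\begin{aligned}
y &= 5x^4 + 3x^3 + 7x^2 + 9x - 5 \\
\frac{dy}{dx} &= 20x^3 + 9x^2 + 14x + 9
\end{aligned}
```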
Now let's find the value of our derivative function for a given value of x. Let's arbitrarily use 2:
Evaluating our derivative function at x = 2 gives us 233. This can be interpreted as follows: the rate of change of y with respect to x in our formula is 233 when x = 2.
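As a sanity check on the hand calculation, the sketch below evaluates a derivative at x = 2 and compares it with a central finite difference, which approximates the slope of the tangent numerically. The polynomial f and its derivative df are assumptions (one function consistent with the 233 result), and central_difference is a hypothetical helper written for this illustration.

```python
def f(x):
    # Assumed polynomial; not taken from the original post.
    return 5 * x**4 + 3 * x**3 + 7 * x**2 + 9 * x - 5


def df(x):
    # Derivative of the assumed polynomial, via the power rule.
    return 20 * x**3 + 9 * x**2 + 14 * x + 9


def central_difference(func, x, h=1e-5):
    # Numerical slope estimate: (f(x + h) - f(x - h)) / (2h).
    return (func(x + h) - func(x - h)) / (2 * h)


# 20*8 + 9*4 + 14*2 + 9 = 160 + 36 + 28 + 9 = 233
exact = df(2)
approx = central_difference(f, 2.0)

print(exact)   # 233
print(approx)  # very close to 233, up to floating-point error
```

The finite difference is only an approximation, but it shows the "rate of change" reading of the derivative directly: nudging x slightly around 2 changes y about 233 times as much.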
Using autograd to Find and Solve a Derivative
How can we do the same as above with PyTorch's autograd package?
First, it should be obvious that we have to represent our original function in Python as such:
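The original listing does not appear in this text, so here is a minimal sketch of what such code could look like. The polynomial in get_function is an assumption (it is one function whose derivative at x = 2 is 233), the name get_function is hypothetical, and torch.tensor(..., requires_grad=True) stands in for the older Variable wrapper, which was merged into Tensor in PyTorch 0.4.

```python
import torch


def get_function(x):
    # Assumed polynomial; chosen so that dy/dx at x = 2 equals 233.
    return 5 * x**4 + 3 * x**3 + 7 * x**2 + 9 * x - 5


# The point at which to evaluate the derivative; requires_grad=True
# tells autograd to track operations performed on x.
x = torch.tensor(2.0, requires_grad=True)

# Apply the function to x, building the computation graph.
y = get_function(x)

# Backpropagate from y to x using the chain rule.
y.backward()

# The derivative dy/dx at x = 2 is stored in x.grad.
print(x.grad)  # tensor(233.)
```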
Line by line, the above code:
- imports the torch library
- defines the function we want to compute the derivative of
- defines the value (2) we want to compute the derivative with regard to as a PyTorch Variable object and specifies that it should be instantiated in such a way that it tracks where in the computation graph it connects to in order to perform differentiation by the chain rule (requires_grad)
- calls backward() to compute the sum of gradients, using the chain rule
- outputs the value stored in the x tensor's grad attribute
This value, 233, matches what we calculated by hand, above.
To take the next steps with using PyTorch, including using the autograd package and Tensor objects to build some basic neural networks, I suggest the official PyTorch 60 Minute Blitz or this tutorial.