Learning to Code Neural Networks
Learn how to code a neural network, by taking advantage of someone else's experiences learning how to code a neural network.
Step 3: Understanding backpropagation
How a neural network works from input to output isn’t that difficult to understand, at least conceptually.
More difficult, though, is understanding how the neural network actually learns from looking at a set of data samples.
The concept is called backpropagation.
This essentially means that you look at how wrong the network’s guess was, and then adjust the network’s weights accordingly.
The weights were the blue numbers on our neuron at the beginning of the article.
This process happens backwards because you start at the end of the network (observing how wrong the network’s ‘guess’ is) and then move backwards through the network, adjusting the weights along the way, until you finally reach the inputs.
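To make that a bit more concrete, here is a minimal sketch of a single backward step for one sigmoid neuron. The inputs, weights, target and learning rate are numbers I made up for illustration, and the squared-error measure is just one common choice, so treat it as a toy rather than the way any particular tutorial does it.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values, chosen only for illustration.
inputs = [0.5, 0.1]
weights = [0.4, -0.2]
target = 1.0
learning_rate = 0.5

def forward(current_weights):
    # Weighted sum of the inputs, squashed by the sigmoid.
    z = sum(w * x for w, x in zip(current_weights, inputs))
    return sigmoid(z)

def error(output):
    # Squared error: how wrong the guess was.
    return 0.5 * (target - output) ** 2

# Forward pass: make a guess and measure how wrong it is.
output = forward(weights)
error_before = error(output)

# Backward pass: chain rule from the error back to each weight.
d_error_d_output = output - target
d_output_d_z = output * (1.0 - output)  # derivative of the sigmoid
gradients = [d_error_d_output * d_output_d_z * x for x in inputs]

# Nudge every weight a little in the direction that reduces the error.
new_weights = [w - learning_rate * g for w, g in zip(weights, gradients)]
error_after = error(forward(new_weights))

print(error_before, error_after)
```

With a small enough learning rate, the second number printed should be lower than the first, which is the whole point of the adjustment.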
Calculating this by hand requires some calculus, as it involves taking derivatives of the network’s error with respect to its weights. The Khan Academy calculus courses seem like a good place to start, though I haven’t used them myself, as I took calculus at university.
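If you do work out a derivative by hand, one way to check it is to compare it with a numerical estimate: nudge the weight a tiny bit in both directions and see how much the error changes. The sketch below does this for a single weight on a sigmoid neuron. The function names and the numbers are my own assumptions for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, target):
    # Squared error of a one-weight sigmoid neuron.
    return 0.5 * (sigmoid(w * x) - target) ** 2

def analytic_gradient(w, x, target):
    # Derivative worked out by hand with the chain rule:
    # d(loss)/d(out) * d(out)/d(z) * d(z)/d(w)
    out = sigmoid(w * x)
    return (out - target) * out * (1.0 - out) * x

def numeric_gradient(w, x, target, eps=1e-6):
    # Central difference: nudge w both ways and measure the slope.
    return (loss(w + eps, x, target) - loss(w - eps, x, target)) / (2 * eps)

# Hypothetical values for illustration.
w, x, target = 0.3, 2.0, 1.0
by_hand = analytic_gradient(w, x, target)
by_nudge = numeric_gradient(w, x, target)

print(by_hand, by_nudge)
```

If the two numbers agree to several decimal places, the hand-derived formula is most likely right.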
Note: there are a lot of libraries that calculate the derivatives for you, so if you’d like to start coding neural networks before completely understanding the math, you can do that as well.
Screenshot from Matt Mazur's tutorial on backpropagation.
The three best sources I found for understanding backpropagation are these:
- A Step by Step Backpropagation Example - by Matt Mazur
- Hacker’s Guide to Neural Networks - by Andrej Karpathy
- Neural Networks And Deep Learning - by Michael Nielsen
You should definitely code along while you’re reading the articles, especially the first two. It’ll give you some sample code to look back at when you’re confused in the future.
Plus, I can’t really emphasize this enough:
You don’t learn much by just reading about neural nets. You need to practice to make the knowledge stick.
The third article is also fantastic, but I’ve used it more as a wiki than as a plain tutorial, as it’s actually an entire book. It contains thorough explanations of all the important concepts in neural networks.
These articles will also help you understand important concepts such as cost functions and gradient descent, which play equally important roles in neural networks.
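As a rough illustration of how those two ideas fit together, here is a small sketch that fits a single weight to some toy data. The cost function is the mean squared error, and gradient descent repeatedly moves the weight against the slope of that cost. The data, the starting weight and the learning rate are all values I picked for the example.

```python
# Toy data, assumed for illustration: generated by y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def cost(w):
    # Mean squared error of the prediction w * x.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cost_gradient(w):
    # Derivative of the cost with respect to w.
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0               # arbitrary starting guess
learning_rate = 0.05  # small enough for this data; too large would diverge
history = []

for step in range(100):
    history.append(cost(w))
    # Step downhill: move w against the slope of the cost.
    w -= learning_rate * cost_gradient(w)

print(w, history[0], history[-1])
```

The cost tells you how wrong the current weight is, and the gradient tells you which way to move it. A neural network does the same thing, just with many weights at once.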
Step 4: Coding your own neural networks
In some articles and tutorials you’ll actually end up coding small neural networks. As soon as you’re comfortable with that, I recommend you go all in on this strategy. It’s both fun and an extremely effective way of learning.
Screenshot from the IAmTrask tutorial.
After you’ve coded along with this example, you should do as the article states at the bottom, which is to implement it once again without looking at the tutorial. This forces you to really understand the concepts, and will likely reveal holes in your knowledge, which isn’t fun. However, when you finally manage it, you’ll feel like you’ve just acquired a new superpower.
A little side note: when doing exercises I was often confused by the vectorized implementations some tutorials use, as they require a little bit of linear algebra to understand. Once again, I turned back to the Coursera ML course, as Week 1 contains a full section of linear algebra review. This helps you understand how matrices and vectors are multiplied in the networks.
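To show what ‘vectorized’ means here, the sketch below computes the weighted sums for a tiny layer of two neurons twice: once with plain loops, and once as a single matrix-vector multiplication in NumPy. The weights and inputs are made-up numbers, and I’m assuming NumPy is installed.

```python
import numpy as np

# Hypothetical layer: 2 neurons, each with 3 input weights.
inputs = np.array([0.5, 0.1, 0.9])
weights = np.array([[0.2, -0.4, 0.1],
                    [0.7, 0.3, -0.6]])

# Loop version: one weighted sum per neuron.
loop_result = []
for neuron_weights in weights:
    total = 0.0
    for w, x in zip(neuron_weights, inputs):
        total += w * x
    loop_result.append(total)

# Vectorized version: the same sums as one matrix-vector product.
vectorized_result = weights @ inputs

print(loop_result)
print(vectorized_result)
```

Each row of the matrix holds one neuron’s weights, so multiplying the matrix by the input vector gives all the weighted sums in one go.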
Screenshot from the WildML tutorial.
At this point, you could either try and code your own neural network from scratch or start playing around with some of the networks you have coded up already. It’s great fun to find a dataset that interests you and try to make some predictions with your neural nets.
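If you go the from-scratch route, a first attempt might look something like the sketch below: a tiny network with one hidden layer, trained with plain gradient descent. This is my own minimal version and not the code from any of the tutorials mentioned. The XOR toy data, the layer size, the random seed and the learning rate are all assumptions made for the example, and it needs NumPy.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)  # fixed seed so runs are repeatable

# Toy XOR data, assumed for illustration.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer with 4 neurons, then a single output neuron.
W1 = rng.normal(size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))
b2 = np.zeros((1, 1))
learning_rate = 1.0
losses = []

for step in range(5000):
    # Forward pass: inputs -> hidden layer -> output.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)
    losses.append(float(np.mean((output - y) ** 2)))

    # Backward pass: gradient of 0.5 * sum of squared errors,
    # pushed back through each sigmoid with the chain rule.
    d_output = (output - y) * output * (1 - output)
    d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)

    # Adjust the weights, starting from the end of the network.
    W2 -= learning_rate * hidden.T @ d_output
    b2 -= learning_rate * d_output.sum(axis=0, keepdims=True)
    W1 -= learning_rate * X.T @ d_hidden
    b1 -= learning_rate * d_hidden.sum(axis=0, keepdims=True)

print(losses[0], losses[-1])
print(output.round(2))
```

The loss should go down over the run. How close the final outputs get to 0, 1, 1, 0 depends on the random starting weights, so try a few seeds and learning rates to get a feel for it.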
To get hold of a dataset, just visit my side project Datasets.co (← shameless self-promotion) and find one you like.
Anyway, the point is that you’re now better off experimenting with stuff that interests you rather than following my advice.
Personally, I’m currently learning how to use Python libraries that make it easier to code up neural networks, like Theano, Lasagne and nolearn. I’m using these to do challenges on Kaggle, which is both great fun and great learning.
Bio: Per Harald Borgen is a developer at Xeneta. He mostly writes about learning new stuff.
Original. Reposted with permission.