Inside Deep Learning: Computer Vision With Convolutional Neural Networks
Deep Learning-powered image recognition is now performing better than human vision on many tasks. We examine how human and computer vision extract features from raw pixels, and explain why deep convolutional neural networks work so well.
This is often achieved by a technique known as max pooling. In max pooling, we divide the feature map into disjoint tiles and take the maximum activity of all the neurons in each tile. Max pooling the feature map makes the detection process cleaner by removing uninformative “halos.” It also shrinks the feature map, so the layers that follow need fewer parameters, which helps combat over-fitting.
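To make the tiling concrete, here is a minimal sketch of max pooling in plain Python. The function name `max_pool`, the 2x2 tile size, and the sample feature map are illustrative assumptions, not part of the network described in this article.

```python
def max_pool(feature_map, tile=2):
    """Max-pool a 2-D feature map (a list of equal-length rows).

    The map is split into disjoint tile x tile blocks and each block is
    replaced by its largest activation. Rows or columns that do not fill
    a complete tile are dropped (an assumption made for simplicity).
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - tile + 1, tile):
        pooled_row = []
        for c in range(0, cols - tile + 1, tile):
            block = [feature_map[r + i][c + j]
                     for i in range(tile) for j in range(tile)]
            pooled_row.append(max(block))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map: two strong detections with weak "halos".
feature_map = [
    [0.1, 0.2, 0.0, 0.0],
    [0.3, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.2, 0.8],
    [0.0, 0.0, 0.4, 0.3],
]
pooled = max_pool(feature_map)
print(pooled)  # [[0.9, 0.1], [0.1, 0.8]]
```

The 4x4 map becomes a 2x2 map: each tile keeps only its strongest response, and any layer reading the pooled output sees a quarter as many inputs.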
Putting these concepts together, we can start to tackle an interesting computer vision problem. Let’s say we have pictures of a patient’s blood smear and our goal is to detect a potential malaria invasion and diagnose the stage of infection (uninfected, early stage infection, or late stage infection). We might construct a convolutional network structured as follows:
This network has four convolutional layers followed by a traditional dense feed-forward network that ends in a 3-way soft-max (providing confidences for each of the three possible classes). Like a conventional feed-forward network, our convolutional network can be trained using stochastic gradient descent (and dropout in the dense layers to prevent over-fitting).
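As a rough illustration of the final 3-way soft-max, the sketch below turns three raw output scores into confidences that sum to one. The score values and the names `softmax` and `CLASSES` are hypothetical; only the three class labels come from the problem described above.

```python
import math

# The three diagnostic classes from the malaria example.
CLASSES = ["uninfected", "early stage infection", "late stage infection"]


def softmax(scores):
    """Convert raw scores into confidences that are positive and sum to 1."""
    shift = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of the last dense layer for one blood smear image.
scores = [2.0, 0.5, -1.0]
confidences = softmax(scores)
prediction = CLASSES[confidences.index(max(confidences))]

for name, conf in zip(CLASSES, confidences):
    print(f"{name}: {conf:.3f}")
print("predicted class:", prediction)
```

The class with the highest raw score always receives the highest confidence, so the prediction here is "uninfected"; the soft-max simply rescales the scores so they can be read as a distribution over the three classes.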
We are still fine-tuning our MalariaNet convolutional network, but preliminary results indicate that it performs better than traditional machine learning approaches (such as support vector machines and Bayesian classification methods). Moreover, recent work by Google and Baidu indicates that deep convolutional neural networks are logging accuracies that are better than those of humans! It seems like convolutional networks are well poised to enable a large number of futuristic technologies.
If you’re interested in talking about some more cool work in this space, please feel free to drop me a line at nkbuduma@gmail.com. I’m always excited to hear about new ideas.
Bio: Nikhil Buduma is a computer science student at MIT with deep interests in machine learning and the biomedical sciences.
He is a two-time gold medalist at the International Biology Olympiad, a student researcher, and a “hacker.”