Top /r/MachineLearning Posts, February: Oxford Deep NLP Course; Data Visualization for Scikit-learn Results

Oxford Deep NLP Course; scikit-plot: Data Visualization for Scikit-learn Results; Machine Learning at Berkeley's ML Crash Course: Neural Networks; Predicting parking difficulty with machine learning; TensorFlow 1.0 Release

In February on /r/MachineLearning, deep learning course material was hot, a new library for visualizing Scikit-learn results made a splash, a new machine learning crash course installment from ML @ Berkeley was a hit, Google decided we need help parking, and TensorFlow 1.0 was released alongside its developer summit.


The top 5 /r/MachineLearning posts of the past month are:

1. Oxford Deep NLP Course

Oxford offers an ongoing deep learning for natural language processing course, helmed by Phil Blunsom of Oxford and DeepMind, and featuring a whole host of guest lecturers. The post links directly to the lectures repo on the course's GitHub page. The course's GitHub profile also includes other repos of interest, such as the course practical exercises.

2. Scikit-plot: A visualization library for scikit-learn object results

This links to the scikit-plot GitHub repo, by Reiichiro Nakano. Scikit-plot is described as:

An intuitive library to add plotting functionality to scikit-learn objects.

Scikit-plot is the result of an unartistic data scientist's dreadful realization that visualization is one of the most crucial components in the data science process, not just a mere afterthought.


Scikit-plot includes functionality for presenting ROC curves, elbow plots, silhouette graphs, feature importance visualizations, precision-recall curves, and more. It currently offers two different APIs, and works with non-Scikit-learn objects such as Keras classifiers as well. This is definitely an interesting project, and one I have been making use of lately myself.
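For readers unfamiliar with the first plot on that list, here is a rough sketch of what an ROC curve actually computes. This is not scikit-plot's API or its implementation; it is plain Python written for illustration, and the function names `roc_points` and `auc` are my own.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists traced out as the decision threshold is lowered.

    Assumes binary labels (0 or 1) with at least one example of each class.
    """
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos

    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])

    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all examples tied at this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve via the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


if __name__ == "__main__":
    fpr, tpr = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
    print(list(zip(fpr, tpr)))
    print(auc(fpr, tpr))
```

A plotting library's job is then to draw `tpr` against `fpr`, which is the part scikit-plot takes off your hands.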

3. Machine Learning at Berkeley's ML Crash Course: Neural Networks and Backprop

Machine Learning @ Berkeley has been putting together a series of articles intended to be a crash course in machine learning. This latest effort focuses on neural networks and the "magic" of backpropagation, and provides a solid overview of the material.

This sums up exactly what the tutorial covers:

In this article we’ll go through how a neural network actually works, and in a future article we’ll discuss some of the limitations of these seemingly magical tools.


I like a neural network discussion that acknowledges limitations and dispels the myth of magic right up front. Realistic expectations often seem to be overlooked in this area of study, especially in material for beginners.
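To make the "no magic" point concrete: backpropagation is the chain rule applied from the output back toward the input. The following is a minimal sketch of my own, not code from the Berkeley article. It assumes a toy network with one input, one sigmoid hidden unit, and one linear output, trained on a single example with squared error, and all function names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """One input -> one sigmoid hidden unit -> one linear output."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y_hat = w2 * h + b2
    return h, y_hat


def loss(params, x, y):
    """Squared error on a single example."""
    _, y_hat = forward(params, x)
    return 0.5 * (y_hat - y) ** 2


def backprop(params, x, y):
    """Gradient of the loss w.r.t. [w1, b1, w2, b2] via the chain rule."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(params, x)
    d_yhat = y_hat - y             # dL/dy_hat
    d_w2 = d_yhat * h              # output layer weight
    d_b2 = d_yhat                  # output layer bias
    d_h = d_yhat * w2              # error passed back to the hidden unit
    d_z = d_h * h * (1.0 - h)      # through the sigmoid: s'(z) = s(z)(1 - s(z))
    d_w1 = d_z * x                 # hidden layer weight
    d_b1 = d_z                     # hidden layer bias
    return [d_w1, d_b1, d_w2, d_b2]


def numerical_grad(params, x, y, eps=1e-6):
    """Central finite differences, used only to sanity-check backprop."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, x, y) - loss(down, x, y)) / (2 * eps))
    return grads


def train(params, x, y, lr=0.1, steps=500):
    """Plain gradient descent using the backprop gradients."""
    params = list(params)
    for _ in range(steps):
        grads = backprop(params, x, y)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


if __name__ == "__main__":
    start = [0.5, -0.3, 0.8, 0.1]  # arbitrary starting weights
    print(backprop(start, 1.5, 1.0))
    print(numerical_grad(start, 1.5, 1.0))
    print(loss(train(start, 1.5, 1.0), 1.5, 1.0))
```

The two gradient printouts should agree closely, which is the whole trick: backprop gives the same numbers as brute-force differentiation, only far more cheaply as networks grow.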

You may also be interested in part 1 and part 2 of the crash course.

4. Predicting parking difficulty with machine learning

The folks at Google Research have been good enough to share a helpful model which predicts how difficult a given parking task is, rolling it out as a Google Maps feature in 25 US cities.

Last week, we launched a new feature for Google Maps for Android across 25 US cities that offers predictions about parking difficulty close to your destination so you can plan accordingly. Providing this feature required addressing some significant challenges[.]

The post then elaborates on both the problems of building a useful model for such a task and how it deals with said problems (spoiler alert: a combination of crowdsourcing and machine learning is employed, which seems to be right in Google's contemporary wheelhouse). I'm interested in seeing how the model performs, as well as how helpful the added functionality actually is on a practical level (a separate measure entirely).

Next time I need to parallel park in one of these 25 cities, you can bet I'll read up on how to use this first, if only for sheer entertainment!

5. TensorFlow 1.0 Release

Google announced TensorFlow's 1.0 release this past month, with some cool updates, a new high-level API compatible with Keras, some changes to existing APIs (notably, to resemble NumPy more closely), experimental APIs for Go and Java, and more. Software used to be eating the world; these days, it seems like TensorFlow, specifically, has taken over this task.


On a related note, TensorFlow Dev Summit 2017 coincided with the release of TensorFlow 1.0, and the recorded stream of the event can be viewed here.