Top /r/MachineLearning Posts, September: Open Images Dataset; Whopping Deep Learning Grant; Advanced ML Courseware

Google Research announces the Open Images dataset; Canadian Government Deep Learning Research grant; DeepMind: WaveNet - A Generative Model for Raw Audio; Machine Learning in a Year - From total noob to using it at work; Phd-level machine learning courses; xkcd: Linear Regression

In September on /r/MachineLearning: Google Research releases a new image dataset, the Canadian Government dishes out an incredible deep learning grant, DeepMind introduces an improved generative model for raw audio, one writer chronicles going from total noob to using machine learning at work in a year, PhD-level machine learning courseware is collected for all, and xkcd supplies a few laughs.

The top 6 /r/MachineLearning posts of the past month are:

1. xkcd: Linear Regression +602

We start off with an offering from xkcd:

Linear regression

Moving on...

2. Google Research announces the Open Images dataset +501

Open Images

Google Research has announced the release of the Open Images Dataset. Directly from the blog post:

Today, we introduce Open Images, a dataset consisting of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories. We tried to make the dataset as practical as possible: the labels cover more real-life entities than the 1000 ImageNet classes, there are enough images to train a deep neural network from scratch and the images are listed as having a Creative Commons Attribution license*.

Training neural nets seems to have gotten an organized boost from Google Research.

3. $93,562,000 awarded by Canadian Gov. for Deep Learning Research at University of Montreal +455

Directly from the Government of Canada's Canada First Research Excellence Fund competition results and awards announcement, outlining the massive stack of cash that the University of Montreal has come into:

Campus Montréal is proposing a transformative and far-reaching strategy that capitalizes on the unique and synergistic combination of machine learning / deep learning and operations research—the science of optimization. The strategy, which lies at the core of data-driven innovation, will pave the way to major scientific breakthroughs, allowing useful information to be efficiently extracted from massive data sets (machine learning) and turned into actionable decisions (operations).

It may be called "Canada First," but given the Canadian Deep Learning Mafia's contribution to neural networks and machine learning at large, both inside and outside of the country, this should be good news to everyone.

4. DeepMind: WaveNet - A Generative Model for Raw Audio +425

This is a link to a post introducing WaveNet, a deep generative model for producing raw audio waveforms. From the post:

We show that WaveNets are able to generate speech which mimics any human voice and which sounds more natural than the best existing Text-to-Speech systems, reducing the gap with human performance by over 50%.


The project improves upon the state of the art, and the post includes a solid overview and a handful of comparative audio samples that clearly demonstrate the improvements. A very worthwhile read.

5. Machine Learning in a Year - From total noob to using it at work +360

Occasional KDnuggets contributor and master explainer Per Harald Borgen has written another in his line of outstanding "learning machine learning" posts. This one is an all-encompassing piece on how he went from knowing little to nothing about ML to employing useful methods at his place of business within a year. A helpful and inspiring read, to be sure.

6. Phd-level machine learning courses +316

This is a small collection of courses on advanced machine learning topics, aimed at the upper graduate level. It starts off with the following list (but check out the post for some other gems added by others and buried in the comments):

  1. Advanced Introduction to ML - videos
  2. Large Scale ML - videos
  3. Statistical Learning Theory and Applications - videos
  4. Regularization Methods for ML - videos
  5. Statistical ML - videos
  6. Convex Optimization - videos
  7. Probabilistic Graphical Models 2014 (with videos) - PGM 2016 (without videos)

Be warned, however: these aren't entry-level courses; the consensus is that they are geared toward PhD students.