Up to Speed on Deep Learning: August Update, Part 2

This is the second part of an overview of deep learning stories that made news in August. Look to see if you have missed anything.

By Isaac Madan, Investor at Venrock & Co-editor of Requests for Startups.

Continuing our series of deep learning updates, we pulled together some of the awesome resources that have emerged since our last post on August 16th. In case you missed it, here’s the August update (part 1), here’s the July update (part 2), here’s the July update (part 1), here’s the June update, and here’s the original set of 20+ resources we outlined in April. As always, this list is not comprehensive, so let us know if there’s something we should add, or if you’re interested in discussing this area further.

Learning to segment

Learning to Segment by Piotr Dollar of Facebook. Piotr explains Facebook’s efforts and progress in image segmentation, as well as highlighting use cases and explaining the importance of such advancements. When humans look at an image, they can identify objects down to the last pixel. At Facebook AI Research (FAIR) we’re pushing machine vision to the next stage — our goal is to similarly understand images and objects at the pixel level.

Google Brain

Google Brain begins accepting applications to its Residency program on September 1st. Google’s Jeff Dean will host a YouTube livestream to describe the Google Brain team and the Residency program. The Google Brain Residency Program is a one-year intensive residency program focused on Deep Learning. Residents will have the opportunity to conduct cutting-edge research and work alongside some of the most distinguished deep learning scientists within the Google Brain team. To learn more about the team, visit g.co/brain. Consider applying here when applications open.

Text summarization

Text summarization with TensorFlow by Peter Liu of Google Brain. The Brain team open sources their TensorFlow model code for generating news headlines on a large dataset frequently used for summarization tasks. Peter explains two approaches — extractive and abstractive summarization, describes the model, and highlights areas of future interest.

Robot datasets

Google Brain robot datasets by Sergey Levine, Chelsea Finn, and Laura Downs. The Google Brain team releases massive robotics datasets from two of their recent papers to further drive the field forward. Their grasping dataset contains roughly 650,000 examples of robot grasping attempts (original paper here). Their push dataset contains roughly 59,000 examples of robot pushing motions, including one training set (train) and two test sets of previously seen (testseen) and unseen (testnovel) objects (original paper here).

Self-driving cars

End-to-End Deep Learning for Self-Driving Cars by NVIDIA. The autonomous car team at NVIDIA describes their end-to-end approach to self-driving vehicles, using convolutional neural networks (CNNs) to map the raw pixels from a front-facing camera to the steering commands for a self-driving car. Original paper here.

NIPS papers

NIPS list of accepted papers. The 2016 Conference on Neural Information Processing Systems, cited as the top machine learning conference, takes place from December 5th through 10th in Barcelona, Spain. The list of accepted papers highlights some of the bleeding-edge machine learning & AI research that will be presented, as well as the researchers & practitioners driving the field forward who may be present. Consider attending this year — details here.

Predicting poverty

Combining satellite imagery and machine learning to predict poverty by Neal Jean et al. of Stanford. Nighttime lighting is a rough proxy for economic wealth, and nighttime maps of the world show that many developing countries are sparsely illuminated. Jean et al. combined nighttime maps with high-resolution daytime satellite images. With a bit of machine-learning wizardry, the combined images can be converted into accurate estimates of household consumption and assets, both of which are hard to measure in poorer countries. Furthermore, the night- and day-time data are publicly available and nonproprietary.


Speech Is 3x Faster than Typing for English and Mandarin Text Entry on Mobile Devices by Sherry Ruan et al. of Stanford & Baidu. Researchers evaluate Deep Speech 2, a deep learning-based speech recognition system, and find that it makes English text input 3.0X faster and Mandarin Chinese input 2.8X faster than standard keyboard typing. The error rates were also dramatically reduced, and the results further highlight the potential & strength of speech interfaces.

Isaac Madan is an investor at Venrock, an early-stage venture capital firm with investments in companies like Dollar Shave Club, Cloudflare, AppNexus, Dataminr, and Pearl Automation, and previously Nest, Apple, Intel, etc. Isaac’s background is in machine learning & artificial intelligence, having previously been an entrepreneur and data scientist. He can be reached via email at isaac@venrock.com.

Requests for Startups is a newsletter of entrepreneurial ideas & perspectives by investors, operators, and influencers. If you think there’s someone we should feature in an upcoming issue, nominate them by sending Isaac an email.

Original. Reposted with permission.