Update: Google TensorFlow Deep Learning Is Improving

The recent open sourcing of Google's TensorFlow was a significant event for machine learning. While the original release was lacking in some ways, development continues and improvements are already being made.

On November 9, 2015, Google released TensorFlow, its "open source software library for machine intelligence," about which I have previously shared my first impressions. On December 7, 2015, Jeff Dean and Oriol Vinyals presented a NIPS tutorial titled "Large-Scale Distributed Systems for Training Neural Networks." The talk was not solely about TensorFlow, but TensorFlow (logically) appears to have been one of its main topics.


Dean and Vinyals covered distributed deep learning at Google, and TensorFlow's place in this context. While the ongoing Neural Information Processing Systems (NIPS) 2015 videos are not currently available online, a recent Quora answer by Cloudera founder and chief scientist Jeff Hammerbacher summarized the latest progress in Google TensorFlow, based on the tutorial by Dean and Vinyals.

The main takeaways from the talk, according to Hammerbacher, are outlined below. Each of these points is elaborated upon with some further comments and editorial.

1. Single-node performance of TensorFlow is improving

Using this respected deep convolutional network benchmarking repo, and comparing to these earlier TensorFlow benchmark results, Dean and Vinyals claim an improvement of approximately 30% for a full forward and backward propagation pass on the single-node implementation, which is the only available version of TensorFlow as of this writing.

Along with the lack of distributed processing capabilities in the original release of TensorFlow, its slower performance in comparison to its contemporaries was one of my biggest concerns, and one echoed by others. That Google is making good on at least one of these issues (while undoubtedly working toward addressing the other) is confidence-building, and begins to allay the serious concerns that technologists had about TensorFlow from the get-go.

I will take the opportunity at this point to reiterate the necessity of releasing a distributed version of TensorFlow should Google hope to achieve a long-lasting share of the market. However, given its obvious implementation of such technology in its in-house version, and the timely improvements in speed that Dean and Vinyals have exhibited in the current single-node version, I would be willing to wager that an open source distributed TensorFlow is a reality by summer.

The opening act has created a buzz. I'm ready for the headliner.

2. Use of Deep Learning within Google is growing exponentially

[Figure: Growing Use of TensorFlow. Exponential deep learning growth within Google.]

DistBelief, Google Brain's first-generation deep learning research project, started in 2011 under the guidance of Dean, Corrado, Ng, and a number of other high-profile personalities. As recently as Q3 2013, there were comparatively few deep learning projects at Google, estimated from the above graph to be in the double digits at that time.

Growth has since exploded, and continues exponentially, with an increase of approximately 1,200% between Q3 2013 and Q3 2015. This is indicative of industry trends (which does not, on its own, make the trend any less noteworthy or impressive). Google is clearly and aggressively harnessing deep learning as we move into the future.
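To put that figure in perspective, here is a back-of-the-envelope sketch of the compound growth rate it implies. It assumes smooth exponential growth over the 8 quarters between Q3 2013 and Q3 2015, and treats the roughly 1,200% increase (about a 13x multiple) as exact; the function name `compound_growth` is my own, purely for illustration, and the inputs are rough readings of the graph rather than published figures.

```python
import math

def compound_growth(total_increase_pct, periods):
    """Return (per-period growth rate, doubling time in periods)
    implied by a total percentage increase over `periods` periods,
    assuming smooth exponential growth (an assumption, not a measurement)."""
    multiple = 1 + total_increase_pct / 100.0      # a 1200% increase is a 13x multiple
    rate = multiple ** (1.0 / periods) - 1         # compound growth per period
    doubling = math.log(2) / math.log(1 + rate)    # periods needed to double
    return rate, doubling

# Approximate figures from the article: ~1200% over 8 quarters.
rate, doubling = compound_growth(1200, 8)
print(f"implied quarterly growth: {rate:.1%}")
print(f"implied doubling time: {doubling:.2f} quarters")
```

Under these assumptions, the number of projects grows by a bit under 40% per quarter, doubling roughly every two quarters. That is an estimate derived from an approximate figure, not a number reported by Google.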

If current trends continue, it is interesting to consider the number of deep learning projects that Google will be incubating by, say, Q3 2017. The numbers will undoubtedly grow, but at what rate? Saturation will occur at some point, after which deep learning will become the dominant machine learning technology within Google, with these trends echoed across industry some number of years afterward. Accurately predicting the expansion rate, both inside Google and beyond, will become a hot topic of discussion, and the expansion itself an occurrence as much feared as it is anticipated.