Top arXiv Papers, January: ConvNets Advances, Wide Instead of Deep, Adversarial Networks Win, Learning to Reinforcement Learn
Check out the top arXiv Papers from January, covering convolutional neural network advances, why wide may trump deep, generative adversarial networks, learning to reinforcement learn, and more.
arXiv.org has become the leading clearinghouse for open-access, bleeding-edge machine learning research, especially that on neural networks. Keeping up with the shared research in real time is impossible, given the information deluge. Hopefully this post can help convey some of the top arXiv papers from January.
The choice of "top" is somewhat subjective. I used Andrej Karpathy's Arxiv Sanity Preserver to select from among the top papers of the past month, as queried on the evening of January 31 ("top" means being included in the most users' libraries), and then hand-picked the seemingly most interesting, at least in my view. This isn't completely data-driven, but given that some papers would have had a full month to be "upvoted," while others could have come on the last day of the month, I feel confidently justified (enough) in the process and selections. If you don't like the method, feel free to check out the top returns yourself.
Along with the title and authors for each paper, you will also find some modest commentary, links, perhaps an image, and an excerpt from the abstract.
Recent Advances in Convolutional Neural Networks
Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Gang Wang
Convolutional neural networks (CNNs) are, without doubt, among the most researched and implemented contemporary neural network architectures. They have proven their mettle over the past several years, and have all but revolutionized entire areas of machine learning. If you need a crash course on CNNs, this paper may be a good up-to-date starting place.
After the rapid growth in the amount of the annotated data and the recent improvements in the strengths of graphics processor units (GPUs), the research on convolutional neural networks has been emerged swiftly and achieved state-of-the-art results on various tasks. In this paper, we provide a broad survey of the recent advances in convolutional neural networks. Besides, we also introduce some applications of convolutional neural networks in computer vision.
Wide Residual Networks
Sergey Zagoruyko, Nikos Komodakis
This paper discusses wide architectures as a counter to the lengthy training times of the very deep networks generally required for continued accuracy improvements. It proposes Wide Residual Networks: ResNet blocks used in an architecture of decreased depth and increased width, in order to cut training times while preserving these accuracy improvements. The results are promising.
It should be noted that this is an update to a paper originally posted May 2016, which is why you've (probably) heard of this already. But if you haven't, read up. Here's an old Reddit discussion on the paper.
We call the resulting network structures wide residual networks (WRNs) and show that these are far superior over their commonly used thin and very deep counterparts. For example, we demonstrate that even a simple 16-layer-deep wide residual network outperforms in accuracy and efficiency all previous deep residual networks, including thousand-layer-deep networks, achieving new state-of-the-art results on CIFAR, SVHN, COCO, and significant improvements on ImageNet. Our code and models are available at this https URL
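To get a feel for what "wide instead of deep" means for model size, here is a rough weight-count sketch. It assumes the WRN-d-k layout as I understand it from the paper: depth d = 6N + 4, three groups of N basic blocks (two 3x3 convolutions each) with 16k, 32k, and 64k channels, and a 1x1 convolution on the shortcut wherever the channel count changes. The function name `wrn_param_count` is my own, and batch-norm parameters are ignored, so treat the numbers as approximations rather than the authors' exact figures.

```python
def wrn_param_count(depth, k, num_classes=10):
    """Approximate weight count for a WRN-depth-k built from basic 3x3 blocks.

    Assumed layout (not taken from the authors' code): depth = 6N + 4,
    three groups of N blocks with widths 16k, 32k, 64k. Batch-norm
    parameters are ignored.
    """
    assert (depth - 4) % 6 == 0, "depth is assumed to be 6N + 4"
    blocks_per_group = (depth - 4) // 6
    widths = [16 * k, 32 * k, 64 * k]

    total = 3 * 16 * 9  # initial 3x3 conv: RGB input -> 16 channels
    in_ch = 16
    for out_ch in widths:
        for _ in range(blocks_per_group):
            total += in_ch * out_ch * 9   # first 3x3 conv in the block
            total += out_ch * out_ch * 9  # second 3x3 conv in the block
            if in_ch != out_ch:
                total += in_ch * out_ch   # 1x1 conv on the shortcut
            in_ch = out_ch
    total += in_ch * num_classes + num_classes  # final linear classifier
    return total


for depth, k in [(16, 1), (16, 8), (28, 10)]:
    millions = wrn_param_count(depth, k) / 1e6
    print(f"WRN-{depth}-{k}: ~{millions:.1f}M weights")
```

Under these assumptions, a 16-layer network with k = 8 comes out at roughly 11M weights. The count grows roughly quadratically in k, because both the input and output channels of most convolutions scale with the widening factor.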
Adversarial Feature Learning
Jeff Donahue, Philipp Krähenbühl, Trevor Darrell
Generative Adversarial Networks (GANs) are a "hot topic" in machine learning. GANs do a great job mapping simple latent data distributions to more complex distributions, which can be beneficial in a variety of uses.
However, in their existing form, GANs have no means of learning the inverse mapping -- projecting data back into the latent space. We propose Bidirectional Generative Adversarial Networks (BiGANs) as a means of learning this inverse mapping, and demonstrate that the resulting learned feature representation is useful for auxiliary supervised discrimination tasks, competitive with contemporary approaches to unsupervised and self-supervised feature learning.
This is the most recent version of the paper.
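For readers who want the idea in symbols, the objective can be sketched as follows, as I understand the BiGAN formulation. Here G is the generator, E is the encoder that learns the inverse mapping, and D is a discriminator that sees joint (data, latent) pairs rather than data alone. The notation, and the simplification to deterministic E and G, are my assumptions rather than a quote from the paper.

```latex
\min_{G,E}\ \max_{D}\ V(D,E,G), \qquad
V(D,E,G) = \mathbb{E}_{x \sim p_X}\big[\log D(x, E(x))\big]
         + \mathbb{E}_{z \sim p_Z}\big[\log\big(1 - D(G(z), z)\big)\big]
```

The only change from a standard GAN objective is that D receives the pair (x, E(x)) for real data and (G(z), z) for generated data, which pushes the encoder and generator toward inverting one another.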
NIPS 2016 Tutorial: Generative Adversarial Networks
Ian Goodfellow
As per the title, this is Ian Goodfellow's NIPS 2016 tutorial on Generative Adversarial Networks (GANs). You may know Ian from pioneering GANs. And so you would assume this is a solid overview of the technology. And it is. In fact, this is canonical reading for the generative network enthusiast.
This report summarizes the tutorial presented by the author at NIPS 2016 on generative adversarial networks (GANs). The tutorial describes: (1) Why generative modeling is a topic worth studying, (2) how generative models work, and how GANs compare to other generative models, (3) the details of how GANs work, (4) research frontiers in GANs, and (5) state-of-the-art image models that combine GANs with other methods. Finally, the tutorial contains three exercises for readers to complete, and the solutions to these exercises.
Learning to reinforcement learn
Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick
Reinforcement learning (RL) systems have had smashing successes of late. This paper discusses deep meta-reinforcement learning, which is intended to combat the massive amounts of training data which RL systems generally require.
Previous work has shown that recurrent networks can support meta-learning in a fully supervised context. We extend this approach to the RL setting. What emerges is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure. This second, learned RL algorithm can differ from the original one in arbitrary ways. Importantly, because it is learned, it is configured to exploit structure in the training domain. We unpack these points in a series of seven proof-of-concept experiments, each of which examines a key aspect of deep meta-RL. We consider prospects for extending and scaling up the approach, and also point out some potentially important implications for neuroscience.
There is also a modest Hacker News discussion, which makes a few good points, on the original version of this paper from late 2016.
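The abstract's "second, learned RL algorithm" lives in the recurrent state, so it may help to see the shape of the interaction loop. The sketch below is only that loop, under my own assumptions: each episode samples a fresh two-armed bandit task, the agent's state is reset at the start of the episode, and at every step the agent is fed its previous action and previous reward. The function `stand_in_policy` is a hypothetical, hand-coded substitute for the trained recurrent network (its "hidden state" is just win and pull counts), so nothing here is learned and none of the names come from the paper.

```python
import random


def sample_bandit(rng):
    """Draw a new task: a two-armed Bernoulli bandit with one good arm."""
    p = rng.choice([0.1, 0.9])
    return [p, 1.0 - p]


def stand_in_policy(state, prev_action, prev_reward, rng, epsilon=0.1):
    """Hypothetical stand-in for the learned recurrent policy.

    In deep meta-RL this would be a recurrent network trained by an outer
    RL algorithm. Here the state is a pair of per-arm counters updated by
    hand, which only mimics adapting within an episode.
    """
    wins, pulls = state
    if prev_action is not None:
        # Fold the previous step's outcome into the carried state.
        pulls[prev_action] += 1
        wins[prev_action] += prev_reward
    if rng.random() < epsilon:
        return rng.randrange(2)  # occasional random exploration
    # Smoothed success-rate estimate per arm, then a greedy choice.
    estimates = [(wins[a] + 1) / (pulls[a] + 2) for a in range(2)]
    return 0 if estimates[0] >= estimates[1] else 1


def run_episode(rng, steps=100):
    """Run one episode on a freshly sampled task and return total reward."""
    arm_probs = sample_bandit(rng)
    state = ([0, 0], [0, 0])  # state is reset at the start of each episode
    prev_action, prev_reward = None, 0
    total = 0
    for _ in range(steps):
        action = stand_in_policy(state, prev_action, prev_reward, rng)
        reward = 1 if rng.random() < arm_probs[action] else 0
        total += reward
        prev_action, prev_reward = action, reward
    return total


rng = random.Random(0)
returns = [run_episode(rng) for _ in range(20)]
print(f"mean return over {len(returns)} episodes: {sum(returns) / len(returns):.1f}")
```

The point of the sketch is the data flow: because the previous action and reward are inputs, whatever carries state across steps can adapt to the current task without any weight updates, which is the role the recurrent dynamics play in the paper.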
Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review
Tomaso Poggio, Hrushikesh Mhaskar, Lorenzo Rosasco, Brando Miranda, Qianli Liao
This paper is an overview of neural networks, deep versus shallow architectures, and how the curse of dimensionality fits. Not exactly groundbreaking, but good review and tutorial material nonetheless.
The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning. Deep convolutional networks are a special case of these conditions, though weight sharing is not the main reason for their exponential advantage.
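To make "exponentially better" concrete, the headline bounds can be written as below, as I read the paper's main theorems. N is the number of units needed to approximate a function of n variables with smoothness m to accuracy ε. The deep bound applies to compositional functions with a binary-tree structure that the network architecture mirrors; the subscript labels are my own shorthand.

```latex
N_{\text{shallow}} = O\!\left(\epsilon^{-n/m}\right), \qquad
N_{\text{deep}} = O\!\left((n-1)\,\epsilon^{-2/m}\right)
```

The dimension n sits in the exponent for the shallow network but only in a linear factor for the deep one, which is the sense in which depth avoids the curse of dimensionality for this class of functions.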
Benchmarking State-of-the-Art Deep Learning Software Tools
Shaohuai Shi, Qiang Wang, Pengfei Xu, Xiaowen Chu
This technical paper benchmarks a number of deep learning frameworks employing a variety of network architectures, just as the title suggests. This is the latest version of the paper.
In this paper, we aim to make a comparative study of the state-of-the-art GPU-accelerated deep learning software tools, including Caffe, CNTK, MXNet, TensorFlow, and Torch. We first benchmark the running performance of these tools with three popular types of neural networks on two CPU platforms and three GPU platforms. We then benchmark some distributed versions on multiple GPUs. Our contribution is twofold. First, for end users of deep learning tools, our benchmarking results can serve as a guide to selecting appropriate hardware platforms and software tools. Second, for software developers of deep learning tools, our in-depth analysis points out possible future directions to further optimize the running performance.