20+ hottest research papers on Computer Vision, Machine Learning

December's ICCV 2015 conference in Santiago, Chile has come and gone, but that's no reason not to know about its top papers. Get an update on which computer vision papers and researchers won awards.

11. Conditional Random Fields as Recurrent Neural Networks Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, Philip H. S. Torr

We formulate mean-field approximate inference for the Conditional Random Fields with Gaussian pairwise potentials as Recurrent Neural Networks.
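To make the idea concrete, here is a minimal sketch of mean-field inference on a toy 1-D "image" with two labels. It is not the authors' implementation: the function names, the position-only Gaussian kernel, the Potts label compatibility, and all numeric values are assumptions chosen for illustration. Each pass of the loop applies the same update to the current marginals, which is what makes it natural to unroll as a recurrent network.

```python
import math

def softmax_neg(energies):
    """Turn energies into probabilities: lower energy means higher probability."""
    m = min(energies)
    exps = [math.exp(-(e - m)) for e in energies]
    z = sum(exps)
    return [v / z for v in exps]

def mean_field(unary, iterations=5, weight=1.0, sigma=1.0):
    """Approximate CRF marginals on a 1-D grid (toy setting, assumed here).

    unary[i][l] is the energy of giving pixel i the label l.
    The pairwise kernel is a Gaussian over pixel positions only.
    """
    n, n_labels = len(unary), len(unary[0])
    q = [softmax_neg(u) for u in unary]  # initialise from the unaries
    for _ in range(iterations):
        new_q = []
        for i in range(n):
            # Message passing: Gaussian-weighted sum of the other marginals.
            msg = [0.0] * n_labels
            for j in range(n):
                if j == i:
                    continue
                k = weight * math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
                for l in range(n_labels):
                    msg[l] += k * q[j][l]
            # Potts compatibility: a label is penalised by the mass that
            # nearby pixels put on every *other* label.
            total = sum(msg)
            energies = [unary[i][l] + (total - msg[l]) for l in range(n_labels)]
            new_q.append(softmax_neg(energies))  # add unaries, then normalise
        q = new_q  # all pixels are updated in parallel
    return q

# Four pixels clearly prefer label 0; the middle pixel weakly prefers label 1.
unary = [[0.0, 2.0], [0.0, 2.0], [0.5, 0.0], [0.0, 2.0], [0.0, 2.0]]
before = [u.index(min(u)) for u in unary]
after = [p.index(max(p)) for p in mean_field(unary)]
```

In this toy case the isolated middle pixel is pulled over to the label of its neighbours after the updates.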

12. Flowing ConvNets for Human Pose Estimation in Videos Tomas Pfister, James Charles, Andrew Zisserman

We investigate a ConvNet architecture that is able to benefit from temporal context by combining information across the multiple frames using optical flow.

13. Dense Optical Flow Prediction From a Static Image Jacob Walker, Abhinav Gupta, Martial Hebert

Given a static image, P-CNN predicts the future motion of each and every pixel in the image in terms of optical flow. Our P-CNN model is trained on tens of thousands of realistic videos. Our method relies on absolutely no human labeling and is able to predict motion based on the context of the scene.

DeepBox results

14. DeepBox: Learning Objectness With Convolutional Networks Weicheng Kuo, Bharath Hariharan, Jitendra Malik

Our framework, which we call DeepBox, uses convolutional neural networks (CNNs) to rerank proposals from a bottom-up method.
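Reranking itself is a simple operation once a scorer exists. The sketch below shows only that step; `rerank` and `toy_score` are hypothetical names, and the area-based score is a stand-in for a trained CNN, not anything from the paper.

```python
def rerank(proposals, score_fn, top_k=None):
    """Sort boxes by descending score; optionally keep only the best top_k."""
    ranked = sorted(proposals, key=score_fn, reverse=True)
    return ranked if top_k is None else ranked[:top_k]

# Hypothetical stand-in for a learned objectness score: prefer boxes whose
# area is close to 100 pixels. A real system would call a trained CNN here.
def toy_score(box):
    x1, y1, x2, y2 = box
    return -abs((x2 - x1) * (y2 - y1) - 100)

# Boxes are (x1, y1, x2, y2), as produced by some bottom-up proposal method.
proposals = [(0, 0, 50, 50), (10, 10, 20, 20), (0, 0, 5, 5)]
best = rerank(proposals, toy_score, top_k=2)
```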

15. Active Object Localization With Deep Reinforcement Learning Juan C. Caicedo, Svetlana Lazebnik

This agent learns to deform a bounding box using simple transformation actions, with the goal of determining the most specific location of target objects following top-down reasoning.
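A rough sketch of what "simple transformation actions" on a box might look like is given below. The action names, the 0.2 step size, and the sign-of-improvement reward are assumptions made for illustration, not the authors' exact setup, and the learning part (the policy that chooses actions) is left out.

```python
ALPHA = 0.2  # assumed step size, as a fraction of the current box size

def apply_action(box, action, alpha=ALPHA):
    """Deform an (x1, y1, x2, y2) box by one discrete action."""
    x1, y1, x2, y2 = box
    dw, dh = alpha * (x2 - x1), alpha * (y2 - y1)
    if action == "right":
        x1, x2 = x1 + dw, x2 + dw
    elif action == "left":
        x1, x2 = x1 - dw, x2 - dw
    elif action == "down":
        y1, y2 = y1 + dh, y2 + dh
    elif action == "up":
        y1, y2 = y1 - dh, y2 - dh
    elif action == "shrink":
        x1, y1, x2, y2 = x1 + dw / 2, y1 + dh / 2, x2 - dw / 2, y2 - dh / 2
    elif action == "grow":
        x1, y1, x2, y2 = x1 - dw / 2, y1 - dh / 2, x2 + dw / 2, y2 + dh / 2
    else:
        raise ValueError("unknown action: " + action)
    return (x1, y1, x2, y2)

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def reward(old_box, new_box, target):
    """+1 if the action improved overlap with the target, otherwise -1."""
    return 1 if iou(new_box, target) > iou(old_box, target) else -1

start = (0, 0, 10, 10)
moved = apply_action(start, "right")
r = reward(start, moved, target=(4, 0, 14, 10))
```

Here moving right brings the box closer to the target, so the agent would receive a positive reward for that step.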

16. Predicting Depth, Surface Normals and Semantic Labels With a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus

We address three different computer vision tasks using a single multiscale convolutional network architecture: depth prediction, surface normal estimation, and semantic labeling.

17. HD-CNN: Hierarchical Deep Convolutional Neural Networks for Large Scale Visual Recognition Zhicheng Yan, Hao Zhang, Robinson Piramuthu, Vignesh Jagadeesh, Dennis DeCoste, Wei Di, Yizhou Yu

We introduce hierarchical deep CNNs (HD-CNNs) by embedding deep CNNs into a category hierarchy. An HD-CNN separates easy classes using a coarse category classifier while distinguishing difficult classes using fine category classifiers.
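One way to combine the two levels, sketched below under stated assumptions, is to average the fine classifiers' predictions weighted by the coarse classifier's probabilities. The function name and the toy numbers are hypothetical, and the networks themselves are replaced by fixed probability lists.

```python
def hd_cnn_predict(coarse_probs, fine_probs):
    """Weight each fine classifier's prediction by its coarse-category probability.

    coarse_probs[k]  : probability the image belongs to coarse category k
    fine_probs[k][j] : probability of fine class j according to fine classifier k
    """
    total = sum(coarse_probs)
    n_classes = len(fine_probs[0])
    return [
        sum(b * fp[j] for b, fp in zip(coarse_probs, fine_probs)) / total
        for j in range(n_classes)
    ]

# Toy example: two coarse categories, four fine classes.
coarse = [0.75, 0.25]
fine = [
    [0.5, 0.5, 0.0, 0.0],  # classifier specialised on classes 0 and 1
    [0.0, 0.0, 0.5, 0.5],  # classifier specialised on classes 2 and 3
]
final = hd_cnn_predict(coarse, fine)
```

With these inputs the final distribution puts 0.375 on each of the first two classes and 0.125 on each of the last two.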

FlowNet results

18. FlowNet: Learning Optical Flow With Convolutional Networks Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox

We construct appropriate CNNs which are capable of solving the optical flow estimation problem as a supervised learning task.

19. Understanding Deep Features With Computer-Generated Imagery Mathieu Aubry, Bryan C. Russell

Rendered images are presented to a trained CNN and responses for different layers are studied with respect to the input scene factors.

20. PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization Alex Kendall, Matthew Grimes, Roberto Cipolla

Our system trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner with no need of additional engineering or graph optimisation.
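Regressing a pose means the training loss has to balance position error against orientation error. The sketch below assumes a common formulation: Euclidean distance on position plus a weighted Euclidean distance between the predicted quaternion and the normalised ground-truth quaternion. The function names and the default `beta` are placeholders, not values from the paper.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(q):
    """Scale a quaternion to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    return [c / n for c in q]

def pose_loss(pos_pred, quat_pred, pos_true, quat_true, beta=1.0):
    """Position error plus beta times orientation error.

    beta trades off metres against quaternion units; its value here is an
    arbitrary placeholder and would need tuning per scene.
    """
    return l2(pos_pred, pos_true) + beta * l2(quat_pred, normalize(quat_true))

# Position is off by a 3-4-5 triangle; orientation matches after normalisation.
loss = pose_loss([3, 4, 0], [1, 0, 0, 0], [0, 0, 0], [2, 0, 0, 0])
```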

21. Visual Tracking With Fully Convolutional Networks Lijun Wang, Wanli Ouyang, Xiaogang Wang, Huchuan Lu

A new approach for general object tracking with a fully convolutional neural network.


While some may argue that the great convergence upon ConvNets is making the field less diverse, it is actually making the techniques easier to comprehend. It is easier to "borrow breakthrough thinking" from one research direction when the core computations are cast in the language of ConvNets. Using ConvNets, properly trained (and motivated!) 21-year-old graduate students are actually able to compete on benchmarks, where previously it would take an entire 6-year PhD cycle to compete on a non-trivial benchmark.

See you next week in Chile!

Update (January 13th, 2016)

The following awards were given at ICCV 2015.

Achievement awards:

  • PAMI Distinguished Researcher Award (1): Yann LeCun
  • PAMI Distinguished Researcher Award (2): David Lowe
  • PAMI Everingham Prize Winner (1): Andrea Vedaldi for VLFeat
  • PAMI Everingham Prize Winner (2): Daniel Scharstein and Rick Szeliski for the Middlebury Datasets

Paper awards:

  • PAMI Helmholtz Prize (1): David Martin, Charles Fowlkes, Doron Tal, and Jitendra Malik for their ICCV 2001 paper "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics".
  • PAMI Helmholtz Prize (2): Serge Belongie, Jitendra Malik, and Jan Puzicha, for their ICCV 2001 paper "Matching Shapes".
  • Marr Prize: Peter Kontschieder, Madalina Fiterau, Antonio Criminisi, and Samuel Rota Bulò, for "Deep Neural Decision Forests".
  • Marr Prize honorable mention: Saining Xie and Zhuowen Tu for "Holistically-Nested Edge Detection".

For more information about awards, see Sebastian Nowozin's ICCV-day-2 blog post.

I also wrote another ICCV-related blog post (January 13, 2016) about the Future of Real-Time SLAM.

Bio: Tomasz Malisiewicz is an Entrepreneur, Scientist, and the Co-Founder of vision.ai. Previously he was a Postdoctoral Scholar at MIT's Computer Science and Artificial Intelligence Laboratory, obtained a PhD in Robotics from Carnegie Mellon University, and studied Physics/CS as an undergrad.

Original. Reposted with permission.