cuDNN – A New Library for Deep Learning

Deep learning keeps growing in popularity and has proved useful across artificial intelligence. Last week, NVIDIA’s new library for deep neural networks, cuDNN, attracted a lot of attention.

NVIDIA released cuDNN, a GPU-accelerated library of primitives for deep neural networks, last week. U.C. Berkeley researchers have integrated it into Caffe, and Facebook AI Research has contributed Torch 7 bindings for its ConvNet routines.


In our own benchmarking using cuDNN with a leading neural network package called Caffe, we obtain more than a 10X speed-up when training the “reference Imagenet” DNN model on an NVIDIA Tesla K40 GPU, compared to an Intel Ivy Bridge CPU.

Here is a comment from Yann LeCun’s Facebook page:

An increasingly large number of NVIDIA GPU cycles are being spent training and running convolutional nets. It is an awesome move on NVIDIA’s part to be offering direct support for convolutional nets.

Benchmark results from Soumith showed that cuDNN runs faster than major deep learning frameworks such as Theano, Caffe, and cuda-convnet in a few specific configurations.

With cuDNN, we’re doing the work of optimizing the low-level routines used in these deep learning systems (e.g., convolutions) so that the people developing those systems need not spend their time doing so. Instead, they can focus their attention on the higher-level machine learning questions and advance the state of the art, while relying on us to make their code run faster with GPU accelerators.
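To make "low-level routines (e.g., convolutions)" concrete, here is a minimal sketch of what a single-channel 2D convolution computes, written as plain Python loops. It is only an illustration of the operation itself, not cuDNN's API or implementation; the function name `conv2d_valid` and the sample data are hypothetical. As in most deep learning frameworks, it computes cross-correlation (the kernel is not flipped), with stride 1 and no padding.

```python
def conv2d_valid(image, kernel):
    """Naive 2D cross-correlation: stride 1, no padding ('valid' mode).

    `image` and `kernel` are lists of lists of floats (hypothetical layout).
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1  # output height and width
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0.0
            # Slide the kernel over the image and sum the element-wise products.
            for i in range(kh):
                for j in range(kw):
                    acc += image[y + i][x + j] * kernel[i][j]
            out[y][x] = acc
    return out


image = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
kernel = [
    [1.0, 0.0],
    [0.0, -1.0],
]
result = conv2d_valid(image, kernel)
print(result)  # [[-4.0, -4.0], [-4.0, -4.0]]
```

The four nested loops are the reason this operation dominates training time: a real network adds further loops over batches, input channels, and output filters, which is the kind of work a tuned GPU library takes over.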

In the 2014 ImageNet Challenge, over 90 percent of the teams used GPUs to train their deep learning models. There is no doubt that more cuDNN support will be available in the near future.