
Top 10 Deep Learning Projects on Github

The top 10 deep learning projects on Github include a number of libraries, frameworks, and educational resources. Have a look at the tools others are using, and the resources they are learning from.

Open source tools are increasingly important in the data science workflow.

Recent KDnuggets software poll results indicate that 73% of data scientists used free data science tools within the previous 12 months. This is easily digestible, given that Python and R (both of which are open source), along with their respective ecosystems, are some of the most prominent and essential tools that data scientists wield.

Stars vs. Forks

Github has become the de facto open source software clearinghouse, hosting all imaginable types of projects. Given the growing adoption of deep learning among academics, researchers, and hobbyists, and its increasing role in data science, we are exploring the top deep learning projects available on Github.*

It should be noted that several rather prominent projects that most of us would consider to be "deep learning" projects do not appear on our list as they do not show up as results when searching "deep learning" on Github.
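As a rough illustration of the selection method described in the footnote, the sketch below builds a Github repository search URL for the query "deep learning" sorted by stars, then ranks a few entries from this list by the star counts quoted in this article. The helper names (`build_search_url`, `rank_by_stars`) are hypothetical, the search endpoint is an assumption not named in the article, the URL is only constructed and never requested, and the counts are the ones printed here rather than live data.

```python
from urllib.parse import urlencode

# Github's public repository search endpoint (an assumption; not named in the article).
SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_search_url(query="deep learning"):
    """Build a search URL that sorts matching repositories by star count."""
    params = {"q": query, "sort": "stars", "order": "desc"}
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def rank_by_stars(repos):
    """Return (name, stars, forks) tuples ordered from most to fewest stars."""
    return sorted(repos, key=lambda repo: repo[1], reverse=True)


# Star and fork counts as quoted in this article, deliberately shuffled.
repos = [
    ("Keras", 3852, 896),
    ("Caffe", 7905, 4482),
    ("DeepLearnToolbox", 1651, 1202),
    ("ConvNetJS", 3924, 736),
]

url = build_search_url()
ranked = rank_by_stars(repos)
for name, stars, forks in ranked:
    # Forks per star is a crude measure of how often a starred repo is also forked.
    print(f"{name}: {stars} stars, {forks} forks, {forks / stars:.2f} forks per star")
```

Sorting by stars rather than forks is the choice the footnote states; the forks-per-star figure is included only as a simple way to compare the two counts.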

1. Caffe

★ 7905, Fork 4482

Caffe is a deep learning library with Python and MATLAB bindings. Because it originated at the Berkeley Vision and Learning Center at UC Berkeley, one could be forgiven for believing Caffe is only for computer vision applications; it is, in fact, a general-purpose deep learning library for deploying convolutional networks, as well as other architectures, in vision, speech, and other applications.

2. Data Science IPython Notebooks

★ 4386, Fork 697

This is a collection of IPython notebooks curated by Donne Martin. Topics covered include: Big Data, Hadoop, scikit-learn and the scientific Python stack, as well as many others. Concerning deep learning, frameworks such as TensorFlow, Theano, and Caffe are covered, along with particular architectures and concepts.

3. ConvNetJS

★ 3924, Fork 736


Written by Stanford PhD student Andrej Karpathy, who also maintains a very enlightening blog, ConvNetJS is a JavaScript implementation of neural networks and their common modules, and includes numerous browser-based examples. The documentation is complete and the examples are numerous. Don't let the idea of JavaScript and neural nets together scare you off; this is a popular and useful project to be sure.

4. Keras

★ 3852, Fork 896

Keras is a Python deep learning library which leverages both TensorFlow and Theano, meaning that it can be run on top of either of what are arguably two of the most popular deep learning research libraries currently in existence. It is one of a growing number of what could be described as very high level libraries, all of which function similarly: they abstract the underlying deep learning engines for quicker, easier, and more flexible neural net implementations. Keras supports the major deep learning architectures, comes with a 30-second quick start guide, and has solid documentation.

5. MXNet

★ 3278, Fork 737

As a deep learning framework, MXNet aims for both flexibility and efficiency, and allows the mixing of imperative and symbolic programming techniques to improve productivity. The project includes bindings for numerous languages, including Python, R, and Julia. MXNet also comes with an array of neural network guidelines and blueprints. Also of note, a related project implements MXNet in JavaScript for the browser, and an image classification model can be tested via this link.
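To make the distinction between imperative and symbolic programming concrete, here is a toy sketch in plain Python. It is not MXNet's API: the `Node`, `Var`, `Const`, `Add`, and `Mul` classes and the `wrap` helper are hypothetical stand-ins that only show the idea of describing a computation first and evaluating it later, as opposed to computing each step immediately.

```python
# Imperative style: each statement runs immediately and yields a concrete value.
def imperative(a, b):
    c = a + b      # evaluated right away
    return c * 2   # evaluated right away


# Symbolic style: operators build a graph of nodes; nothing runs until evaluate().
class Node:
    """Base class for graph nodes; operators return new nodes instead of values."""

    def __add__(self, other):
        return Add(self, wrap(other))

    def __mul__(self, other):
        return Mul(self, wrap(other))


class Var(Node):
    """A named placeholder whose value is supplied at evaluation time."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, env):
        return env[self.name]


class Const(Node):
    """A fixed value embedded in the graph."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, env):
        return self.value


class Add(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, env):
        return self.left.evaluate(env) + self.right.evaluate(env)


class Mul(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, env):
        return self.left.evaluate(env) * self.right.evaluate(env)


def wrap(value):
    """Turn plain Python numbers into Const nodes; leave nodes untouched."""
    return value if isinstance(value, Node) else Const(value)


a, b = Var("a"), Var("b")
graph = (a + b) * 2  # builds a Mul node; no arithmetic has happened yet
symbolic_result = graph.evaluate({"a": 3, "b": 4})
imperative_result = imperative(3, 4)
print(imperative_result, symbolic_result)
```

In general, deferring evaluation gives a framework the chance to inspect or optimize the whole graph before running it, while the imperative style is easier to step through and debug. Mixing the two is the trade-off the project description points at.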

6. Qix

★ 2253, Fork 961

So... this is a repo of what appears to be resources related to a wide range of computing and programming topics, including Node.js, Golang, and deep learning. I say "appears" because the repo is written entirely in Chinese, and the translation provided by Google causes even more confusion. However, there are a number of links, so if you speak or can read Chinese, perhaps there is something of value hidden in here.

7. Deeplearning4j

★ 1824, Fork 612


Deeplearning4j is an industrial-strength deep learning framework for Java and Scala. As one of the few JVM deep learning solutions of note, Deeplearning4j has an obvious advantage in the space. It integrates with Hadoop and Spark, and can also leverage GPUs. Its documentation and tutorials are also very solid.

8. Machine Learning Tutorials

★ 1759, Fork 195

This is a curated list of machine learning and deep learning tutorials, articles, and resources. Organized by topic, the list contains numerous deep learning related categories, including computer vision, reinforcement learning, and various architectures. It has been making the rounds on social media for some months, given its extensive content, and you can contribute by looking here.

9. DeepLearnToolbox

★ 1651, Fork 1202

This is a deep learning toolbox for MATLAB and Octave. However, the project is currently deprecated and no longer being maintained. The repo does point to Theano and TensorFlow as valuable alternatives for pursuing deep learning today.

If there is a silver lining to this deprecated cloud, it is that the repo recommends a book by Yoshua Bengio as a resource for learning deep learning architectures for AI.


10. LISA Lab Deep Learning Tutorials

★ 1555, Fork 944

This repo is a collection of the University of Montreal's LISA Lab deep learning tutorials. Directly from the readme:

The tutorials presented here will introduce you to some of the most important deep learning algorithms and will also show you how to run them using Theano. Theano is a python library that makes writing deep learning models easy, and gives the option of training them on a GPU.

Here is a direct link to browsing the tutorials online.

* Determined by the top returned results to the query "deep learning" on Github search, sorted by most stars, as of January 10, 2016, 10:00 PM EST.

Bio: Matthew Mayo is a computer science graduate student currently working on his thesis parallelizing machine learning algorithms. He is also a student of data mining, a data enthusiast, and an aspiring machine learning scientist.