Dive Into Deep Learning: The Free eBook
This freely available text on deep learning is fully interactive and incredibly thorough. Check out "Dive Into Deep Learning" now to deepen your theoretical understanding of neural networks and sharpen your practical implementation skills.
What makes Dive into Deep Learning (D2L) unique is that we went so far with the idea of *learning by doing* that the entire book itself consists of runnable code. We tried to combine the best aspects of a textbook (clarity and math) with the best aspects of hands-on tutorials (practical skills, reference code, implementation tricks, and intuition). Each chapter section teaches a single key idea through multiple modalities, interweaving prose, math, and a self-contained implementation that can easily be grabbed and modified to give your projects a running start. We think this approach is essential for teaching deep learning because so much of the core knowledge in deep learning is derived from experimentation (vs. first principles).
— Zachary Lipton
Thanks to the current realities associated with COVID-19, lots of us around the world are spending more time at home than we normally do, and some of us may have additional idle time on our hands. For those of us looking to spend some of this idle time learning something new or reviewing something previously learned, we have been (and hope to continue) spotlighting a few select standout textbooks of interest in data science and related fields. This is the next entry in the series.
Once you have acquired the requisite mathematical foundations for machine learning, perhaps you are interested in turning your attention to neural networks and deep learning. There are many fine books available for someone looking to go this route, though few offerings tick the boxes of being freely available, up to date, and incredibly thorough. One such exemplar is Dive Into Deep Learning, by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola, a book which rightly bills itself as "[a]n interactive deep learning book with code, math, and discussions, based on the NumPy interface."
This book is great for a number of reasons. First off, and perhaps most importantly, it delivers on the promise of being interactive. The book is written in Jupyter notebooks, and so the code in chapters can be executed to see immediate results, as well as fine-tuned for inquisitive comparison. There is flexibility in how to execute these notebooks:
 Download the entire book in notebook form to read and execute locally
 Execute the notebooks on AWS using Amazon SageMaker
 Launch Google Colab notebooks directly from corresponding chapters by clicking the "Colab" link in the online version of the book
Of course, if you just want to download a PDF to read like it's 2015, you can do that, too.
Another notable attribute is that this second iteration of the book has adopted a NumPy-style interface for its code examples. The benefit of this is an immediate sense of familiarity for those who have been dabbling in the Python ecosystem for any length of time. In a world where numerous deep learning frameworks have implemented their own API styles, it's nice to see this text adopt tools such as PyTorch and MXNet's Gluon with their NumPy-like interfaces. This makes the transition more seamless for those coming from, and already comfortable with, the Python stack built on top of NumPy.
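To give a rough sense of why that familiarity matters, here is what typical tensor code looks like in plain NumPy; in MXNet's NumPy-style interface the same lines would begin with `from mxnet import np` instead. This is a hedged sketch of my own — the shapes and variable names are invented for illustration and do not come from the book:

```python
import numpy as np

# A batch of 2 input vectors with 4 features each
X = np.arange(8.0).reshape(2, 4)

# A weight matrix mapping 4 features to 3 outputs, plus a bias vector
W = np.ones((4, 3))
b = np.zeros(3)

# A linear layer is just familiar NumPy: matrix multiply plus broadcast add
Y = X @ W + b

print(Y.shape)  # (2, 3)
```

The point of the book's NumPy-first design is that idioms like `reshape`, `@`, and broadcasting carry over almost verbatim between NumPy and the deep learning frameworks it targets.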
The book is also up to date, with a major revision having taken place within two weeks of this article's writing — a revamp of the NLP chapters, including the addition of sections on BERT and language inference. This means you aren't learning the best practices of three years ago (a very long time in the world of neural networks, at least in some respects), and claims of cutting-edge, state-of-the-art coverage really are what they purport to be here.
The full table of contents is as follows:
 Introduction
 Preliminaries
 Linear Neural Networks
 Multilayer Perceptrons
 Deep Learning Computation
 Convolutional Neural Networks
 Modern Convolutional Neural Networks
 Recurrent Neural Networks
 Modern Recurrent Neural Networks
 Attention Mechanisms
 Optimization Algorithms
 Computational Performance
 Computer Vision
 Natural Language Processing: Pretraining
 Natural Language Processing: Applications
 Recommender Systems
 Generative Adversarial Networks
 Appendix: Mathematics for Deep Learning
 Appendix: Tools for Deep Learning
Given that the book was written by academics who seem to have designed it for use in an academic setting, it should come as no surprise that at least one of the authors has developed a course built from accessible, complementary materials such as slides, videos, and the like.
For a sense of the elegant and effective prose you will find in the book, here's an excerpt taken from section 14.8.1, From Context-Independent to Context-Sensitive:
For example, by taking the entire sequence as the input, ELMo is a function that assigns a representation to each word from the input sequence. Specifically, ELMo combines all the intermediate layer representations from pretrained bidirectional LSTM as the output representation. Then the ELMo representation will be added to a downstream task’s existing supervised model as additional features, such as by concatenating ELMo representation and the original representation (e.g., GloVe) of tokens in the existing model. On one hand, all the weights in the pretrained bidirectional LSTM model are frozen after ELMo representations are added. On the other hand, the existing supervised model is specifically customized for a given task. Leveraging different best models for different tasks at that time, adding ELMo improved the state of the art across six natural language processing tasks: sentiment analysis, natural language inference, semantic role labeling, coreference resolution, named entity recognition, and question answering.
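The mechanism the excerpt describes — concatenating a frozen, pretrained contextual representation with a static embedding such as GloVe before feeding a downstream task model — can be sketched in a few lines of NumPy. Everything below (the dimensions, the random stand-in vectors) is invented purely for illustration; real ELMo vectors would come from a pretrained bidirectional LSTM:

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len = 5        # tokens in the input sentence
glove_dim = 100    # static GloVe embedding size (illustrative)
elmo_dim = 256     # contextual ELMo representation size (illustrative)

# Stand-ins for real embeddings: GloVe vectors are context-independent,
# while ELMo vectors depend on the entire input sequence and come from
# a frozen pretrained model.
glove = rng.standard_normal((seq_len, glove_dim))
elmo = rng.standard_normal((seq_len, elmo_dim))

# The downstream supervised model sees the token-by-token concatenation
# of both; only the task model's own weights are trained.
features = np.concatenate([glove, elmo], axis=1)

print(features.shape)  # (5, 356)
```

The key design point from the excerpt is visible even in this toy version: the pretrained representation is treated as fixed additional features, so the task-specific model can be customized freely without touching the pretrained weights.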
Are you interested, but don't know if you should take my word for it? Here's what others have said about the book.
"In less than a decade, the AI revolution has swept from research labs to broad industries to every corner of our daily life. Dive into Deep Learning is an excellent text on deep learning and deserves attention from anyone who wants to learn why deep learning has ignited the AI revolution: the most powerful technology force of our time."
— Jensen Huang, Founder and CEO, NVIDIA

"This is a timely, fascinating book, providing not only a comprehensive overview of deep learning principles but also detailed algorithms with hands-on programming code, and moreover, a state-of-the-art introduction to deep learning in computer vision and natural language processing. Dive into this book if you want to dive into deep learning!"
— Jiawei Han, Michael Aiken Chair Professor, University of Illinois at Urbana-Champaign

"This is a highly welcome addition to the machine learning literature, with a focus on hands-on experience implemented via the integration of Jupyter notebooks. Students of deep learning should find this invaluable to become proficient in this field."
— Bernhard Schölkopf, Director, Max Planck Institute for Intelligent Systems
Dive Into Deep Learning is less a book on deep learning than it is a fully interactive experience on the topic. Whether you are starting out on your neural networks journey or are looking to refine your understanding, Dive Into Deep Learning and its presentation format will undoubtedly be helpful.
Related:
 Mathematics for Machine Learning: The Free eBook
 Another 10 Free Must-Read Books for Machine Learning and Data Science
 24 Best (and Free) Books To Understand Machine Learning