Weekend Reading List: Free eBooks and Other Online Resources

Do you have free time for reading this weekend? Here are a few new (or refreshed) selections of varying length for your leisure, along with a pair of papers, one cutting edge, and one classic.

The weekend is here. Time to get away from it all, enjoy our families, friends, and free time... and read up on the latest in data science, machine learning, and analytics. Right?

For those of us who can't completely disconnect, or are otherwise interested in reading up over the weekend, the following is a roundup of some of the best free recent ebooks and other online reading resources, as well as a classic throwback article worthy of the attention of newcomers to the field of machine learning.

Deep Learning

As reported earlier this week, the MIT Press Deep Learning book is finished, and the online version has been finalized. Written by deep learning heavyweights Ian Goodfellow, Yoshua Bengio, and Aaron Courville, the book is poised to become the definitive deep learning book on the market. At over 700 pages, and being quite technical in content, this isn't a simple one-weekend read (at least, not for the majority of folks), but getting started this weekend means only a few more weekends will be needed to finish it.

As a complement to the book, check out these (unofficial) crib notes.

Recommender Systems

This free resource, a dedicated ebook on recommender systems, has been making the rounds this week. Written by Subhrajit Roy, machine learning scientist and educator, the book is split into 6 chapters in 37 pages, and provides an introductory, high-level overview of recommender systems for the uninitiated. The book has been getting some positive feedback online, with comments supporting its introductory viewpoint.

For those interested, a complement to this ebook would be an introductory tutorial on recommender system implementation, something like this tutorial here, which does so in Python.

10 Signs of Data Science Maturity

From O'Reilly, this downloadable ebook report, written by Peter Guerra and Kirk Borne (both of Booz Allen Hamilton), is a quick read at a dozen pages, but packs a wealth of info on data science goals vis-a-vis lessons learned from years in the field. The book outlines ten characteristics of a mature data science capability, which the book's website claims encourage you to:

  • Give members of your organization access to all your available data
  • Use Agile and leverage "DataOps"—DevOps for data product development
  • Help your data science team sharpen its skills through open or internal competitions
  • Personify data science as a way of doing things, and not a thing to do

Deep Networks with Stochastic Depth

This relatively new deep learning research paper details an innovative method for training deep neural networks. Training very deep networks comes with well-known challenges: vanishing gradients, diminishing forward flow, and slow training times. The proposed solution is stochastic depth. The authors describe this method as:

[A] training procedure that enables the seemingly contradictory setup to train short networks and obtain deep networks. We start with very deep networks but during training, for each mini-batch, randomly drop a subset of layers and bypass them with the identity function. The resulting networks are short (in expectation) during training and deep during testing.

This paper was recently written about by Delip Rao; if you don't have the time to read the original paper, check out his post for some quick insight.
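
For a rough feel of the idea in the quote above, here is a minimal pure-Python sketch, not the authors' implementation. The toy `residual_block`, the function names, and the list-of-floats "activations" are all hypothetical stand-ins. The sketch also assumes two details not stated in the quote: that each block's survival probability decays linearly with depth, and that at test time every block is kept with its output scaled by its survival probability.

```python
import random


def residual_block(x, weight):
    # Hypothetical toy transform standing in for a real conv/ReLU block.
    return [max(0.0, weight * v) for v in x]


def linear_decay_probs(num_blocks, p_last=0.5):
    # Assumed schedule: survival probability falls linearly from near 1
    # at the first block down to p_last at the final block.
    return [1.0 - (l / num_blocks) * (1.0 - p_last)
            for l in range(1, num_blocks + 1)]


def forward(x, weights, survival_probs, training, rng=random):
    for w, p in zip(weights, survival_probs):
        if training:
            # For this mini-batch, keep the block with probability p;
            # otherwise bypass it with the identity (x is left unchanged).
            if rng.random() < p:
                fx = residual_block(x, w)
                x = [a + b for a, b in zip(x, fx)]
        else:
            # Assumed test-time rule: use every block, scaled by how
            # often it was active during training.
            fx = residual_block(x, w)
            x = [a + p * b for a, b in zip(x, fx)]
    return x


if __name__ == "__main__":
    probs = linear_decay_probs(4)
    weights = [1.0, 0.5, 0.25, 0.125]
    print(probs)
    print(forward([1.0, -2.0], weights, probs, training=True))
    print(forward([1.0, -2.0], weights, probs, training=False))
```

Because blocks are skipped at random, the training-mode output varies from call to call, while the test-mode output is deterministic. That is the "short in expectation during training, deep during testing" behavior the authors describe.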

BONUS: A Few Useful Things to Know About Machine Learning

This now-classic paper by Pedro Domingos (2012) recently re-appeared on /r/MachineLearning, generating some renewed interest and exposure. I read this a number of years ago when beginning the journey to data-mining-slash-machine-learning-focused graduate studies, and it was influential in my understanding of exactly what it was I was getting into at the time. Skimming it again suggests that the paper is still relevant for newcomers, as does the interest on Reddit. If you're new to machine learning, there are many places you could choose as your very first step; Domingos' paper is just as good as any of them.

Hopefully there is something here of interest for some weekend reading. Remember, though, that the same material will still be here on Monday...