10 More Free Must-Read Books for Machine Learning and Data Science

Summer, summer, summertime. Time to sit back and unwind. Or get your hands on some free machine learning and data science books and get your learn on. Check out this selection to get you started.



It's time for another collection of free machine learning and data science books to kick off your summer learning season. Because that's a thing. Right?

If, after reading this list, you find yourself wanting more free quality, curated books, check the previous iteration of this series or the related posts below.


 
1. Python Data Science Handbook
By Jake VanderPlas

The book introduces the core libraries essential for working with data in Python: particularly IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related packages. Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project, A Whirlwind Tour of Python: it's a fast-paced introduction to the Python language aimed at researchers and scientists.

 
2. Neural Networks and Deep Learning
By Michael Nielsen

Neural Networks and Deep Learning is a free online book. The book will teach you about:

  • Neural networks, a beautiful biologically-inspired programming paradigm which enables a computer to learn from observational data
  • Deep learning, a powerful set of techniques for learning in neural networks

Neural networks and deep learning currently provide the best solutions to many problems in image recognition, speech recognition, and natural language processing. This book will teach you many of the core concepts behind neural networks and deep learning.

 
3. Think Bayes
By Allen B. Downey

Think Bayes is an introduction to Bayesian statistics using computational methods.

The premise of this book, and the other books in the Think X series, is that if you know how to program, you can use that skill to learn other topics.

Most books on Bayesian statistics use mathematical notation and present ideas in terms of mathematical concepts like calculus. This book uses Python code instead of math, and discrete approximations instead of continuous mathematics. As a result, what would be an integral in a math book becomes a summation, and most operations on probability distributions are simple loops.
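To make the "integral becomes a summation" idea concrete, here is a minimal sketch in plain Python. It is not taken from the book: the coin-flip setup, the function name `posterior_on_grid`, and the grid size are all assumptions chosen for illustration. It estimates a coin's bias by evaluating the posterior on a grid of candidate values and normalizing with a sum, where a math text would integrate.

```python
def posterior_on_grid(heads, tails, grid_size=101):
    """Discrete approximation of the posterior over a coin's bias.

    Hypothetical helper for illustration; assumes a uniform prior.
    """
    # Candidate bias values 0.00, 0.01, ..., 1.00
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0 / grid_size] * grid_size

    # Prior times likelihood at each grid point: a simple loop
    unnormalized = []
    for p, weight in zip(grid, prior):
        likelihood = p ** heads * (1 - p) ** tails
        unnormalized.append(weight * likelihood)

    # The normalizing constant is a summation, not an integral
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]


grid, post = posterior_on_grid(heads=7, tails=3)

# Posterior mean, again computed as a sum over the grid
mean = sum(p * w for p, w in zip(grid, post))
print(round(mean, 3))
```

With 7 heads and 3 tails the posterior mean should land close to 2/3, the value the continuous calculation gives for a uniform prior. A finer grid trades extra computation for a closer approximation.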

 
4. Machine Learning & Big Data
By Kareem Alkaseer

This is a work in progress, which I add to as time allows. The purpose behind it is to have a balance between theory and implementation for the software engineer to implement machine learning models comfortably without relying too much on libraries. Most of the time the concept behind a model or a technique is simple or intuitive but it gets lost in details or jargon. Also, most of the time existing libraries would solve the problem at hand but they are treated as black boxes and more often than not they have their own abstractions and architectures that hide the underlying concepts. This book's attempt is to make the underlying concepts clear.

 
5. Statistical Learning with Sparsity: The Lasso and Generalizations
By Trevor Hastie, Robert Tibshirani, Martin Wainwright

During the past decade there has been an explosion in computation and information technology. With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. This book describes the important ideas in these areas in a common conceptual framework.

 
6. Statistical inference for data science
By Brian Caffo

This book is written as a companion book to the Statistical Inference Coursera class as part of the Data Science Specialization. However, if you do not take the class, the book mostly stands on its own. A useful component of the book is a series of YouTube videos that comprise the Coursera class.

The book is intended to be a low-cost introduction to the important field of statistical inference. The intended audience is students who are numerically and computationally literate and who would like to put those skills to use in Data Science or Statistics. The book is offered for free as a series of markdown documents on GitHub and in more convenient forms (epub, mobi) on LeanPub and retail outlets.

 
7. Convex Optimization
By Stephen Boyd and Lieven Vandenberghe

This book is about convex optimization, a special class of mathematical optimization problems, which includes least-squares and linear programming problems. It is well known that least-squares and linear programming problems have a fairly complete theory, arise in a variety of applications, and can be solved numerically very efficiently. The basic point of this book is that the same can be said for the larger class of convex optimization problems.

 
8. Natural Language Processing with Python
By Steven Bird, Ewan Klein, and Edward Loper

This is a book about Natural Language Processing. By "natural language" we mean a language that is used for everyday communication by humans; languages like English, Hindi or Portuguese. In contrast to artificial languages such as programming languages and mathematical notations, natural languages have evolved as they pass from generation to generation, and are hard to pin down with explicit rules. We will take Natural Language Processing — or NLP for short — in a wide sense to cover any kind of computer manipulation of natural language.
...
The book is based on the Python programming language together with an open source library called the Natural Language Toolkit (NLTK).

 
9. Automate the Boring Stuff with Python
By Al Sweigart

If you've ever spent hours renaming files or updating hundreds of spreadsheet cells, you know how tedious tasks like these can be. But what if you could have your computer do them for you?

In Automate the Boring Stuff with Python, you'll learn how to use Python to write programs that do in minutes what would take you hours to do by hand, with no prior programming experience required. Once you've mastered the basics of programming, you'll create Python programs that effortlessly perform useful and impressive feats of automation.
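As a taste of the kind of task the blurb mentions, here is a small sketch of bulk file renaming using only the Python standard library. It is not from the book: the helper name `add_prefix`, the `report_` prefix, and the file names are hypothetical, and the demo runs in a temporary directory so nothing on disk is touched.

```python
import tempfile
from pathlib import Path


def add_prefix(folder, prefix):
    """Rename every regular file in `folder` by prepending `prefix`.

    Hypothetical helper for illustration; files that already carry the
    prefix are skipped. Returns the new file names.
    """
    renamed = []
    # sorted() builds the full listing first, so renaming inside the
    # loop does not disturb the iteration
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and not path.name.startswith(prefix):
            target = path.with_name(prefix + path.name)
            path.rename(target)
            renamed.append(target.name)
    return renamed


# Demo in a throwaway directory with two empty files
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt"):
        (Path(tmp) / name).touch()
    result = add_prefix(tmp, "report_")
    print(result)
```

Pointing the same function at a real folder would rename its files in place, so it is worth trying on a copy first.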

 
10. Social Media Mining: An Introduction
By Reza Zafarani, Mohammad Ali Abbasi and Huan Liu

The growth of social media over the last decade has revolutionized the way individuals interact and industries conduct business. Individuals produce data at an unprecedented rate by interacting, sharing, and consuming content through social media. Understanding and processing this new type of data to glean actionable patterns presents challenges and opportunities for interdisciplinary research, novel algorithms, and tool development. Social Media Mining integrates social media, social network analysis, and data mining to provide a convenient and coherent platform for students, practitioners, researchers, and project managers to understand the basics and potentials of social media mining.

 
Related: