Top 10 Amazon Books in Databases & Big Data, 2016 Edition

Given the ongoing explosion of interest in all things Data Science, Artificial Intelligence, Machine Learning, etc., we have updated our Amazon top books lists from last year. Here are the 10 most popular titles in the Databases & Big Data category.

The recent explosion of interest in data science, data mining, and related disciplines has been mirrored by an explosion in book titles on these same topics. One of the best ways to decide which books could be useful for your career is to look at which books others are reading. This post details the 10 most popular titles in Amazon's Databases & Big Data Books category as of Dec 12, 2016, skipping over repeated titles as well as titles which have been obviously miscategorized and are of no use to our readers.

Keep in mind that there is some natural overlap between this and previous categorical lists. While I have stated above that I take some editorial license in excluding miscategorized items, any attempt to separate intricately related disciplines from one another would prove futile (and tormenting). Thus, while you may find some items here which are more, for example, machine learning than database and/or big data, I would suggest you consider them holistically rather than letting the overlap ruin your day :)

Note: KDnuggets gets absolutely no royalties from Amazon - this list is presented only to help our readers evaluate interesting books.

Amazon Top Database/Big Data Books

1. SQL: The Ultimate Guide From Beginner To Expert - Learn And Master SQL In No Time! (2016 Edition)
Peter Adams
4.5 out of 5 stars (133 reviews)
Kindle, $0.75

With this quick guide, Peter Adams will take you from beginner to expert in a flash. Each chapter covers all the key concepts you need to master to become a database ninja. You’ll work through all the fundamentals and get detailed examples, but the book doesn’t stop there. You also get an entire chapter of expert recommendations on working in the tech industry, from staying healthy and organized to fixing or building a computer.

2. Data-ism: The Revolution Transforming Decision Making, Consumer Behavior, and Almost Everything Else
Steve Lohr
4.2 out of 5 stars (22 reviews)
Kindle, $1.73

In Data-ism, New York Times reporter Steve Lohr explains how big-data technology is ushering in a revolution in proportions that promise to be the basis of the next wave of efficiency and innovation across the economy. But more is at work here than technology. Big data is also the vehicle for a point of view, or philosophy, about how decisions will be—and perhaps should be—made in the future. Lohr investigates the benefits of data while also examining its dark side.

3. Python Machine Learning
Sebastian Raschka
4.3 out of 5 stars (83 reviews)
Paperback, $40.49 (Kindle, $22.39)

  • Leverage Python's most powerful open-source libraries for deep learning, data wrangling, and data visualization
  • Learn effective strategies and best practices to improve and optimize machine learning systems and algorithms
  • Ask – and answer – tough questions of your data with robust statistical models, built for a range of datasets

4. Data Science from Scratch: First Principles with Python 1st Edition
Joel Grus
4.2 out of 5 stars (68 reviews)
Paperback, $32.04

If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.

5. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition
Trevor Hastie, Robert Tibshirani, Jerome Friedman
4.1 out of 5 stars (80 reviews)
Hardcover, $74.85

This book describes the important ideas in data mining, inference, and prediction in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees, and boosting (the first comprehensive treatment of this topic in any book).

6. SQL in 10 Minutes, Sams Teach Yourself (4th Edition)
Ben Forta
4.6 out of 5 stars (456 reviews)
Paperback, $25.32

Expert trainer and popular author Ben Forta teaches you just the parts of SQL you need to know, starting with simple data retrieval and quickly going on to more complex topics including the use of joins, subqueries, stored procedures, cursors, triggers, and table constraints.

You'll learn methodically, systematically, and simply, in 22 short, quick lessons that will each take only 10 minutes or less to complete.

7. Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking 1st Edition
Foster Provost, Tom Fawcett
4.6 out of 5 stars (154 reviews)
Paperback, $33.79

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today.

8. Machine Learning With Random Forests And Decision Trees: A Mostly Intuitive Guide, But Also Some Python
Scott Hartshorn
4.6 out of 5 stars (20 reviews)
Kindle, $2.27

This book is focused on understanding Random Forests at the conceptual level: knowing how they work, why they work the way that they do, and what options are available to improve results. This book covers how Random Forests work in an intuitive way, and also explains the equations behind many of the functions, but it only has a small amount of actual code (in Python).

9. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
Wes McKinney
4.2 out of 5 stars (138 reviews)
Paperback, $31.99

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.

10. SQL Queries for Mere Mortals: A Hands-On Guide to Data Manipulation in SQL (3rd Edition)
John L. Viescas, Michael J. Hernandez
4.4 out of 5 stars (79 reviews)
Paperback, $36.64

Step by step, John L. Viescas and Michael J. Hernandez guide you through creating reliable queries for virtually any modern SQL-based database. They demystify all aspects of SQL query writing, from simple data selection and filtering to joining multiple tables and modifying sets of data.

Three brand-new chapters teach you how to solve a wide range of challenging SQL problems. You’ll learn how to write queries that apply multiple complex conditions on one table, perform sophisticated logical evaluations, and think “outside the box” using unlinked tables.