# 7 Books to Grasp Mathematical Foundations of Data Science and Machine Learning

It is vital to have a good understanding of the mathematical foundations to be proficient with data science. With that in mind, here are seven books that can help.

Most people learn Data Science with an emphasis on Programming. However, to be truly proficient with Data Science (and Machine Learning), you cannot ignore the mathematical foundations behind it. In this post, I present seven books that I enjoyed while learning the mathematical foundations of Data Science. ‘Enjoy’ is perhaps not the best of words, since this effort is hard going!

So, why should you undertake the effort of learning the Maths foundations of Data Science?

Here are some reasons which motivated me:

AI is rapidly changing; Geoffrey Hinton already believes we should rethink backpropagation. Understanding the Maths will help you understand the evolution of AI better. It will help you distinguish yourself from others who approach AI at a superficial level. It will also help you to see the Intellectual Property (IP) potential of AI better. Finally, understanding the Maths behind Data Science could also lead you to higher-end jobs in AI and Data Science.

I have two additional motivations for working with these books.

1. First, I have included the maths-based approach in the Data Science for Internet of Things course that I teach at Oxford University, and also in my personal teaching on AI applications.
2. Second, I am writing a book to simplify AI from a Maths perspective for 14- to 18-year-olds. To understand the foundations of Maths for Data Science and AI, you need to know four things: Linear Algebra, Probability Theory, Multivariate Calculus, and Optimization. Most of these are taught (at least partially) in high schools. I am thus trying to relate high school maths to AI and Data Science with an emphasis on Mathematical modelling. Comments are welcome on this approach.

So, here is the list of books with my comments:

1. The Nature of Statistical Learning Theory
By Vladimir Vapnik

You cannot create a list of Maths books and not include the great Russian mathematicians! So, the first on my list is The Nature of Statistical Learning Theory by Vladimir Vapnik. Of all the books in this list, Vapnik is the hardest to find. I have an older Indian edition. Vladimir Vapnik is one of the inventors of the support vector machine (SVM). His Wikipedia page has a lot more about his work.

2. Pattern Classification
By Richard O. Duda, Peter E. Hart and David G. Stork

Like Dr Vapnik’s book, Duda is another classic from another era. It was first published in 1973 and updated 27 years later (2000), with nothing since! Yet it remains a vital resource. The book takes a pattern recognition approach and provides extensive coverage of algorithms.

3. Machine Learning: An Algorithmic Perspective
By Stephen Marsland

Stephen Marsland’s book is now in its second edition. It was one of the earliest books I read (I only have the first edition). Both editions are very good. The second edition, I believe, has a lot more code in Python. Like the first two books, this book also places a heavy emphasis on Algorithms.

4. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition
By Trevor Hastie, Robert Tibshirani, Jerome Friedman

Hastie is another classic. The version I have is very well printed, in colour. This is another reference book.

5. Pattern Recognition and Machine Learning (Information Science and Statistics)
By Christopher M. Bishop

Pattern Recognition and Machine Learning (Information Science and Statistics) by Christopher M. Bishop is also an in-depth and well-presented reference book.

6. Machine Learning: The Art and Science of Algorithms that Make Sense of Data
By Peter Flach

I like Peter Flach’s book, although some Amazon comments call it wordy and point out the lack of code. I like Flach especially for the grouping of algorithms (Logical models, Linear models, Probabilistic models) and the overall treatment of the themes.

Finally, my most recommended book:

7. Deep Learning
By Goodfellow, Bengio and Courville

If there is one book you should read end to end, it’s this one. It is both detailed and modern, covering everything you can think of.