A Concise Course in Statistical Inference: The Free eBook
Check out this freely available book, All of Statistics: A Concise Course in Statistical Inference, and learn the probability and statistics needed for success in data science.
Another week, another free eBook being spotlighted here at KDnuggets.
This time we turn our attention to statistics, and the book All of Statistics: A Concise Course in Statistical Inference. Springer has made this book freely available in both PDF and EPUB forms, with no registration necessary; just go to the book's website and click one of the download links.
The book, written by Larry Wasserman, is meant to be an introduction to, and overview of, general statistics. From the book's website:
This book covers a much wider range of topics than a typical introductory text on mathematical statistics. It includes modern topics like nonparametric curve estimation, bootstrapping and classification, topics that are usually relegated to follow-up courses. The reader is assumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. The text can be used at the advanced undergraduate and graduate level.
And why do we need an understanding of basic probability and mathematical statistics? Aptly put, from the book's preface:
Students who analyze data, or who aspire to develop new methods for analyzing data, should be well grounded in basic probability and mathematical statistics. Using fancy tools like neural nets, boosting, and support vector machines without understanding basic statistics is like doing brain surgery before knowing how to use a band-aid.
Take this as fair warning that the book is maths-heavy (as such a book should be). There are many introductions to statistics which ease the reader in with intuitions and examples, but this text goes directly to the heart of the matter. The book is also very attuned to the notion that statistics and machine learning are closely related and intertwined fields, making it especially appropriate for anyone looking to apply their newly-learned statistical concepts to their practice of machine learning.
Statistics, data mining, and machine learning are all concerned with collecting and analyzing data. For some time, statistics research was conducted in statistics departments while data mining and machine learning research was conducted in computer science departments. Statisticians thought that computer scientists were reinventing the wheel. Computer scientists thought that statistical theory didn't apply to their problems.
Things are changing. Statisticians now recognize that computer scientists are making novel contributions while computer scientists now recognize the generality of statistical theory and methodology. Clever data mining algorithms are more scalable than statisticians ever thought possible. Formal statistical theory is more pervasive than computer scientists had realized.
The book's table of contents is as follows:
- Random Variables
- Convergence of Random Variables
- Statistical Inference
- Models, Statistical Inference and Learning
- Estimating the CDF and Statistical Functionals
- The Bootstrap
- Parametric Inference
- Hypothesis Testing and p-values
- Bayesian Inference
- Statistical Decision Theory
- Statistical Models and Methods
- Linear and Logistic Regression
- Multivariate Models
- Inference About Independence
- Causal Inference
- Directed Graphs and Conditional Independence
- Undirected Graphs
- Log-Linear Models
- Nonparametric Curve Estimation
- Smoothing Using Orthogonal Functions
- Probability Redux: Stochastic Processes
- Simulation Methods
The book includes many visualizations, though none are presented in color. This turned out not to be the issue I first thought it might be, as the black and grey do the job well enough. They adequately convey the Bart Simpson distribution, for example.
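For anyone curious what the Bart Simpson distribution looks like numerically, here is a minimal sketch. It assumes the usual definition of this density (also known as "the claw"): a standard normal with weight 1/2, plus five narrow normals with standard deviation 0.1 centered at -1, -0.5, 0, 0.5 and 1, each with weight 1/10. The function names below are my own and are not taken from the book.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bart_simpson_pdf(x):
    """Claw density (assumed definition): one broad normal plus five narrow spikes."""
    broad = 0.5 * normal_pdf(x, 0.0, 1.0)
    # Spike centers are j/2 - 1 for j = 0..4, i.e. -1, -0.5, 0, 0.5, 1.
    spikes = sum(normal_pdf(x, j / 2.0 - 1.0, 0.1) for j in range(5))
    return broad + 0.1 * spikes


def integrate(f, lo, hi, n=20000):
    """Trapezoid rule on n equal subintervals, used here as a sanity check."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


if __name__ == "__main__":
    # Print the density at the spike centers and at the valleys between them.
    for x in (-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"f({x:+.2f}) = {bart_simpson_pdf(x):.4f}")
    print(f"total mass on [-6, 6]: {integrate(bart_simpson_pdf, -6.0, 6.0):.6f}")
```

The density is much higher at the spike centers than in the valleys between them, which produces the spiky "hair" the distribution is named for. The integration check only confirms that the mixture weights sum to one.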