Automated Machine Learning: The Free eBook

There is a lot to learn about automated machine learning theory and practice. This free eBook can get you started the right way.

It's a new week, and what better time to get your hands on another free eBook? We have been highlighting a new one each week for the better part of the past few months, doing our best to single out and share top learning materials for those stuck at home right now, or really for anyone interested in learning a new concept or brushing up on what they already know.



This week we turn our attention to the topic of automated machine learning (AutoML), a personal favorite of mine. What is automated machine learning? It is a wide (and widening) concept, but I've previously tried to capture its essence as follows:

If, as Sebastian Raschka has described it, computer programming is about automation, and machine learning is "all about automating automation," then automated machine learning is "the automation of automating automation." Follow me, here: programming relieves us by managing rote tasks; machine learning allows computers to learn how to best perform these rote tasks; automated machine learning allows for computers to learn how to optimize the outcome of learning how to perform these rote actions.

This is a very powerful idea; while we previously have had to worry about tuning parameters and hyperparameters, automated machine learning systems can learn the best way to tune these for optimal outcomes by a number of different possible methods.
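To make the idea of automatically tuning hyperparameters concrete, here is a minimal sketch of random search, one of the simplest of those possible methods. It is not taken from the book: the synthetic dataset, the tiny k-nearest-neighbors classifier, and every function name below are assumptions made purely for illustration, and only the Python standard library is used.

```python
import random

def make_data(n, rng):
    """Synthetic 1-D binary data: label is 1 when x > 0.5, with 10% label noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:
            y = 1 - y  # flip the label to simulate noise
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in neighbors)
    return int(votes * 2 > k)

def accuracy(train, valid, k):
    """Fraction of validation points classified correctly for a given k."""
    correct = sum(knn_predict(train, x, k) == y for x, y in valid)
    return correct / len(valid)

def random_search(train, valid, n_trials, rng):
    """Sample the hyperparameter k at random and keep the best-scoring value."""
    best_k, best_score = None, -1.0
    for _ in range(n_trials):
        k = rng.choice(range(1, 40, 2))  # odd k avoids tied votes
        score = accuracy(train, valid, k)
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score

rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data(200, rng)
valid = make_data(100, rng)
best_k, best_score = random_search(train, valid, n_trials=15, rng=rng)
print(f"best k = {best_k}, validation accuracy = {best_score:.2f}")
```

The loop replaces the manual step of picking `k` by hand. Real AutoML systems typically swap the random sampling for smarter strategies and search over many hyperparameters at once, but the evaluate-and-keep-the-best structure is the same.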

Which brings us to our book for the week. Automated Machine Learning: Methods, Systems, Challenges, edited by Frank Hutter, Lars Kotthoff, and Joaquin Vanschoren, is a collection of chapters that covers the hows and whys of contemporary automated machine learning, presents some of the available tools for doing AutoML, and discusses some of its challenges.

This book includes very up-to-date overviews of the bread-and-butter techniques we need in AutoML (hyperparameter optimization, meta-learning, and neural architecture search), provides in-depth discussions of existing AutoML systems, and thoroughly evaluates the state of the art in AutoML in a series of competitions that ran since 2015. As such, I highly recommend this book to any machine learning researcher wanting to get started in the field and to any practitioner looking to understand the methods behind all the AutoML tools out there.

The reasons for studying AutoML can be gleaned from the book's introduction:

As we show in this book, AutoML approaches are already mature enough to rival and sometimes even outperform human machine learning experts. Put simply, AutoML can lead to improved performance while saving substantial amounts of time and money, as machine learning experts are both hard to find and expensive. As a result, commercial interest in AutoML has grown dramatically in recent years, and several major tech companies are now developing their own AutoML systems.

The book's table of contents is as follows:

Part I: AutoML Methods

  1. Hyperparameter Optimization
  2. Meta-Learning
  3. Neural Architecture Search

Part II: AutoML Systems

  1. Auto-WEKA: Automatic Model Selection and Hyperparameter Optimization in WEKA
  2. Hyperopt-Sklearn
  3. Auto-sklearn: Efficient and Robust Automated Machine Learning
  4. Towards Automatically-Tuned Deep Neural Networks
  5. TPOT: A Tree-Based Pipeline Optimization Tool for Automating Machine Learning
  6. The Automatic Statistician

Part III: AutoML Challenges

  1. Analysis of the AutoML Challenge Series 2015–2018


If you have little to no understanding of what automated machine learning is in practice, don't worry. The book starts off with a solid introduction to the topic and lays out explicitly what you can expect chapter by chapter, which is important in a book composed of independent chapters. After this, in the first section of the book, you get right into reading about the important topics of contemporary AutoML, and you can be confident that they are current, since the book was put together in 2019. Hyperparameter optimization is undoubtedly the bread and butter of automated machine learning techniques, so you set off with that first. This is followed by the broader topic of meta-learning, or the study of how machine learning approaches perform comparatively across varying learning tasks. Lastly, neural architecture search is tackled, which is the practice of automatically identifying the best neural network architecture for a given task.
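As a rough illustration of the meta-learning idea described above, the sketch below recommends a starting configuration for a new dataset by looking up the most similar previously seen dataset. This is only one simple way to use knowledge from past tasks and is not presented as the book's method. The meta-features, the stored configurations, and the names `past_tasks`, `meta_distance`, and `warm_start` are all hypothetical, and the numbers are made up for the example.

```python
import math

# Hypothetical records from past tasks: each pairs a dataset's meta-features
# with the configuration that worked best on it. All values are made up.
past_tasks = [
    {"meta": {"n_rows": 1_000, "n_features": 10},
     "best_config": {"model": "knn", "k": 5}},
    {"meta": {"n_rows": 100_000, "n_features": 50},
     "best_config": {"model": "gbm", "n_trees": 300}},
    {"meta": {"n_rows": 5_000, "n_features": 2_000},
     "best_config": {"model": "linear", "l2": 1.0}},
]

def meta_distance(a, b):
    """Euclidean distance between log-scaled meta-features of two datasets."""
    return math.sqrt(
        sum((math.log10(a[key]) - math.log10(b[key])) ** 2 for key in a)
    )

def warm_start(new_meta, history):
    """Return the best known configuration of the most similar past task."""
    closest = min(history, key=lambda task: meta_distance(new_meta, task["meta"]))
    return closest["best_config"]

new_task = {"n_rows": 2_000, "n_features": 12}
print(warm_start(new_task, past_tasks))
```

A recommendation like this would normally serve as the starting point for a hyperparameter search rather than as the final answer, which is how the two topics fit together.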

Learning about these topics is great, but how about rolling up your sleeves and getting your hands dirty? The next section walks through half a dozen tools for implementing these AutoML concepts. There are some Python libraries and a standalone GUI-based application to have a look at here. I would suggest choosing a single library and diving deep into it. There will be trade-offs between approaches, but the idea that anyone should learn, for example, four or five AutoML libraries right away is not helpful.

The last section is an analysis of the AutoML Challenge Series, which ran from 2015 to 2018, the period in which interest in automated approaches to machine learning seemed to explode. The insights gained from overseeing a competition on implementing fully automated machine learning systems definitely hold value for a practitioner, and so this chapter should not be skipped.

There is a lot to learn about automated machine learning theory and practice. This book can set you off on the right path, and I recommend it to anyone looking for such a book.