DeepMind’s Three Pillars for Building Robust Machine Learning Systems

Specification testing, robust training and formal verification are the three elements that the AI powerhouse believes hold the essence of robust machine learning models.





Building machine learning systems differs from traditional software development in many aspects of its lifecycle. Established software methodologies for testing, debugging and troubleshooting are simply impractical when applied to machine learning models. While the behavior of traditional software components like websites, mobile apps or APIs is dictated exclusively by their code, machine learning models evolve their knowledge over time depending on specific datasets. How to define and write robust machine learning agents is one of the existential challenges for the entire space. Last year, artificial intelligence (AI) researchers from DeepMind published some ideas about that topic.

When we think about writing robust software, we immediately think of code that behaves according to a predefined set of specifications. In the case of machine learning, there is no established definition of correct specifications or robust behavior. The accepted practice is to train a machine learning model using a specific dataset and test it using a different dataset. That approach is very effective at achieving above-average performance on both datasets but is not always effective when it comes to edge cases. A classic example of these challenges is seen in image classification models, which can be completely disrupted by small variations in the input that are imperceptible to the human eye.

The notion of robustness in machine learning models should go beyond performing well against training and testing datasets: a robust model should also behave according to a predefined set of specifications that describe the desirable behavior of the system. Using our previous example, a requirement specification might detail the expected behavior of a machine learning model against adversarial perturbations or a given set of safety constraints.

Writing robust machine learning programs is a combination of many aspects, ranging from accurate training datasets to efficient optimization techniques. However, most of these processes can be modeled as a variation of three main pillars that constitute the core focus of DeepMind’s research:


  1. Testing Consistency with Specifications: Techniques to test that machine learning systems are consistent with properties (such as invariance or robustness) desired by the designer and users of the system.
  2. Training Machine Learning models to be Specification-Consistent: Even with copious training data, standard machine learning algorithms can produce predictive models that make predictions inconsistent with desirable specifications like robustness or fairness — this requires us to reconsider training algorithms that produce models that not only fit training data well, but also are consistent with a list of specifications.
  3. Formally Proving that Machine Learning Models are Specification-Consistent: There is a need for algorithms that can verify that the model predictions are provably consistent with a specification of interest for all possible inputs. While the field of formal verification has studied such algorithms for several decades, these approaches do not easily scale to modern deep learning systems despite impressive progress.
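
To make the idea of a specification a bit more concrete, here is a minimal sketch of the first pillar: a robustness specification ("small perturbations must not change the predicted label") checked by random sampling. The model `toy_classifier`, its weights, and the helper `satisfies_robustness_spec` are hypothetical names invented for this illustration; they do not come from DeepMind's work.

```python
import random


def toy_classifier(x):
    """Hypothetical stand-in model: thresholds a fixed linear score."""
    weights = [2.0, -1.0]  # assumed weights, chosen only for illustration
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1 if score >= 0 else 0


def satisfies_robustness_spec(model, x, epsilon, trials=1000, seed=0):
    """Test the spec 'the label is unchanged for every |delta_i| <= epsilon'.

    Returns (True, None) when no sampled perturbation violates the spec,
    otherwise (False, counterexample). Passing is evidence, not a proof.
    """
    rng = random.Random(seed)
    expected = model(x)
    for _ in range(trials):
        perturbed = [xi + rng.uniform(-epsilon, epsilon) for xi in x]
        if model(perturbed) != expected:
            return False, perturbed
    return True, None


# An input far from the decision boundary passes the test.
ok, _ = satisfies_robustness_spec(toy_classifier, [1.0, 0.5], epsilon=0.1)
# An input close to the boundary yields a counterexample.
ok_near, counterexample = satisfies_robustness_spec(
    toy_classifier, [0.1, 0.1], epsilon=0.5
)
```

A test like this can only report violations it happens to sample, which is why the list above treats testing and formal proof as separate pillars.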


Specification Testing

Adversarial examples are a great mechanism to test the behavior of machine learning models against a given set of specifications. Unfortunately, most of the relevant work in adversarial training has been constrained to image classification models. Expanding some of those ideas into more generic areas such as reinforcement learning could provide a general-purpose mechanism to test the robustness of machine learning models.

Following some of the ideas of adversarial training, DeepMind developed two complementary approaches for adversarial testing of RL agents. The first technique uses derivative-free optimization to directly minimize the expected reward of an agent. The second method learns an adversarial value function that predicts, from experience, which situations are most likely to cause failures for the agent. The learned function is then used for optimization to focus the evaluation on the most problematic inputs. According to the DeepMind team, these approaches form only a small part of a rich, growing space of potential algorithms for the rigorous evaluation of agents.
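
The first technique can be illustrated with a toy example. The sketch below is not DeepMind's algorithm; it is a simple random local search (one common kind of derivative-free optimization) that looks for the environment setting on which a hypothetical agent earns the least reward. The function `episode_reward`, the `goal_position` parameter, and the failure region around 0.71 are all assumptions made up for this illustration.

```python
import math
import random


def episode_reward(goal_position):
    """Hypothetical agent: high reward almost everywhere, but it
    fails when the goal is placed near position 0.71."""
    return 1.0 - 1.0 / (1.0 + ((goal_position - 0.71) / 0.05) ** 2)


def adversarial_search(reward_fn, low, high, iterations=200, step=0.1, seed=0):
    """Derivative-free local search that minimizes the agent's reward.

    Starts from a random environment setting and keeps a random
    perturbation only when it lowers the reward.
    """
    rng = random.Random(seed)
    worst = rng.uniform(low, high)
    worst_reward = reward_fn(worst)
    for _ in range(iterations):
        candidate = min(high, max(low, worst + rng.gauss(0.0, step)))
        candidate_reward = reward_fn(candidate)
        if candidate_reward < worst_reward:
            worst, worst_reward = candidate, candidate_reward
    return worst, worst_reward


def average_reward(reward_fn, low, high, samples=1000, seed=1):
    """Conventional evaluation: mean reward over uniformly sampled settings."""
    rng = random.Random(seed)
    return sum(reward_fn(rng.uniform(low, high)) for _ in range(samples)) / samples


worst_setting, worst_reward = adversarial_search(episode_reward, 0.0, 1.0)
typical_reward = average_reward(episode_reward, 0.0, 1.0)
```

In this toy setting the average reward looks healthy while the adversarial search homes in on the narrow region where the agent fails, which is the gap between average-case and worst-case evaluation that adversarial testing is meant to expose.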


The adversarial approaches showed tangible improvements over traditional testing methods in reinforcement learning agents. Adversarial testing uncovered errors that typically go unnoticed and also surfaced qualitative behavior in the agents that was not expected based on the composition of the training dataset. For instance, the following figure shows the effect of adversarial testing in a 3D navigation task. Even though the agent can achieve human-level performance, adversarial testing shows that it can still fail on very simple tasks.


Specification Training

Adversarial testing is very effective at detecting errors but can still fail to uncover examples that deviate from a given specification. If we think about the concept of requirements from a machine learning standpoint, they can be modeled as a mathematical relationship between inputs and outputs. Using that idea, the DeepMind team created a method that geometrically calculates the consistency of a model with a given specification by using lower and upper bounds. Known as Interval Bound Propagation, DeepMind’s method maps a specification to a bounding box that can be evaluated across each layer of the network, as shown in the following figure. The technique has been shown to decrease provable error rates across a wide variety of machine learning models.
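
The bound-propagation step itself is simple enough to sketch in a few lines. The code below is a minimal, pure-Python illustration of pushing an axis-aligned box through affine and ReLU layers; the tiny two-layer network and the function names are assumptions made for this example, and the training loss that DeepMind builds on top of these bounds is not shown.

```python
def affine_bounds(lower, upper, weights, bias):
    """Propagate the box [lower, upper] through y = W x + b.

    A positive weight maps the input lower bound to the output lower
    bound; a negative weight swaps the roles of lower and upper.
    """
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotonic, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def network_bounds(x, epsilon, layers):
    """Bound the outputs for every input within epsilon of x (L-infinity).

    `layers` is a list of (weights, bias) pairs; ReLU is applied after
    every layer except the last.
    """
    lower = [xi - epsilon for xi in x]
    upper = [xi + epsilon for xi in x]
    for index, (weights, bias) in enumerate(layers):
        lower, upper = affine_bounds(lower, upper, weights, bias)
        if index < len(layers) - 1:
            lower, upper = relu_bounds(lower, upper)
    return lower, upper


# Hypothetical two-layer network used only for illustration.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
out_lower, out_upper = network_bounds([1.0, 0.0], epsilon=0.1, layers=layers)
```

For this toy network the propagated interval is roughly [1.2, 1.8], while the true output range over the same input box is [1.3, 1.7]. The bounds are sound but loose, because each layer forgets how its inputs are correlated.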



Formal Verification

Accurate testing and training are necessary steps to achieve robustness in machine learning models but are largely insufficient to ensure that a system behaves as expected. In large-scale models, enumerating all possible outputs for a given set of inputs (for example, infinitesimal perturbations to an image) is intractable due to the astronomical number of choices for the input perturbation. Formal verification is an active area of research that focuses on finding efficient approaches to setting geometric bounds based on a given specification.

DeepMind recently developed a formal verification method that models the verification problem as an optimization problem that tries to find the largest violation of the property being verified. The technique iterates several times until it finds the correct bound, which indirectly guarantees that there can be no further violation of a given property. Although initially applied to reinforcement learning models, DeepMind’s approach is very easy to generalize to other machine learning techniques.
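
The "largest violation" framing is easiest to see on a model where the optimization can be solved exactly. The sketch below assumes a linear scoring model and the specification "the score stays non-negative for every input within epsilon of x"; it is an illustration of the framing only, not DeepMind's method, which targets deep networks where the maximum can only be bounded. The names `largest_violation` and `is_verified` are hypothetical.

```python
def largest_violation(weights, bias, x, epsilon):
    """Maximize the violation of the spec 'w . x' + b >= 0' over all x'
    with |x'_i - x_i| <= epsilon.

    For a linear model the maximizer is known in closed form: push every
    coordinate by epsilon against the sign of its weight.
    Returns (violation, worst_input); the spec holds everywhere in the
    box exactly when violation <= 0.
    """
    nominal_score = sum(w * xi for w, xi in zip(weights, x)) + bias
    worst_score = nominal_score - epsilon * sum(abs(w) for w in weights)
    worst_input = [
        xi - epsilon * (1 if w > 0 else -1 if w < 0 else 0)
        for w, xi in zip(weights, x)
    ]
    return -worst_score, worst_input


def is_verified(weights, bias, x, epsilon):
    """The property is proven when even the largest violation is <= 0."""
    violation, _ = largest_violation(weights, bias, x, epsilon)
    return violation <= 0.0


weights, bias, x = [2.0, -1.0], 0.0, [1.0, 0.5]
small_box = is_verified(weights, bias, x, epsilon=0.1)  # holds for all inputs
large_box = is_verified(weights, bias, x, epsilon=0.6)  # a violation exists
```

Unlike the sampling-based test, a non-positive maximum here covers every input in the box at once. For deep networks the exact maximum is out of reach, so verification methods compute an upper bound on it instead and try to tighten that bound until it drops to zero or below.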


Testing, training and formal verification of specifications constitute three key pillars for the implementation of robust machine learning models. DeepMind’s ideas are a great starting point, but we should expect these concepts to evolve into functional datasets or frameworks that enable the modeling and verification of specifications related to machine learning models. The path towards robust machine learning will also be enabled by machine learning.

Original. Reposted with permission.