How Google uses Reinforcement Learning to Train AI Agents in the Most Popular Sport in the World
Researchers from the Google Brain team open sourced Google Research Football, a new environment that leverages reinforcement learning to teach AI agents how to master the most popular sport in the world.
Football (soccer for Americans) is by far the most popular sport in the world. With over 4 billion fans worldwide, football has proven to transcend generations, geo-political rivalries and even war conflicts. That passion has transferred into the video game space, in which games like FIFA regularly ranked among the most popular video games worldwide. Despite its popularity, football is one of the games that have proven resilient artificial intelligence (AI) techniques. The complexity of environments like FIFA often result on a nightmare for AI algorithms. Recently, researchers from the Google Brain team open sourced Google Research Football, a new environment that leverages reinforcement learning to teach AI agents how to master the most popular sport in the world. The principles behind Google Research Football were outlined in a research paper that accompanied the release.
A quick look at the game dynamics of football reveals the obvious challenges for AI agents. The game requires coordinated actions for 11 players that evaluate the actions of another 11 players in the opposite team. The strategies regularly change play by play and the rules are not exactly deterministic. Additionally, the richness and complexity of rules/strategies such as goal kicks, sidekicks, corner kicks, both yellow and red cards, offsides, handballs, penalty kicks, and substitutions can confuse the most hardcore fans so imagine modeling those in an AI algorithm. Not surprisingly, traditional supervised learning techniques often fall short when applied to football environments. But what if AI agents could learn to play football by simply playing? That was the strategy followed by the Google Brain team.
Reinforcement Learning for Football
The idea of applying reinforcement learning to football environments seems intuitive. After all, reinforcement learning has been behind some of the biggest breakthroughs in AI from the creation of AlphaGo to surpassing humans in complex multiplayer environments such as Dota2 or Quake III. Reinforcement learning provides a model in which AI agents can master the rules of an environment by trial and error instead of predefined training datasets. In general, games provide a great environment for reinforcement learning agents as they test new ideas in a reproducible manner so the idea of applying those principles to football seem intuitive. However, creating a reinforcement learning for football is far from trivial and it comes with a very unique set of challenges:
- Complexity: Compared to most reinforcement learning environments in the market, football doesn’t focus on solving a sequence of simple task but a coordinated group of complex tasks.
- Stochasticity: Most reinforcement learning environments are fundamentally deterministic and don’t deal well with randomness. Football, just like other scenarios like self-driving cars, is subjected to different sources of stochasticity which drastically increases the complexity of the algorithms.
- Multi-Player: Football is what is known as a cooperative, multi-agent learning scenario which essentially describes an environment in which agents need to collaborate and compete against each other in order to accomplish a series of goals. From the reinforcement learning standpoint, this type of environment offers the highest degree of complexity for AI agents.
- Fully Observable but Continuous: A football game is what is known in AI theory as a fully observable environment which can be visualized in its totality. This contrasts with environments such as Dota2, Quake III or even Poker in which the agents are unaware of the complete game environment. However, the observability in football is challenged by the continuous nature of the game in which there are almost infinitive moves for any given state in order to accomplish a specific goal.
- Expensive: Simulating complex environments such as football often requires expensive GPU architectures which result cost-prohibited for most research labs.
Those are some of the key reasons why football has managed to elude most AI algorithms. The Google Brain team balanced those challenges with state of the art reinforcement learning models that help master football in a very unique way.
Google Research Football
The Google Research Football project is a reinforcement learning environment in which agents learn to play football by simply playing. The current version of the platform is based on three fundamental components:
- Football Engine: A highly-optimized game engine that simulates the game of football.
- Football Benchmarks: A versatile set of benchmark tasks of varying difficulties that can be used to compare different algorithms.
- Football Academy: A set of progressively harder and diverse reinforcement learning scenarios.
The Football Engine is an advanced football simulation based on the popular Gameplay Football environment. The engine simulates a complete football game, and it accepts input actions from both teams which includes the most common football aspects, such as goals, fouls, corners, penalty kicks, or offsides.
From the reinforcement learning standpoint, the Football Engine includes a series of relevant properties that are worth highlighting:
- State & Observations: The Football Engine models a game as a combination of state and observations. In that context, state is defined as the complete set of data that is returned by the environment after actions are performed. On the other hand, observations are defined as any transformation of the state that is provided as input to the control algorithms.
- Actions: The Football Engine models a series of actions that are available to any agents in any given state. Actions include standard move actions (up, down, left, right), and different ways to kick the ball (short and long passes, shooting, and high passes that can’t be easily intercepted along the way). Also, players can sprint, which affects their level of tiredness.
- Stochasticity & Determinism: The Football Engine can run in either stochastic or deterministic mode. The former, which is enabled by default, introduces several types of randomness: for instance, the same shot from the top of the box may lead to a different number of outcomes. In the latter, playing a fixed policy against a fixed opponent always results in the same sequence of actions and states.
- OpenAI Gym Compatibility: The Football Engine is out of the box compatible with the widely used OpenAI Gym API which streamlines its adoption in other research environments.
The current version of the Football Engine was written in C++, allowing it to be run on off-the-shelf machines, both with GPU and without GPU-based rendering enabled. This allows it to reach a performance of approximately 25 million steps per day on a single hexa-core machine.
The Football Engine provides the fundamental building blocks for researchers to try new ideas for mastering football. However, we still need a well-establish mechanism for objectively evaluating the viability of those ideas. The Football Benchmark evaluates different strategies against a predefined set of tasks. Functionally, the goal in these benchmarks is to play a “standard” game of football against a fixed rule-based opponent that was hand-engineered for this purpose. The current version of Football Benchmark provides three versions: the Football Easy Benchmark, the Football Medium Benchmark, and the Football Hard Benchmark, which only differ in the strength of the opponent.
The Google Brain team tested Football Benchmarks with two states of the art reinforcement learning algorithms: DQN and IMPALA. Below you can see the comparison for two different reward models (scoring and checkpoint). We can see how increasing the level of difficulty requires the models to use a larger number of steps.
The Football Academy
The Football Engine allow us to simulate and entire football game while the Football Benchmark allow to evaluate different reinforcement learning models against well-established challenges. The final step might be to learn how to train reinforcement learning agents for the Football Benchmark. That’s the role of the Football Academy, a diverse set of scenarios of varying difficulty which its main goal is to allow researchers to get started on new ideas quickly, and iterate on them. The Football Academy includes diverse settings where agents have to learn how to score against an empty goal, how to run towards a keeper, how to quickly pass in between players to defeat the defense line, or how to execute a fast counterattack. For instance, below we can see the results of evaluating the IMPALA algorithm across the different scenarios of the Football Academy.
To see how the Football Engine, Benchmark and Academy come together, you can watch the following video released by the Google Brain team.
Original. Reposted with permission.
Bio: Jesus Rodriguez is a technology expert, executive investor, and startup advisor. A software scientist by background, Jesus is an internationally recognized speaker and author with contributions that include hundreds of articles and presentations at industry conferences.
- The Emergence of Cooperative and Competitive AI Agents
- Reinforcement Learning: The Business Use Case, Part 1
- Reinforcement Learning: The Business Use Case, Part 2