The AI agents play a very simple version of the game, where the “seekers” get points whenever the “hiders” are in their field of view. The “hiders” get a little time at the start to set up a hiding place and get points when they’ve successfully hidden themselves; both sides can move objects (like blocks, walls, and ramps) around the playing field for an advantage.
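To make the scoring rule concrete, here is a minimal sketch in Python. It is an illustration of the rules as described above, not the researchers' code: the function name, the point values, and the choice to award nothing during the hiders' setup time are all assumptions.

```python
def step_rewards(hider_visible: bool, in_setup_time: bool) -> dict:
    """Return hypothetical points for each side at one moment of the game.

    Assumptions (not from the original research): one point per moment,
    and no points for anyone while the hiders are still setting up.
    """
    if in_setup_time:
        return {"hiders": 0, "seekers": 0}
    if hider_visible:
        # A hider is in a seeker's field of view, so the seekers score.
        return {"hiders": 0, "seekers": 1}
    # Nobody is in view, so the hiders have successfully hidden.
    return {"hiders": 1, "seekers": 0}


during_setup = step_rewards(hider_visible=True, in_setup_time=True)
spotted = step_rewards(hider_visible=True, in_setup_time=False)
hidden = step_rewards(hider_visible=False, in_setup_time=False)
```

Nothing in a rule like this says anything about blocks, walls, or ramps; in the sketch, as in the game described above, the agents are only told when they are scoring.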
The results from this simple setup were quite impressive. Over the course of 481 million games of hide-and-seek, the AI agents seemed to develop strategies and counterstrategies, moving from running around at random to coordinating with their allies to make complicated strategies work. (Along the way, they also showed off their ability to break the game physics in unexpected ways; more on that below.)
It’s the latest example of how much can be done with a simple AI technique called reinforcement learning, where AI systems get “rewards” for desired behavior and are set loose to learn, over millions of games, the best way to maximize their rewards.
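The reward-and-repeat idea can be sketched in a few lines of Python. This is a toy example under heavy assumptions, far simpler than the system in the research: a single agent chooses between two made-up actions, the reward chances are invented for illustration, and the learning rule is a basic "epsilon-greedy" average rather than whatever method the researchers actually used.

```python
import random


def train(num_games: int = 5000, epsilon: float = 0.1, seed: int = 0) -> dict:
    """Learn by trial and error which action tends to earn a reward."""
    rng = random.Random(seed)
    # Hypothetical chance that each action earns a reward in one game.
    reward_prob = {"stand_in_open": 0.2, "hide_behind_box": 0.8}
    actions = list(reward_prob)
    value = {a: 0.0 for a in actions}  # running estimate of average reward
    count = {a: 0 for a in actions}    # how often each action was tried

    for _ in range(num_games):
        if rng.random() < epsilon:
            # Occasionally try a random action to keep exploring.
            action = rng.choice(actions)
        else:
            # Otherwise pick the action that has paid off best so far.
            action = max(actions, key=value.get)
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        count[action] += 1
        # Nudge the running average toward the reward just received.
        value[action] += (reward - value[action]) / count[action]
    return value


values = train()
best_action = max(values, key=values.get)
```

The agent is never told that hiding is the better choice. It only sees rewards, and over thousands of games its estimates drift toward the action that pays off more often.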
Reinforcement learning is incredibly simple, but the strategic behavior it produces isn’t simple at all. Researchers have previously used reinforcement learning, among other techniques, to build AI systems that can play complex wartime strategy games, and some researchers think that highly sophisticated systems could be built with reinforcement learning alone. This simple game of hide-and-seek makes for a great example of how reinforcement learning works in action and how simple instructions can produce shockingly intelligent behavior. AI capabilities are continuing to march forward, for better or for worse.