

Stratego. Photo credit: Lost in the Midwest/Alamy
From chess to strategy board games, artificial intelligence (AI) keeps improving in its contests with human players. Recently, DeepNash, an AI agent from the British company DeepMind, reached the level of expert human players in Stratego. The results were published in Science on December 1.
In fact, shortly before DeepNash conquered Stratego, there were reports that AI could play another classic board game, Diplomacy. This is a game with diplomatic depth, in which players cooperate and compete with each other, negotiating and forming alliances when necessary.
"In recent years, the speed at which AI has mastered games that are fundamentally different is quite astonishing." Michael Wellman, a computer scientist at the University of Michigan, said that Stratego and "power diplomacy" are very different games, but they are both extremely challenging, unlike previous AI games. The games that have been mastered are also quite different.
Stratego is a board game that requires strategic thinking with imperfect information, similar to Chinese military chess. It is much more complex than chess, Go, or poker, all of which AI has already mastered.
In the game, each side has 40 pieces, and neither player can see the true identity of the other's pieces. The two sides take turns moving pieces to capture the opponent's pieces, and the side that captures the opponent's flag or eliminates all of the opponent's movable pieces wins. Players therefore need to deploy strategically, gather information, and outmaneuver each other.
Stratego has about 10^535 possible game states. In comparison, Go has about 10^360. Furthermore, at the start of a Stratego game, the AI must reason about more than 10^66 possible deployments by the opponent, which dwarfs the 10^6 possible starting situations in Texas Hold'em.
"The sheer number and complexity of possible outcomes in Stratego means that algorithms that perform well in games with perfect information, and even algorithms that perform well in poker, don't work in this game." DeepMind Researcher Julien Perolat said.
So Perolat and his colleagues developed DeepNash, named in tribute to the American mathematician John Nash, who proposed the Nash equilibrium.
The Nash equilibrium is a solution concept in game theory. It refers to a combination of strategies, one per player, in which no player can increase their own payoff by unilaterally changing strategy while the other players' strategies remain unchanged.
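The definition can be checked mechanically on a small example. The sketch below is only an illustration, not anything taken from the DeepNash paper: the payoff tables describe a hypothetical two-player prisoner's dilemma, and the function names are made up for this example. It tests every pair of pure strategies and keeps those from which neither player can gain by deviating alone.

```python
# Payoffs for a hypothetical two-player game (a prisoner's dilemma).
# payoff_a[i][j] is player A's payoff when A plays i and B plays j;
# payoff_b[i][j] is player B's payoff for the same pair of moves.
# Strategy 0 = cooperate, strategy 1 = defect.
payoff_a = [[-1, -3],
            [0, -2]]
payoff_b = [[-1, 0],
            [-3, -2]]


def is_nash_equilibrium(payoff_a, payoff_b, i, j):
    """Return True if neither player gains by deviating alone from (i, j)."""
    # Player A tries every alternative row while B stays on column j.
    a_cannot_improve = all(
        payoff_a[alt][j] <= payoff_a[i][j] for alt in range(len(payoff_a))
    )
    # Player B tries every alternative column while A stays on row i.
    b_cannot_improve = all(
        payoff_b[i][alt] <= payoff_b[i][j] for alt in range(len(payoff_b[i]))
    )
    return a_cannot_improve and b_cannot_improve


def pure_nash_equilibria(payoff_a, payoff_b):
    """List every pure-strategy pair that is a Nash equilibrium."""
    return [
        (i, j)
        for i in range(len(payoff_a))
        for j in range(len(payoff_a[0]))
        if is_nash_equilibrium(payoff_a, payoff_b, i, j)
    ]


print(pure_nash_equilibria(payoff_a, payoff_b))  # prints [(1, 1)]
```

In this toy game the only equilibrium is mutual defection, even though both players would be better off cooperating. A game the size of Stratego cannot be enumerated this way, which is why an equilibrium has to be learned rather than looked up.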
DeepNash combines reinforcement learning algorithms with deep neural networks to find Nash equilibria. Reinforcement learning involves finding the best policy for each state of the game. To learn this policy, DeepNash played 5.5 billion games against itself.
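To give a feel for how self-play can approach an equilibrium, here is a toy sketch. It is not DeepNash's algorithm, which is far more elaborate and relies on deep neural networks. It swaps in fictitious play, a classic and much simpler method, on rock-paper-scissors, whose Nash equilibrium is to play each move one third of the time. The function names and the starting history are assumptions made for this illustration.

```python
# Row player's payoff in rock-paper-scissors: PAYOFF[mine][theirs].
# Index 0 = rock, 1 = paper, 2 = scissors; +1 is a win, -1 a loss.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def best_response(opponent_counts):
    """Pick the move with the highest total payoff against past opponent moves."""
    totals = [
        sum(PAYOFF[mine][theirs] * n for theirs, n in enumerate(opponent_counts))
        for mine in range(3)
    ]
    return totals.index(max(totals))  # ties go to the lowest index


def self_play(num_games):
    """Symmetric fictitious self-play: best-respond to one's own move history."""
    counts = [1, 0, 0]  # arbitrary starting history: one game of rock
    for _ in range(num_games):
        # Both copies of the agent share the same history, so one tally suffices.
        counts[best_response(counts)] += 1
    total = sum(counts)
    return [n / total for n in counts]


frequencies = self_play(10_000)
print([round(f, 3) for f in frequencies])
```

Each individual move here is a rigid best response, yet the long-run move frequencies should drift toward one third each. The equilibrium shows up in the average behavior over many self-played games rather than in any single game.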
In April, DeepNash played against human Stratego players for two weeks on the online gaming platform Gravon. After 50 games, DeepNash ranked third among all Gravon Stratego players.
"Our research shows that complex games like Stratego that involve imperfect information do not need to be solved by search techniques." Team member and DeepMind researcher Karl Tuyls said, "This is a big step forward for AI."
Meanwhile, the team of Meta AI researcher Noam Brown, who reported the poker-playing AI Pluribus in 2019, set its sights on a different challenge: building an AI that could play Diplomacy.
"Mighty Diplomacy" is a game for up to 7 players, each representing a major European power before World War I. The goal of the game is to control supply centers by moving troops. Importantly, the game requires personal communication and cooperation between players, rather than a two-player game like Go or Stratego.
"When more than two people play a zero-sum game, the idea of Nash equilibrium is no longer useful for the game." Brown said that they successfully trained the AI Cicero. In a paper published Nov. 22 in Science, the team reports that, across 40 games, "Cicero scored more than twice as well as human players on average, ranking among participants who played more than one game. first 10%".
Brown believes that game-playing AI that can interact with humans and account for suboptimal or even irrational human behavior could pave the way for real-world applications.
(The original title was "Chess, Go, this time it is military chess, artificial intelligence defeated human players again")
Related paper information:
https://doi.org/10.1126/science.add4679