AI researchers solve game theory behind Stratego

DEC 08, 2022
To win against expert opponents, an algorithm makes tactical decisions often used by people.

DOI: 10.1063/PT.6.1.20221208a

An opening arrangement of the imperfect-information game Stratego.

zizou man, Wikimedia Commons, CC BY 2.0

A game like poker is one with imperfect information. Because each player has a set of starting cards that others can’t see, a player can bluff. Despite the added complexity of the game compared with perfect-information games like chess, in 2015 artificial intelligence (AI) researchers designed a game-winning strategy for Texas Hold ’em, a variation of poker with 10^164 possible game states.

That number, however, is but a fraction of the 10^535 possible states for the board game Stratego. As in capture the flag, each player guards their flag and tries to capture their opponent’s. Two players each control 40 pieces: A piece can capture one of lower rank, but the specific ranks of the opponent’s pieces are unknown. (Only during an interaction between pieces do their ranks become known.) For an AI algorithm to win, it must make a series of long-term strategic moves and analyze a staggering 10^60 times as many starting arrangements as a two-player game of Texas Hold ’em.

The research laboratory DeepMind Technologies became famous in 2016 when its AlphaGo algorithm beat Go world champion Lee Sedol in a five-game match. Now a DeepMind team led by Julien Perolat, Bart De Vylder, and Karl Tuyls has developed an algorithm called DeepNash that plays Stratego at the level of a human expert.

DeepNash plays at a highly competitive level by finding a Nash equilibrium. In that approach, the algorithm finds a winning strategy by applying a set of tactical moves that can’t be exploited by the opponent.
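To make "can't be exploited" concrete, here is a minimal Python sketch that uses rock-paper-scissors as a stand-in game. The game, the payoff table, and the function name are illustrative assumptions and do not come from the DeepNash work. The sketch measures how much an opponent could gain by responding optimally to a fixed mixed strategy.

```python
# Payoff to "us" in rock-paper-scissors, a zero-sum game.
# Rows are our action, columns are the opponent's action,
# both in the order rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def best_response_value(opponent_strategy):
    """Highest expected payoff any single action earns against a fixed mixed strategy."""
    return max(
        sum(prob * PAYOFF[action][reply] for reply, prob in enumerate(opponent_strategy))
        for action in range(3)
    )


uniform = [1 / 3, 1 / 3, 1 / 3]   # the equilibrium strategy of this toy game
rock_heavy = [0.5, 0.25, 0.25]    # a biased strategy that favors rock

print(best_response_value(uniform))     # nothing to gain against the uniform mix
print(best_response_value(rock_heavy))  # playing paper earns a positive expected payoff
```

Against the uniform mix, no reply does better than break even, which is what makes that strategy an equilibrium of the toy game. Against the rock-heavy mix, always playing paper wins on average.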

Many games, including Stratego, can have multiple Nash equilibria, and oftentimes researchers will develop an opponent model that tracks all the possible game states that could form from various moves and the likelihood of the player making each of those moves on a particular turn.

But calculating all those possibilities in an imperfect-information game quickly becomes too computationally expensive. The researchers approached the challenge by building DeepNash as a deep-learning neural network, which improves the algorithm’s skill by having it play against itself, combined with a type of algorithm known as Regularized Nash Dynamics.

At its simplest, the Regularized Nash Dynamics algorithm is an iterative process consisting of three steps. In the first step, the game is transformed into a simplified decision-making situation in which two agents each take some action according to a regularization policy added to the game. Akin to an error-minimization technique, regularization discourages a newly learned set of actions from deviating too far from the policy assigned at the beginning of the iteration. Step two takes the simplified, transformed game and converges on a possible winning solution. In the third and final step, the previous transformation of the game is updated with the solution found in step two.
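The three steps can be illustrated on a toy problem. The Python sketch below is not DeepMind's implementation: DeepNash applies the idea with a deep neural network to full games of Stratego, whereas this example uses a hypothetical 2×2 zero-sum matrix game, a simple multiplicative-weights update in place of the learning dynamics, and arbitrary values for the regularization strength `eta` and the step size `lr`.

```python
import math

# Row player's payoff in a hypothetical 2x2 zero-sum game. In its unique
# Nash equilibrium, each player picks action 0 with probability 0.4.
PAYOFF = [[2.0, -1.0],
          [-1.0, 1.0]]


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def action_values(x, y):
    """Expected payoff of each pure action for the row and the column player."""
    row = [sum(PAYOFF[a][b] * y[b] for b in range(2)) for a in range(2)]
    col = [-sum(PAYOFF[a][b] * x[a] for a in range(2)) for b in range(2)]
    return row, col


def solve_regularized_game(x_reg, y_reg, eta, lr=0.2, steps=300):
    """Steps 1 and 2: penalize drift from the regularization policy, then iterate."""
    x, y = list(x_reg), list(y_reg)
    for _ in range(steps):
        row, col = action_values(x, y)
        # Step 1: transformed reward = payoff minus a penalty that grows as
        # the current policy moves away from the regularization policy.
        row_t = [row[a] - eta * math.log(x[a] / x_reg[a]) for a in range(2)]
        col_t = [col[b] - eta * math.log(y[b] / y_reg[b]) for b in range(2)]
        # Step 2: both players shift weight toward actions with higher
        # transformed reward; repeated updates settle on a fixed point.
        x = normalize([x[a] * math.exp(lr * row_t[a]) for a in range(2)])
        y = normalize([y[b] * math.exp(lr * col_t[b]) for b in range(2)])
    return x, y


def regularized_nash_sketch(outer_iterations=40, eta=1.0):
    x_reg, y_reg = [0.5, 0.5], [0.5, 0.5]  # initial regularization policies
    for _ in range(outer_iterations):
        # Step 3: the solution found becomes the next regularization policy.
        x_reg, y_reg = solve_regularized_game(x_reg, y_reg, eta)
    return x_reg, y_reg


if __name__ == "__main__":
    x, y = regularized_nash_sketch()
    print("row player:", x)
    print("column player:", y)
```

With these settings, both players' probabilities for action 0 should approach 0.4, the equilibrium of the toy game. Each regularized game is easier to solve than the original because the penalty damps the cycling that plain self-play updates can show in zero-sum games, and replacing the regularization policy at the end of each round moves the solution closer to the equilibrium of the unregularized game.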

When playing Stratego, DeepNash made several humanlike tactical decisions. In Stratego, players can often gain an advantage by knowing more private information than their opponents, even if they have fewer pieces on the board. In some games, DeepNash purposefully chose not to move certain pieces, which left its opponent unsure of the rank of those pieces. Over the thousands of games played, DeepNash won 97% of the time against other AI bots and 84% of the time against human experts. (J. Perolat et al., Science 378, 990, 2022.)

More about the Authors

Alex Lopatka. alopatka@aip.org
