Poker has long been regarded as a "grand challenge" in the field of AI. The fact that the game involves hidden information—you don’t know your opponents’ cards—means that success requires bluffing and other tactics that don't apply to many other games. This has made poker resistant to AI techniques that have produced breakthroughs in other games. Researchers have been able to develop AIs that can beat one other player at no-limit Texas hold'em poker, but multiplayer has been too difficult to crack.
Known as Pluribus, Facebook's poker-playing AI was created by Facebook AI research scientist Noam Brown and Carnegie Mellon University professor Tuomas Sandholm. It was described in an academic paper published in the journal Science on Thursday.
Pluribus mastered multiplayer Texas hold'em by playing against previous versions of itself. This "self-play" learning method means that it was not fed any data from humans and it did not observe games played by other AI systems either.
"The AI starts from scratch by playing randomly and gradually improves as it determines which actions and which probability distribution over those actions lead to better outcomes against earlier versions of its strategy," Brown and Sandholm write in their paper. Self-play has been used before: Google DeepMind used it to crack Go, and OpenAI used it to master Dota 2.
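To make the idea of self-play concrete, here is a minimal sketch in Python. It is not Pluribus's algorithm, which the article does not detail. It is an assumed toy setup in which two copies of a learner play rock-paper-scissors against each other using regret matching, a standard game-theory technique swapped in purely for illustration. Both copies start by playing uniformly at random and shift probability toward the actions they regret not having played. All names (`PAYOFF`, `current_strategy`, `self_play`) are hypothetical.

```python
import random

ACTIONS = ("rock", "paper", "scissors")
# PAYOFF[a][b] is the payoff to a player choosing action a against action b.
# Rock-paper-scissors is symmetric, so both players can share this table.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def current_strategy(regrets):
    """Turn accumulated regrets into a probability distribution over actions."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        # No positive regret yet: play uniformly at random.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=100_000, seed=0):
    """Two regret-matching learners play each other; return their average strategies."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    regrets = [[0.0] * n for _ in range(2)]
    strategy_sums = [[0.0] * n for _ in range(2)]
    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        moves = [rng.choices(range(n), weights=s)[0] for s in strategies]
        for player in range(2):
            opp_move = moves[1 - player]
            # Expected payoff of the player's current mix against the opponent's move.
            expected = sum(
                strategies[player][a] * PAYOFF[a][opp_move] for a in range(n)
            )
            for a in range(n):
                # Regret: how much better action a would have done than the current mix.
                regrets[player][a] += PAYOFF[a][opp_move] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = self_play()
for player, strategy in enumerate(average_strategies):
    print(player, [round(p, 3) for p in strategy])
```

In this toy game the average strategies should drift toward playing each action about a third of the time, which is the kind of mixed, hard-to-exploit play that self-play is meant to discover. Six-player no-limit hold'em is vastly larger, so this shows only the shape of the learning loop.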
AI researchers have used games as a testbed for their AI agents for decades, and in recent years there have been a number of breakthroughs thanks to advances in computing, better data sets, and more sophisticated AI techniques. Tech giants are investing heavily in the field in the hope that gaming breakthroughs will lead to breakthroughs in other areas such as healthcare, science, and energy.
"These innovations have important implications beyond poker, because two-player zero-sum interactions (in which one player wins and one player loses) are common in recreational games, but they are very rare in real life," write the researchers in a blog post. "Real-world scenarios—such as bidding in an online auction or navigating traffic—typically involve multiple actors."
Pluribus beat top-ranked professional players in two formats: five copies of the AI playing against one human, and one AI playing against five professionals. Among the professionals were Chris Ferguson, a World Series of Poker champion, and Darren Elias, an American pro who holds the record for the most World Poker Tour titles.
There was no money at stake, but the researchers claim that if each chip were worth a dollar, Pluribus would have won an average of about $5 per hand and would have made about $1,000 per hour playing against five humans.
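As a quick sanity check on those two figures, the short Python calculation below derives the pace of play they imply. It takes both rounded numbers at face value, so the result is only approximate, and the variable names are illustrative.

```python
# Rounded figures quoted by the researchers (one chip treated as one dollar).
winnings_per_hand = 5.0       # about $5 won per hand
winnings_per_hour = 1000.0    # about $1,000 won per hour

# Implied pace of play if both figures hold at the same time.
implied_hands_per_hour = winnings_per_hour / winnings_per_hand
implied_seconds_per_hand = 3600 / implied_hands_per_hour

print(implied_hands_per_hour)    # 200.0
print(implied_seconds_per_hand)  # 18.0
```

In other words, the hourly figure assumes roughly 200 hands an hour, or one hand about every 18 seconds.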
Pluribus is a supercharged version of another AI bot called Libratus, which beat human pros in two-player Texas hold’em games in 2017.
Unlike Libratus, Pluribus contains a new online search algorithm that can evaluate its options by searching a few moves ahead, as well as faster self-play algorithms.
The combination of these two factors made it possible to train Pluribus using relatively little processing power and memory. The researchers say they required just $150 worth of cloud computing resources. "This efficiency stands in stark contrast to other recent AI milestone projects, which required the equivalent of millions of dollars’ worth of computing resources to train," they write.
What the pros say
"Pluribus is a very hard opponent to play against," said Ferguson. "It’s really hard to pin him down on any kind of hand. He’s also very good at making thin value bets on the river and extracting value out of his good hands."
Elias added that Pluribus's main strength is its ability to use mixed strategies, which is what humans try to do.
"It’s a matter of execution for humans–to do this in a perfectly random way and to do so consistently," he said. "Most people just can’t. The bot wasn’t just playing against some middle of the road pros. It was playing some of the best players in the world."
While Pluribus might send a shiver down the spine of professional poker players who make a living from winning online tournaments, they don't need to worry about coming up against it in their next match.
"We're not open sourcing it…one reason we’re not is that poker is played commercially and we felt open sourcing could negatively impact the community," Facebook spokesman Ari Entin told Forbes.
Sam Shead, Forbes Staff