The first day of the Brains vs. AI poker tournament is in the books, and the Libratus bot from Carnegie Mellon University emerged as the clear winner, collecting $81,716 to the humans’ $7,228. Both the players and Libratus’ creators cautioned that it was still too early to make a judgement call about who might win the 20-day tournament. But it’s clear that this year’s AI has made some major improvements on the 2015 system, Claudico, which ended up losing to humanity.
“I felt like Libratus is playing a lot better than Claudico did in the previous challenge,” wrote Jason Les, one of the four poker pros, in an email to The Verge. “Preflop, it is using a widely mixed strategy,” he added, describing a blend of small bets, calls, and very large wagers. “This is something it would be extremely difficult/impossible for a human to balance correctly in their mind but Libratus appears to be doing it well so far.”
“The thing that impressed me the most is how unpredictable and random it was able to maneuver post-flop,” said Jimmy Chou, another pro. “It also seems to understand some advanced strategies that many top regulars implement in their own game. We lost the battle today but we are looking to strike back tomorrow!”
Sam Ganzfried, a professor who helped develop some of CMU’s earlier poker bots, and who now teaches at Florida International University, said Libratus appears to have fixed two key weaknesses that humans exploited in the past. The first concerns card removal: the system now takes its own cards into consideration when deciding to bluff, allowing it to pick stronger opportunities for leveraging weak hands. The second concerns off-tree problems: Libratus no longer approximates the size of its opponents’ bets, a technique that made the game simpler to play, but sometimes caused Claudico to badly misjudge the size of the pot.
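To see why approximating bet sizes can distort the pot, consider a minimal sketch. It assumes a simplified game tree that only contains a handful of bet sizes, with every real bet rounded to the nearest one. The sizes, function names, and rounding rule here are hypothetical illustrations, not Claudico’s actual code.

```python
# Hypothetical abstraction: the only bet sizes (in chips) the simplified
# game tree contains. These values are made up for illustration.
ABSTRACT_BETS = [100, 500, 2000]

def nearest_abstract_bet(actual_bet, abstract_bets=ABSTRACT_BETS):
    """Map a real bet to the closest size the simplified tree knows about."""
    return min(abstract_bets, key=lambda b: abs(b - actual_bet))

def pot_error(pot_before, actual_bet):
    """Chips by which the real pot exceeds the pot the bot believes it faces."""
    believed_pot = pot_before + nearest_abstract_bet(actual_bet)
    actual_pot = pot_before + actual_bet
    return actual_pot - believed_pot

# A 1,200-chip bet is treated as 500 chips, so the bot's picture of the
# pot is off by 700 chips.
print(nearest_abstract_bet(1200), pot_error(300, 1200))
```

A bet that happens to match one of the listed sizes produces no error; the trouble comes from the "off-tree" bets in between.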
In general, Libratus is more precise than its predecessors. “In particular, we developed a new technique called ‘nested endgame solving’ which allows the bot to compute new strategies in real time that best counter the humans’ actions, while still guaranteeing that our strategy is balanced so the humans can’t take advantage of us,” said Noam Brown, a PhD student with CMU who helped design the system. “One of the weaknesses of Claudico was that it combined similar hands into a single ‘bucket’ and treated them all the same. So for example, it might play a queen-high flush exactly the same as a king-high flush.”
After playing a few rounds against the AI, poker pros were able to sense and exploit this style of play. “This allowed Claudico to reduce the size of the game by a factor of 100,000, but it also meant its strategy was not as fine-tuned, because it could not notice subtle strategic differences in certain situations,” wrote Brown. “The humans picked up on this and were able to take advantage of it. Our nested endgame solving technique means Libratus does not need to do this bucketing. Instead, it can determine a distinct strategy for each unique situation.”
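The bucketing Brown describes can be sketched in a few lines. This is only an illustration under stated assumptions: hands are represented as a (category, high card) pair, and the strategy tables and their actions are invented for the example rather than taken from either bot.

```python
def bucket(hand):
    """Coarse abstraction: keep the hand category, discard the high card."""
    category, _high_card = hand
    return category

# Hypothetical strategy keyed by bucket: every flush gets the same action.
BUCKETED_STRATEGY = {"flush": "bet"}

# Hypothetical strategy keyed by exact hand: each flush can differ.
PER_HAND_STRATEGY = {("flush", "K"): "bet", ("flush", "Q"): "check"}

def bucketed_action(hand):
    return BUCKETED_STRATEGY[bucket(hand)]

def per_hand_action(hand):
    return PER_HAND_STRATEGY[hand]

king_flush, queen_flush = ("flush", "K"), ("flush", "Q")
# With bucketing, the two flushes are indistinguishable.
print(bucketed_action(king_flush), bucketed_action(queen_flush))
# Without it, each hand gets its own decision.
print(per_hand_action(king_flush), per_hand_action(queen_flush))
```

The bucketed table is far smaller, which is the appeal, but it cannot express any difference between the two hands, and that uniformity is what an observant opponent can exploit.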
If Libratus can beat some of the world’s best humans in No-Limit Texas Hold’em, it will be a milestone in AI research comparable to Deep Blue’s triumph in chess and AlphaGo’s victory last year in Go. In fact, it may be more significant, as most real-world problems are closer to a game of poker, a delicate dance between multiple people, all lacking a perfect view of the entire situation. As John von Neumann, one of the pioneers of game theory, put it, “Real life is not like that. Real life consists of bluffing, of little tactics of deception, of asking yourself what is the other man going to think I mean to do.”