Building a Custom Bet-Size Solver with Deep CFR

Unverified

Personal study note — not an authoritative source. Written while learning Deep CFR and modern poker AI architecture. Not peer-reviewed, may contain mistakes, oversimplifications, or outdated claims. Do not cite this page as a reference. For anything load-bearing, follow the original papers in Section 7 (Brown & Sandholm, Moravčík et al., Ganzfried & Sandholm) — those are the actual sources.

The single insight

Training only produces a playbook for a simplified game. Real, custom bet sizes need fresh thinking at decision time.

Most beginners think: train Deep CFR → get a poker AI. That's wrong. Deep CFR alone gives you a blueprint for an abstracted version of poker with a fixed menu of bet sizes (~6 buttons). To handle any bet size in real play, the bot has to re-solve the current decision on the fly. Every elite NLH bot — Pluribus, DeepStack, Libratus — is built around this.

Section 1 · The forced compression

Why every poker AI shrinks the action menu

No-limit poker allows a practically continuous range of bet sizes: you can bet $1.00, $1.50, $1.51, and so on up to your whole stack. CFR keeps one regret entry per action at every decision point, so its regret vector has a fixed length num_actions and cannot enumerate that range. So every poker AI compresses the action space.

Real poker: "How much do you want to bet? Any number is fine."
AI poker: "Pick one — fold, call, ½ pot, pot, 2× pot, all-in."

This is action abstraction. It's the price of admission, and it's not the abstraction Deep CFR removes. Deep CFR removes card abstraction (no more hand-crafted hand buckets) but the bet-size menu is still hard-coded. That's why a blueprint alone can never tell you "0.37× pot here, please."
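To make the fixed-length regret vector concrete, here is a minimal Python sketch of regret matching at a single decision point. The six-entry menu and the regret numbers are invented for illustration and do not come from any real solver.

```python
import numpy as np

# Hypothetical six-button menu; a size that is not listed has no regret slot at all.
ACTIONS = ["fold", "call", "0.5x pot", "1x pot", "2x pot", "all-in"]

def regret_matching(cum_regret):
    """Turn cumulative regrets into a strategy: play in proportion to positive regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No action has positive regret yet, so fall back to uniform play.
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

# Illustrative regrets, one entry per abstract action.
cum_regret = np.array([-2.0, 1.0, 3.0, 0.0, -1.0, 0.0])
strategy = regret_matching(cum_regret)
for name, p in zip(ACTIONS, strategy):
    print(f"{name:>9}: {p:.2f}")
```

The strategy can only put weight on the six listed actions, which is the sense in which the menu is hard-coded.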

Section 2 · The architecture

Three pieces, played in sequence at every decision

PIECE 01 · OFFLINE

Blueprint

Train Deep CFR on the abstracted game (~6 bet sizes). Produces a memorized playbook. Done once, takes days.

PIECE 02 · ONLINE-FAST

Translator

Map the opponent's real bet (e.g. 73% pot) to the nearest abstract bets via pseudo-harmonic translation. Fast, lossy.

PIECE 03 · ONLINE-DEEP

Real-time solver

At your turn, run a fresh CFR on a depth-limited subgame with whatever bet sizes you want. This is where custom sizing lives.

Section 3 · The blueprint

Offline · Deep CFR

A giant memorized playbook

Run Deep CFR for days or weeks on a coarsely abstracted version of NLH. Typical action menu: {fold, call, 0.5×pot, 1×pot, 2×pot, all-in}. OpenSpiel's FCHPA betting abstraction (fold, call, half-pot, pot, all-in) is the same menu without the 2×pot size. Card abstraction (k-means clustering of postflop hands into ~1k–5k buckets per street) is sometimes added on top to keep the infoset count tractable.

Output looks like: "Hand: A♠ K♠. Board: 7♥ 8♥ 2♣. Opponent bet pot. → Raise pot 60% of the time, call 40%."

The blueprint converges to an approximate equilibrium of the abstracted game — not equilibrium of real NLH. It's a strong starting point, never the final answer.
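As a rough illustration of what a fixed menu means in chips, the sketch below turns pot fractions into opening-bet amounts. The function name `abstract_bets` and its rounding and capping rules are assumptions made for this note; it ignores minimum-bet rules and raise sizing when facing a bet.

```python
def abstract_bets(pot, stack, fractions=(0.5, 1.0, 2.0)):
    """Chip amounts for an opening bet under a pot-fraction menu, capped at the stack."""
    sizes = []
    for f in fractions:
        amount = round(f * pot)
        # Keep only sizes that are real bets and strictly smaller than all-in.
        if 0 < amount < stack and amount not in sizes:
            sizes.append(amount)
    sizes.append(stack)  # all-in is always on the menu
    return sizes

print(abstract_bets(pot=100, stack=1000))  # [50, 100, 200, 1000]
print(abstract_bets(pot=100, stack=150))   # [50, 100, 150]
```

With a short stack the 2×pot button collapses into all-in, so the effective menu can be smaller than the nominal one.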

Section 4 · The translator

Online · Action Translation

Mapping reality onto the playbook's menu

Opponent bets 73% of pot. Your blueprint only knows 50% and 100%. You translate. The standard recipe is pseudo-harmonic translation (Ganzfried & Sandholm, 2013) — it returns a probability of mapping the real bet to each of the two nearest abstract bets:

P(round down to A) = (B − x)(1 + A) / ((B − A)(1 + x))

where A < x < B in pot fractions. Sample, look up the blueprint at the translated infoset, act.

Intuition: "For the 73% bet, A = 0.5 and B = 1.0 give P(round down) ≈ 0.47. So pretend the opponent bet 50% with probability about 0.47 and 100% with probability about 0.53, sample one of the two, and play the blueprint's response to it."

Cheap, fast, lossy. Good enough for fast-path play; fails on edge cases.
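The formula above can be sketched in a few lines of Python. The function names are hypothetical, and the sampling step assumes the randomized version of the mapping described in this section.

```python
import random

def pseudo_harmonic_down_prob(x, a, b):
    """P(map bet x to the smaller size a), with a <= x <= b in pot fractions."""
    if not a <= x <= b or a == b:
        raise ValueError("need a <= x <= b with a < b")
    return (b - x) * (1 + a) / ((b - a) * (1 + x))

def translate(x, a, b, rng=random):
    """Sample one abstract size to stand in for the observed bet x."""
    return a if rng.random() < pseudo_harmonic_down_prob(x, a, b) else b

p_down = pseudo_harmonic_down_prob(0.73, 0.5, 1.0)
print(round(p_down, 3))            # 0.468
print(translate(0.73, 0.5, 1.0))   # 0.5 or 1.0, chosen at random
```

At the endpoints the mapping is deterministic: x = A gives probability 1 of rounding down and x = B gives probability 0.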

Section 5 · The real-time solver

Online · Continual Re-solving

Fresh CFR on a tiny subgame, every decision

This is the actual answer to "custom bet sizes on any flop, turn, river." When it's the AI's turn, it sets aside the blueprint's stored strategy for this spot and runs CFR from scratch on a depth-limited subgame, parameterized by:

Current public state (board, pot, stacks, betting history)
Both players' ranges — Bayesian posterior over hands given the action so far
The bet sizes you want available right now — this is where 0.37× pot lives
Leaf utilities estimated by either the blueprint or a learned value network

It runs for a few seconds, gets an approximate equilibrium for this spot with your bet menu, plays the action, throws the subgame away. Next decision: do it again.
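A toy version of this loop, as a sketch only: the game below is not a real NLH subgame but a single river decision with a made-up structure (P1 holds the nuts or air with equal probability, P2 holds a bluff-catcher, pot = 1). It shows that the bet size is just a parameter of the solve, here 0.37× pot. It uses plain CFR with simultaneous updates; all names are hypothetical.

```python
import numpy as np

def regret_matching(regrets):
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(len(regrets), 1.0 / len(regrets))

def solve_river_toy(bet, iters=50000):
    """P1 may check or bet `bet` (any pot fraction); P2 folds or calls when facing a bet."""
    # u1[hand, p1_action, p2_action] is P1's winnings; P2 receives the negative.
    u1 = np.array([
        [[1.0, 1.0], [1.0, 1.0 + bet]],  # nuts: check wins pot; bet wins pot or pot + bet
        [[0.0, 0.0], [1.0, -bet]],       # air: check wins nothing; bet wins pot or loses bet
    ])
    prior = np.array([0.5, 0.5])                 # chance probability of each P1 hand
    r1, s1 = np.zeros((2, 2)), np.zeros((2, 2))  # P1 regrets and strategy sums, per hand
    r2, s2 = np.zeros(2), np.zeros(2)            # P2 regrets and strategy sum facing a bet
    for _ in range(iters):
        sig1 = np.array([regret_matching(r) for r in r1])
        sig2 = regret_matching(r2)
        # P1: value of each action per hand against P2's current strategy.
        q1 = u1 @ sig2
        v1 = (q1 * sig1).sum(axis=1, keepdims=True)
        r1 += prior[:, None] * (q1 - v1)         # weighted by chance reach
        # P2: counterfactual values of fold/call, weighted by how often each hand bets.
        reach = prior * sig1[:, 1]
        q2 = -(reach @ u1[:, 1, :])
        r2 += q2 - q2 @ sig2
        s1 += sig1
        s2 += sig2
    return s1 / iters, s2 / iters                # average strategies

p1_avg, p2_avg = solve_river_toy(bet=0.37)
print("air bluffs:", round(float(p1_avg[1, 1]), 3), "| P2 calls:", round(float(p2_avg[1]), 3))
```

For this toy game the equilibrium is known in closed form: P1 bluffs air with frequency bet / (1 + bet) and P2 calls with frequency 1 / (1 + bet), so the averages should drift toward roughly 0.27 and 0.73 for a 0.37× pot bet. Changing `bet` re-solves the spot for a different size without retraining anything.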

DeepStack's leap: replace the blueprint entirely with a neural value function that estimates leaf utilities for any (board, ranges) input. That's what makes truly arbitrary bet sizing work without the blueprint as a crutch.

Section 6 · What you actually need to build

Decomposition of the project

Component	Purpose	Difficulty
Game engine	Apply actions, deal cards, compute showdown. Use OpenSpiel's universal_poker or PokerKit — don't write this yourself.	Medium
Blueprint trainer	Deep CFR on the FCHPA-abstracted game. Already in this repo's Deep CFR work, but slow.	Hard
Range tracker	Bayesian update of both players' hand-range distributions, conditioned on the betting history. Fiddly and conceptually subtle.	Hard
Real-time solver	CFR over a depth-limited subgame with whatever custom bet sizes you choose for this spot. The component that actually delivers your goal.	Hardest
Value network (optional)	Estimates counterfactual values at the depth limit so the re-solver doesn't have to roll out to the river. The DeepStack innovation.	Hardest

Without the real-time solver, you cannot achieve "any custom bet size on any street." A trained blueprint can never produce 0.37 × pot if it only ever trained on {0.5, 1.0, 2.0}. This is non-negotiable.
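The range tracker row in the table above boils down to one Bayes step per observed action. A minimal sketch, assuming you already have a strategy that gives P(action | hand); the three-hand range and the frequencies are invented for illustration.

```python
import numpy as np

def update_range(prior, action_probs):
    """Bayes rule: posterior(hand) is proportional to prior(hand) * P(observed action | hand)."""
    posterior = np.asarray(prior, dtype=float) * np.asarray(action_probs, dtype=float)
    total = posterior.sum()
    if total == 0:
        raise ValueError("observed action has zero probability under the assumed strategy")
    return posterior / total

# Hypothetical range and assumed betting frequencies per hand.
hands = ["AA", "KQ", "72"]
prior = [0.2, 0.5, 0.3]
p_bet_given_hand = [0.9, 0.3, 0.6]

posterior = update_range(prior, p_bet_given_hand)
print({h: round(float(p), 3) for h, p in zip(hands, posterior)})
# {'AA': 0.353, 'KQ': 0.294, '72': 0.353}
```

The subtle part in practice is where P(action | hand) comes from: it has to be the strategy the opponent is assumed to follow, and an action that the assumed strategy never takes leaves the posterior undefined, which is why the sketch raises instead of dividing by zero.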

Section 7 · Reading path

Read these in order — each builds on the last

Brown, Lerer, Gross & Sandholm 2019 — Deep CFR

Re-read this knowing it produces only the abstracted blueprint, not the final solver.

Moravčík et al. 2017 — DeepStack

Introduced continual re-solving + value networks for HUNL. The conceptual template for depth-limited re-solving with learned leaf values.

Brown & Sandholm 2018 / 2019 — Libratus & Pluribus

Blueprint plus nested subgame solving for HUNL and 6-max. The production-grade architecture.

Ganzfried & Sandholm 2013 — Action Translation

The pseudo-harmonic mapping. Short paper, immediately practical for Piece 2.

Open-source references: slumbot and cfrm for blueprint + abstraction; OpenHoldem/PokerSolver for solver structure; OpenSpiel universal_poker.cc for the engine.

Section 8 · Expectation & first move

Be honest about scope, then start small

Realistic timeline

2–6 months of focused work

For one person who already understands CFR. The neural net isn't the hard part — range tracking and the depth-limited solver are. Open-source references for the latter two are scarce.

First move

Build a Leduc re-solver in a weekend

Don't start with NLH. Leduc poker (6 cards) is small enough to write a tabular CFR + a depth-limited re-solver from scratch. Every concept above shows up at miniature scale, including range tracking. Once that works, scale the same architecture up to NLH — the math doesn't change, only the size.
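A possible starting point for the Leduc engine, as a sketch: the deck and the showdown rule fit in a few lines. The card encoding (rank letter plus a suit label "a" or "b") is an arbitrary choice for this note, and betting is left out entirely.

```python
from itertools import permutations

RANKS = "JQK"
DECK = [rank + suit for rank in RANKS for suit in "ab"]  # 6 cards: three ranks, two suits

def showdown(p1_card, p2_card, board):
    """Return +1 if P1 wins, -1 if P2 wins, 0 on a tie."""
    def strength(card):
        paired = card[0] == board[0]              # a pair with the board beats any high card
        return (paired, RANKS.index(card[0]))     # otherwise the higher rank wins
    s1, s2 = strength(p1_card), strength(p2_card)
    return (s1 > s2) - (s1 < s2)

deals = list(permutations(DECK, 3))  # ordered (P1 card, P2 card, board card)
print(len(deals))                    # 120
print(showdown("Ja", "Ka", "Jb"))    # 1: a pair of jacks beats king high
```

With only 120 ordered deals, every chance outcome can be enumerated exactly, which is what makes tabular CFR and exact range tracking feasible at this scale.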