Leduc-NL — Tabular CFR with Custom Bet Sizes

Step 2 of the progression. Same engine as Leduc, but bets come from a menu of pot fractions instead of fixed sizes.

Section 1 · The game

Leduc-NL — same cards as Leduc, no-limit bet sizes

Same six cards (J Q K, two of each), same single public board. The change: instead of a fixed $2/$4 bet, the actor picks a bet size from a menu of pot fractions. This is the simplest possible toy version of "no-limit" betting, and it forces the algorithm to confront action abstraction for the first time.

Stack: 10 chips each, plus a 1-chip ante (2 in the pot)
Bet menu: x · h · p · t · a (check, ½ pot, pot, 2× pot, all-in)
Facing a bet: f · c (fold or call; no raise, kept simple)
Sizes that exceed stack: collapse into all-in (if p = stack, only a is offered)
Streets: two, preflop and flop, with the board card revealed between them
Showdown: pair beats high card (pair = private rank matches the board)
What this teaches. Each decision now has up to 5 actions instead of 2–3. Some bet sizes become illegal as stacks shrink, so the action set is state-dependent. The infoset count roughly doubles (288 → 552), but convergence per iteration is essentially identical to fixed-bet Leduc (see Section 5). What grows is wall-clock time per iteration, not the iteration count.
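The state-dependent action set described above can be sketched in a few lines. This is an illustration under stated assumptions, not the page's actual engine: the function name `legal_actions`, its parameters, and the rounding rule (truncate toward zero) are all hypothetical, and any sized bet that meets or exceeds the remaining stack is dropped in favor of `a`.

```python
from fractions import Fraction

# Pot-fraction menu from the rules above; 'a' (all-in) is handled separately.
BET_FRACTIONS = {"h": Fraction(1, 2), "p": Fraction(1), "t": Fraction(2)}


def legal_actions(pot: int, stack: int, to_call: int = 0) -> dict[str, int]:
    """Map each legal action code to the chips it commits (hypothetical helper)."""
    if to_call > 0:
        # Facing a bet: fold or call only, no raise.
        return {"f": 0, "c": min(to_call, stack)}
    actions = {"x": 0}  # checking is always available when not facing a bet
    for code, fraction in BET_FRACTIONS.items():
        amount = int(fraction * pot)  # assumption: round down to whole chips
        if 0 < amount < stack:
            actions[code] = amount  # sizes >= stack collapse into all-in
    if stack > 0:
        actions["a"] = stack
    return actions


print(legal_actions(pot=2, stack=10))   # {'x': 0, 'h': 1, 'p': 2, 't': 4, 'a': 10}
print(legal_actions(pot=10, stack=6))   # {'x': 0, 'h': 5, 'a': 6}
print(legal_actions(pot=4, stack=9, to_call=2))  # {'f': 0, 'c': 2}
```

The second call shows the shrinking menu: with 10 in the pot and 6 behind, both the pot-sized and 2× pot bets would exceed the stack, so only check, ½ pot, and all-in remain.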

Section 2 · What changed from Leduc

The CFR algorithm is identical — only the action set changes

Leduc (fixed bets)
Bet menu: x · b · r
Two betting rounds, $2 then $4. Max 1 raise. ~288 infosets total. Strategy at any infoset is a length-2 or length-3 vector. Converges in seconds.
Leduc-NL (custom sizes)
Bet menu: x · h · p · t · a
Same two streets, same showdown. Bet size now depends on current pot, not a fixed amount. Stacks shrink as betting grows; oversized bets collapse into all-in. ~552 infosets (roughly 2× Leduc). Same CFR, same convergence rate per iteration.
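One way to see why the algorithm does not change: regret matching never needs to know how many actions exist globally, only which ones are legal at the current infoset. The sketch below keys regrets by action code, so a 2-action infoset and a 5-action infoset go through the same function. The name `regret_matching` and the dict-based layout are illustrative assumptions, not the page's implementation.

```python
def regret_matching(regrets: dict[str, float]) -> dict[str, float]:
    """Turn cumulative regrets into a strategy over whatever actions are present."""
    positive = {action: max(regret, 0.0) for action, regret in regrets.items()}
    total = sum(positive.values())
    if total > 0:
        # Play each action in proportion to its positive regret.
        return {action: value / total for action, value in positive.items()}
    # No positive regret yet: fall back to uniform over the legal actions.
    uniform = 1.0 / len(regrets)
    return {action: uniform for action in regrets}


# A short-stacked infoset with three legal actions.
print(regret_matching({"x": 1.0, "h": 3.0, "a": -2.0}))  # {'x': 0.25, 'h': 0.75, 'a': 0.0}
# An infoset facing a bet, with no positive regret accumulated.
print(regret_matching({"f": -1.0, "c": 0.0}))            # {'f': 0.5, 'c': 0.5}
```

Storing regrets per action code rather than in a fixed-length array is a design choice that keeps illegal sizes out of the strategy automatically; a fixed-length array with a legality mask would work as well.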

Section 3 · Convergence

Exploitability over training iterations

Same exploitability metric as in Leduc: the best-response value against the opponent's average strategy, averaged over both players. CFR guarantees an O(1/√T) upper bound on this quantity, so the curve should fall at least that fast.
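For reference, the metric and the rate above can be written out as follows. This is a sketch in standard CFR notation, none of which appears on this page: $\bar\sigma^T$ is the average strategy after $T$ iterations, $u_i$ is player $i$'s expected payoff, $\Delta$ is the payoff range, $|\mathcal{I}|$ is the larger of the two players' infoset counts, and $|A|$ is the largest number of actions at any infoset. The bound is the usual one for vanilla CFR with full tree traversals in a two-player zero-sum game.

```latex
\varepsilon(\bar\sigma^T)
  = \tfrac{1}{2}\Bigl[\,\max_{\sigma_1'} u_1(\sigma_1', \bar\sigma_2^T)
                     + \max_{\sigma_2'} u_2(\bar\sigma_1^T, \sigma_2')\Bigr]
  \;\le\; \frac{\Delta\,|\mathcal{I}|\,\sqrt{|A|}}{\sqrt{T}}
```

Under this bound, halving the guaranteed exploitability takes roughly four times as many iterations. The larger $|\mathcal{I}|$ and $|A|$ of Leduc-NL loosen the worst-case constant, not the $1/\sqrt{T}$ shape.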

[Chart: exploitability (mbb/g) against training iterations, with a 1/√T theoretical reference curve. Y axis is log scale, X axis is linear.]

Section 4 · Average strategy

What sizings the bot picks at each infoset

Player | Card | R1 Hist | Board | R2 Hist | Pot · Stack | Strategy
Action codes. x check · h ½ pot · p pot · t 2× pot · a all-in · c call · f fold
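To read a history string made of these codes, it helps to replay it and track chips. The sketch below is a hypothetical helper, not the page's code, and it rests on several assumptions: each player has 10 chips behind after posting the 1-chip ante, player 0 acts first on every street, bet sizes are fractions of the pot at the moment of the bet, and oversized bets are clamped to the remaining stack.

```python
ANTE, START_STACK = 1, 10  # assumption: 10 chips behind after the ante
FRACTIONS = {"h": (1, 2), "p": (1, 1), "t": (2, 1)}  # numerator, denominator


def replay(histories: list[str]) -> tuple[int, tuple[int, int]]:
    """Replay per-street history strings; return (pot, (stack0, stack1))."""
    pot = 2 * ANTE
    stacks = [START_STACK, START_STACK]
    for history in histories:
        player, to_call = 0, 0  # assumption: player 0 opens each street
        for code in history:
            if code in FRACTIONS:
                numerator, denominator = FRACTIONS[code]
                amount = min(pot * numerator // denominator, stacks[player])
            elif code == "a":
                amount = stacks[player]
            elif code == "c":
                amount = min(to_call, stacks[player])
            else:  # 'x' and 'f' commit no chips
                amount = 0
            stacks[player] -= amount
            pot += amount
            to_call = amount if code in "hpta" else 0
            player = 1 - player
    return pot, (stacks[0], stacks[1])


print(replay(["hc"]))        # (4, (9, 9))
print(replay(["pc", "h"]))   # (9, (5, 8))
```

In the second example, a pot-sized bet and call on round 1 leave 6 in the pot and 8 behind each; the ½ pot bet on round 2 then costs 3 chips.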

Section 5 · What you should see

Sanity checks for "is this converging?"

Why this is the right step. By solving Leduc-NL you have now seen the same problem no-limit hold'em (NLH) presents (a bet menu that can be anything, with state-dependent legal sets) at toy scale, where you can debug it. The next step, leduc-nl-resolver.html, takes one specific spot from this game and re-solves it on demand with a custom bet set chosen at runtime. That is the architecture Pluribus and DeepStack actually run.