Section 1 · The game
Leduc Hold'em — 6 cards, 2 streets, fixed bets
Leduc is the standard research toy poker game one rung above Kuhn. It has the structural pieces Kuhn lacks — a public board card, two betting rounds, real range evolution — while staying small enough to solve tabularly in a browser tab.
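| Rule | Value | Note |
|---|---|---|
| Deck | 6 cards · J Q K (×2 each) | Suit doesn't matter |
| Private cards | 1 each | Drawn without replacement |
| Public board | 1 card · revealed after round 1 | From the remaining deck |
| Betting | Bet size 2 chips in round 1 · 4 chips in round 2 | One bet plus at most one raise per round (cap) |
| Ante | 1 chip per player | Pot starts at 2 |
| Showdown | Pair beats high card | Pair = private card matches board; otherwise the higher card wins and equal cards split the pot |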
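The rules above are small enough to write down directly. The following is a minimal sketch, not the page's actual engine: `DECK`, `RANK_VALUE`, `showdown`, and `allDeals` are illustrative names, and cards are represented by rank only because suit doesn't matter.

```javascript
// Ranks only: the rules state that suit doesn't matter.
const DECK = ["J", "J", "Q", "Q", "K", "K"];
const RANK_VALUE = { J: 0, Q: 1, K: 2 };

// Returns +1 if player 0 wins, -1 if player 1 wins, 0 on a split pot.
function showdown(card0, card1, board) {
  const pair0 = card0 === board;
  const pair1 = card1 === board;
  if (pair0 !== pair1) return pair0 ? 1 : -1; // a pair beats any high card
  // With two copies of each rank, both players cannot pair the board,
  // so from here on it is high card against high card.
  return Math.sign(RANK_VALUE[card0] - RANK_VALUE[card1]);
}

// Enumerate every ordered deal [private0, private1, board],
// drawing three distinct deck positions (no replacement).
function allDeals() {
  const deals = [];
  for (let i = 0; i < DECK.length; i++)
    for (let j = 0; j < DECK.length; j++)
      for (let k = 0; k < DECK.length; k++)
        if (i !== j && i !== k && j !== k)
          deals.push([DECK[i], DECK[j], DECK[k]]);
  return deals;
}
```

For example, `showdown("J", "K", "J")` is `1` because the paired jack beats king-high, and `allDeals()` yields 6 × 5 × 4 = 120 equally likely deals.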
Why Leduc. The game has 936 information sets when the six cards are tracked individually, or 288 once the two copies of each rank are merged. Tabular CFR converges to ε < 0.01 in roughly 5,000–10,000 iterations, which takes a few seconds in JavaScript. Leduc is also a standard small benchmark in the CFR research literature.
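The information-set count can be checked by enumeration. This is a minimal sketch that assumes a cap of one bet plus one raise per round; `walkRound` and `countInfoSets` are illustrative names, not part of the page's engine. Folds end the hand, so they never create a new decision point and are not recorded.

```javascript
const MAX_BETS = 2; // assumed cap: one bet plus one raise per round

// Walk one betting round. "c" = check or call, "b" = bet or raise.
// `decisions` collects histories where a player must act;
// `continuations` collects histories that close the round without a fold.
function walkRound(h, bets, decisions, continuations) {
  decisions.push(h);
  if (bets === 0) {
    // A check closes the round only if the opponent already checked.
    if (h === "c") continuations.push(h + "c");
    else walkRound(h + "c", 0, decisions, continuations);
  } else {
    // A call always closes the round.
    continuations.push(h + "c");
  }
  // Bet or raise while the cap allows it.
  if (bets < MAX_BETS) walkRound(h + "b", bets + 1, decisions, continuations);
}

// privateCards: distinguishable private holdings.
// boardsPerPrivate: distinguishable boards for a given private holding.
function countInfoSets(privateCards, boardsPerPrivate) {
  const decisions = [];
  const continuations = [];
  walkRound("", 0, decisions, continuations);
  const round1 = privateCards * decisions.length;
  const round2 =
    privateCards * boardsPerPrivate * continuations.length * decisions.length;
  return round1 + round2;
}

console.log(countInfoSets(6, 5)); // six distinct cards: 36 + 900 = 936
console.log(countInfoSets(3, 3)); // ranks only: 18 + 270 = 288
```

Each round has 6 decision histories and 5 ways to close without a fold, which is where both totals come from.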
Section 2 · Convergence
Exploitability over training iterations
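Exploitability is the average, over the two players, of each player's best-response value against the opponent's current average strategy. At a Nash equilibrium it equals zero; the average strategy produced by CFR is guaranteed to approach zero at a worst-case rate of O(1/√T), where T is the number of iterations.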
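Written out for a two-player zero-sum game, the definition above takes the following form. The notation is an assumption for illustration: $\bar\sigma_i^T$ is player $i$'s average strategy after $T$ iterations, $u_i$ is player $i$'s expected payoff, and $C$ is a game-dependent constant.

```latex
\varepsilon(\bar\sigma^T)
  = \frac{1}{2}\left[
      \max_{\sigma_1'} u_1(\sigma_1', \bar\sigma_2^T)
    + \max_{\sigma_2'} u_2(\bar\sigma_1^T, \sigma_2')
    \right],
\qquad
\varepsilon(\bar\sigma^T) \le \frac{C}{\sqrt{T}}
```

The bracket is never negative in a zero-sum game, and it is zero exactly when neither player can gain by deviating.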
[Chart: exploitability (mbb/g) against training iterations, with a 1/√T theoretical reference curve. Y axis on a log scale, X axis linear.]
Section 3 · Average strategy
What the bot plays after training
| Player | Card | R1 Hist | Board | R2 Hist | Strategy | Visits |
|---|---|---|---|---|---|---|
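Each row of this table corresponds to one information set, and the Strategy column is the normalised sum of the strategies played there over all iterations. The sketch below shows one way to produce it; the key layout and the names `infoSetKey` and `averageStrategy` are hypothetical and only mirror the table columns.

```javascript
// Hypothetical key layout mirroring the table columns.
// In round 1 there is no board yet, so board and r2Hist may be null.
function infoSetKey(player, card, r1Hist, board, r2Hist) {
  return [player, card, r1Hist, board ?? "-", r2Hist ?? ""].join("|");
}

// Normalise accumulated strategy weights into action probabilities.
// An information set that was never visited falls back to uniform play.
function averageStrategy(strategySum) {
  const total = strategySum.reduce((a, b) => a + b, 0);
  if (total <= 0) return strategySum.map(() => 1 / strategySum.length);
  return strategySum.map((w) => w / total);
}
```

For instance, `averageStrategy([1, 3])` gives `[0.25, 0.75]`. The average strategy, not the final iteration's strategy, is the one that carries CFR's convergence guarantee.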
Section 4 · What you should see
Sanity checks for "is this converging?"
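- Exploitability falls roughly as 1/√T. That is a straight line on log-log axes; on this chart (log Y, linear X) it shows up as a steep early drop that gradually flattens. A few thousand iterations should take it from ~1000 mbb/g down toward single digits. If it stalls well above that or trends upward, something is wrong.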
- K with K-on-board is a near-deterministic raise. Once a king is on the board and you also hold the king, you have the only possible pair — strategy should converge to almost-always raise.
- J without a pair is a near-deterministic fold to bets in round 2. It is the lowest card with no pair, so a call is a pure bluff-catch that can at best tie the other J at showdown.
- Q is the bluff/value boundary. Q opens with mixed strategies that don't go to 0 or 1 — this is where the interesting GTO frequencies live.
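The first check can be made numeric by fitting a slope to the exploitability curve on log-log axes. This is a minimal sketch with an illustrative helper name, `logLogSlope`; it assumes exploitability has been sampled at a handful of iteration counts and that all samples are positive.

```javascript
// Least-squares slope of log(exploitability) against log(iteration).
function logLogSlope(iterations, exploitability) {
  const xs = iterations.map((t) => Math.log(t));
  const ys = exploitability.map((e) => Math.log(e));
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let num = 0;
  let den = 0;
  for (let i = 0; i < n; i++) {
    num += (xs[i] - meanX) * (ys[i] - meanY);
    den += (xs[i] - meanX) ** 2;
  }
  return num / den;
}

// Synthetic data that follows 1/sqrt(T) exactly.
const its = [100, 1000, 10000];
const expl = its.map((t) => 1000 / Math.sqrt(t));
console.log(logLogSlope(its, expl)); // approximately -0.5
```

A slope of about −0.5 matches the theoretical bound, and a steeper slope is fine since the bound is a worst case. A slope near zero or above suggests a bug in the regret or average-strategy updates.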
Next step. Once exploitability sits below ~10 mbb/g, this blueprint is solid. The next page in the progression — leduc-nl-cfr.html — will replace the fixed bets with custom sizes (½ pot, pot, 2× pot, all-in), reusing the same engine and the same CFR loop.
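The custom sizes mentioned here reduce to simple pot arithmetic. The sketch below is one possible reading, not the design of the next page: `SIZES`, `betAmounts`, and the idea of a per-player `stack` are all assumptions, since the fixed-bet game above has no stacks.

```javascript
// Hypothetical sizing menu as pot fractions; Infinity stands in for all-in.
const SIZES = [0.5, 1, 2, Infinity];

// Convert the menu into distinct chip amounts, capped by the remaining stack.
// Assumes pot > 0, which always holds here because of the antes.
function betAmounts(pot, stack) {
  const amounts = SIZES.map((f) => Math.min(Math.round(f * pot), stack));
  return [...new Set(amounts)].filter((a) => a > 0);
}

console.log(betAmounts(4, 20)); // [2, 4, 8, 20]
console.log(betAmounts(10, 12)); // [5, 10, 12], since 2× pot and all-in collapse
```

Deduplicating matters for the CFR tables: two sizes that resolve to the same chip amount should map to a single action rather than splitting regret between identical branches.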