Inside postflop-solver

An open-source Rust library that out-runs PioSOLVER: algorithm, architecture, engineering
Language Rust
Algorithm Customized DCFR
Abstraction None (real cards)
License AGPLv3
Status Suspended Oct 2023
Author Wataru Inariba
The single takeaway

One developer, ~10k lines of Rust, AGPL-licensed — and per the author's benchmarks, faster than PioSOLVER and GTO+. The whole thing is a customized Discounted CFR running over the actual 1326-hand combo space, no card abstraction.

Commercial postflop solvers cost hundreds of dollars and run on proprietary code most users never see. This repo is the entire engine — game tree builder, hand evaluator, DCFR loop, isomorphism reductions, SIMD-tuned hot paths, optional 16-bit compression — sitting in src/. It's the backend of WASM Postflop (in-browser) and Desktop Postflop, and it's the closest thing to a reference implementation of a modern NLH solver that the public has.

Section 0 · The headline numbers

Scope of what one Rust crate is doing

1326
Hand combos per player, per node
0%
Card abstraction (full combo resolution)
DCFR
α = 1.5, β = 0, γ = 3, reset@4ⁿ
16-bit
Optional compressed storage (~2× memory savings vs f32)

A typical full-turn solve on a single spot tracks per-action regret and strategy for up to ~1300 hand combos × ~6 actions at every decision node. At 4 bytes per value (f32) that is roughly 62 KB per node for the two arrays, or half that with 2-byte i16 storage, so a tree with tens of thousands of decision nodes reaches multiple gigabytes per spot. That's why "memory layout" is a real engineering concern, not a footnote.
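The arithmetic behind that estimate can be sketched in a few lines. This is a back-of-the-envelope helper, not code from the library: the function names are made up for illustration, and it assumes exactly two arrays per decision node (cumulative regret and cumulative strategy) and a uniform action count.

```rust
/// Rough storage estimate for one solve, in bytes.
/// Assumption: every decision node holds two arrays (cumulative regret and
/// cumulative strategy), each with `hands * actions` values.
fn estimate_bytes(hands: u64, actions: u64, nodes: u64, bytes_per_value: u64) -> u64 {
    const ARRAYS_PER_NODE: u64 = 2; // regret + strategy (illustrative assumption)
    hands * actions * nodes * ARRAYS_PER_NODE * bytes_per_value
}

/// Converts a byte count to GiB for display.
fn to_gib(bytes: u64) -> f64 {
    bytes as f64 / (1u64 << 30) as f64
}
```

Calling `estimate_bytes(1300, 6, 100_000, 4)` with a hypothetical 100,000 decision nodes gives 6,240,000,000 bytes for f32 storage; passing `2` as the last argument models i16 storage and halves the result.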

Section 1 · What it does

Inputs in, GTO strategy out

You hand it a fully specified postflop spot — ranges, board, stacks, allowed bet sizes — and it returns, for every reachable decision node and every hand a player could hold there, the mixing frequencies that approximate the Nash equilibrium of the configured subgame.

Input : (OOP range, IP range, flop[, turn[, river]], pot, stacks, bet-size menu, rake)
Output : strategy[node][action][hand]   →   play frequency
           expected_values[node][hand]   →   per-hand EV (chips)
           equity[node][hand]   →   raw equity vs opponent range

Where most beginners assume "GTO solver = magic," this library is explicit about what it actually computes: an ε-Nash equilibrium of an abstracted-action subgame. The abstraction is purely over bet sizes (you choose a menu like 60%, geometric, all-in), not over cards — every one of the 1326 hand combos is tracked individually.

Section 2 · The algorithm

Customized Discounted CFR

The algorithm is Discounted CFR (Brown & Sandholm, 2019) — a variant of CFR that down-weights old regrets and old strategies so the running averages don't carry early-iteration noise. The README admits to two deviations from the paper's recommended parameters, both visible in src/solver.rs:

| Param | Role | Paper | This solver | Why it matters |
| --- | --- | --- | --- | --- |
| α | Discount on past positive regrets | 1.5 | 1.5 ✓ | Good actions accumulate fast |
| β | Discount on past negative regrets | 0 (weight → ½) | 0.5 (constant) ≡ β=0 | Bad actions decay by half each pass |
| γ | Discount on past cumulative strategy | 2 | 3 | More aggressive recency bias |
| reset | Cumulative-strategy reset | none | at iterations 1, 4, 16, 64… | Effectively a windowed average |
| order | Player update scheduling | simultaneous | alternating | CFR+ style, helps convergence |

Where this lives: src/solver.rs lines 11–37 build the per-iteration DiscountParams struct. The power-of-4 reset is computed via 1 << ((x.leading_zeros() ^ 31) & !1), a one-line bit-twiddle that yields the largest power of 4 ≤ the current iteration. The parentheses matter: in Rust, & binds tighter than ^.
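To make the table concrete, here is a minimal sketch of how the per-iteration weights could be computed from the paper's formulas with this solver's parameters. It is not a copy of src/solver.rs: the names `largest_power_of_4_at_most`, `DiscountWeights`, and `discount_weights` are hypothetical, and the crate's exact iteration indexing may differ by one.

```rust
/// Largest power of 4 that is <= x (returns 0 for x == 0).
fn largest_power_of_4_at_most(x: u32) -> u32 {
    if x == 0 {
        return 0;
    }
    // `leading_zeros() ^ 31` is the index of the highest set bit (floor(log2 x));
    // `& !1` rounds that index down to an even number, which gives a power of 4.
    1 << ((x.leading_zeros() ^ 31) & !1)
}

/// Multipliers applied to the stored sums before adding the new iteration.
struct DiscountWeights {
    alpha_t: f64, // applied to positive cumulative regrets
    beta_t: f64,  // applied to negative cumulative regrets
    gamma_t: f64, // applied to the cumulative strategy
}

/// Weights for iteration `t` with alpha = 1.5, beta = 0, gamma = 3 and a
/// cumulative-strategy reset whenever `t` is a power of 4.
fn discount_weights(t: u32) -> DiscountWeights {
    let tf = t as f64;
    let pow_alpha = tf.powf(1.5);
    // Iterations elapsed since the most recent reset point.
    let since_reset = (t - largest_power_of_4_at_most(t)) as f64;
    DiscountWeights {
        alpha_t: pow_alpha / (pow_alpha + 1.0),
        beta_t: 0.5, // t^0 / (t^0 + 1) is 1/2 for every t
        gamma_t: (since_reset / (since_reset + 1.0)).powi(3),
    }
}
```

At a reset iteration such as 16 the strategy weight is zero, so the old cumulative strategy is wiped; one iteration later it is (1/2)³ = 0.125 and it climbs toward 1 until the next power of 4.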

For background on CFR / CFR+ / DCFR / MCCFR specifically, see CFR Simulation, CFR+, DCFR, and MCCFR on this site.

Section 3 · The pipeline

Build · Solve · Query — three stages, one struct

STAGE 01 · BUILD
Tree + game state
CardConfig + TreeConfig → ActionTree::new() enumerates every action sequence. PostFlopGame::with_config attaches ranges, hand strengths, and isomorphism maps.
STAGE 02 · SOLVE
DCFR iteration
allocate_memory() reserves per-node regret/strategy arrays. solve() runs alternating-update DCFR, checking exploitability every 10 iters. finalize() emits the time-averaged strategy.
STAGE 03 · QUERY
Read-only inspection
Navigate via play() / back_to_root(). strategy(), expected_values(), equity(), and private_cards() expose the GTO frequencies, EVs, equities, and the hand list at every node.
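End-to-end in a handful of lines (card_config and tree_config are built beforehand):
// Enumerate every action sequence the bet-size menu allows.
let action_tree = ActionTree::new(tree_config).unwrap();
let mut game = PostFlopGame::with_config(card_config, action_tree).unwrap();
game.allocate_memory(false); // false = plain f32 storage, true = 16-bit compression
// Stop after 1000 iterations or once exploitability drops below 0.5% of the pot.
let target = game.tree_config().starting_pot as f32 * 0.005;
let exploit = solve(&mut game, 1000, target, true); // true = print progress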

Section 4 · Inside the recursion

Three node types, three things to do

The solver's core is solve_recursive() — a DFS over the action tree carrying down a vector of counterfactual reach probabilities (probability the opponent would have arrived at this node, per opponent hand). What happens at each node depends on its type:

Decision

Player to act

Compute the current strategy from cumulative regrets via regret matching (max(R,0)/Σ). Recurse into each action to get per-action counterfactual values. Update cum_regret (weighted by α or β depending on sign) and cum_strategy (weighted by γ). Return the strategy-weighted sum as the node's CFV.
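The decision-node step can be sketched for a single hand as follows. This is a simplified illustration under stated assumptions, not the library's code: `regret_matching` and `update_node` are hypothetical names, the real solver processes all hands of a node as arrays, and the weighting of the strategy sum by the player's own reach is left out.

```rust
/// Regret matching: play each action in proportion to its positive cumulative
/// regret, falling back to uniform when no action has positive regret.
fn regret_matching(cum_regret: &[f32]) -> Vec<f32> {
    let positive_sum: f64 = cum_regret.iter().map(|&r| r.max(0.0) as f64).sum();
    if positive_sum > 0.0 {
        cum_regret
            .iter()
            .map(|&r| (r.max(0.0) as f64 / positive_sum) as f32)
            .collect()
    } else {
        vec![1.0 / cum_regret.len() as f32; cum_regret.len()]
    }
}

/// One discounted update for a single hand at a decision node.
/// `action_cfv[a]` is the counterfactual value returned by recursing into action `a`.
/// Returns the node's counterfactual value under the current strategy.
fn update_node(
    cum_regret: &mut [f32],
    cum_strategy: &mut [f32],
    action_cfv: &[f32],
    alpha_t: f32,
    beta_t: f32,
    gamma_t: f32,
) -> f32 {
    let strategy = regret_matching(cum_regret);
    let node_cfv: f32 = strategy.iter().zip(action_cfv).map(|(s, v)| s * v).sum();
    for a in 0..cum_regret.len() {
        // Old positive regrets are scaled by alpha_t, old negative ones by beta_t.
        let discount = if cum_regret[a] >= 0.0 { alpha_t } else { beta_t };
        cum_regret[a] = cum_regret[a] * discount + (action_cfv[a] - node_cfv);
        // The running strategy sum is scaled by gamma_t before the new term is added.
        cum_strategy[a] = cum_strategy[a] * gamma_t + strategy[a];
    }
    node_cfv
}
```

With cumulative regrets `[3.0, -1.0, 1.0]`, regret matching yields `[0.75, 0.0, 0.25]`: the negative-regret action is not played at all, and the other two split in a 3:1 ratio.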

Chance

Card dealt

Scale incoming reach by 1 / chance_factor (accounting for the random card draw blocked by both ranges). Recurse into each possible card — but skip isomorphic suits and reuse their result via a precomputed swap-list. Sum children to get the node's CFV.

Terminal

Fold / Showdown

Fold: ±pot share weighted by opponent reach. Showdown: pre-sorted hand evaluator (hand_table.rs) compares every player hand against every opponent hand in O(n) instead of O(n²), using SIMD-friendly access patterns.
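The O(n) showdown idea can be illustrated with a two-pointer sweep over hands pre-sorted by strength. The sketch below is an assumption-laden simplification rather than the code in hand_table.rs: `RankedHand` and `showdown_values` are hypothetical names, and card removal (a player hand blocking some opponent combos) is ignored, which a real evaluator has to correct for.

```rust
/// A hand reduced to what the showdown needs: a strength rank (higher wins)
/// and the reach probability of holding it (only used for opponent hands).
#[derive(Clone, Copy)]
struct RankedHand {
    strength: u32,
    reach: f32,
}

/// For each player hand, returns
/// `payoff * (reach of weaker opponent hands - reach of stronger opponent hands)`.
/// Both slices must be sorted by ascending strength. Ties contribute zero.
/// Simplification: blockers are ignored.
fn showdown_values(player: &[RankedHand], opponent: &[RankedHand], payoff: f64) -> Vec<f32> {
    let total: f64 = opponent.iter().map(|h| h.reach as f64).sum();
    let mut out = Vec::with_capacity(player.len());
    let mut weaker = 0.0f64; // reach of opponent hands strictly weaker
    let mut not_stronger = 0.0f64; // reach of opponent hands weaker or tied
    let (mut i, mut j) = (0, 0);
    for hand in player {
        // Both cursors only ever move forward, so the total work is linear.
        while i < opponent.len() && opponent[i].strength < hand.strength {
            weaker += opponent[i].reach as f64;
            i += 1;
        }
        while j < opponent.len() && opponent[j].strength <= hand.strength {
            not_stronger += opponent[j].reach as f64;
            j += 1;
        }
        let stronger = total - not_stronger;
        out.push((payoff * (weaker - stronger)) as f32);
    }
    out
}
```

Because neither cursor is ever rewound, the cost is proportional to the number of player hands plus the number of opponent hands, instead of their product.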

Note: the recursion is fully deterministic — no Monte Carlo sampling. Every chance outcome and every legal action is traversed every iteration. The library has no MCCFR variant; the speed comes from cheap-per-iteration work, not from sampling.

Section 5 · The performance tricks

Why a Rust hobby project can out-run paid solvers

SIMD

Hand-tuned vector instructions

Hot loops (regret updates, weighted sums, hand evaluation) are written so the compiler emits SIMD on x86, ARM, and WASM (v128). The author explicitly reviews the assembly output to verify the kernels vectorize.

Multithreading

Rayon parallel recursion

At every node, the DFS into child subtrees runs in parallel across CPU cores via Rayon's into_par_iter. Each thread gets its own subtree to recurse through; the regret/strategy arrays are atomically scattered back.

Isomorphism

Skip equivalent suits

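If the flop has any suit symmetry (a monotone flop is the clearest case), the solver computes the turn/river for only one representative suit and remaps the result onto the equivalent suits. On a monotone flop the three off-suits collapse into one, so the 49 possible turn cards shrink to 23 distinct ones (10 in the flop suit plus 13 representatives), as long as the ranges treat those suits identically.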
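A small sketch shows how much work suit symmetry can remove. It counts the next-street cards that need their own subtree when two suits are treated as interchangeable whenever they contribute the same ranks to the board. The card encoding and function names are illustrative assumptions, and the count is only valid when both ranges are symmetric under the same suit swaps.

```rust
/// A card is (rank, suit) with rank in 0..13 and suit in 0..4 (illustrative encoding).
type Card = (u8, u8);

/// Bitmask of the ranks each suit contributes to the board.
fn suit_rank_masks(board: &[Card]) -> [u16; 4] {
    let mut masks = [0u16; 4];
    for &(rank, suit) in board {
        masks[suit as usize] |= 1u16 << rank;
    }
    masks
}

/// Number of next-street cards that need their own subtree when suits with
/// identical board footprints are treated as interchangeable.
/// Assumption: both players' ranges are symmetric under those suit swaps.
fn distinct_next_cards(board: &[Card]) -> usize {
    let masks = suit_rank_masks(board);
    let mut count = 0;
    for suit in 0..4 {
        // Skip this suit if an earlier suit has the same board footprint.
        if (0..suit).any(|s| masks[s] == masks[suit]) {
            continue;
        }
        // Every rank of this suit that is not already on the board is a candidate.
        count += 13 - masks[suit].count_ones() as usize;
    }
    count
}
```

For a monotone flop such as `[(12, 0), (8, 0), (3, 0)]` this returns 23 instead of 49, while a rainbow flop such as `[(12, 0), (11, 1), (10, 2)]` has no interchangeable suits and stays at 49.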

Precision

f32 storage, f64 summation

Everything that's stored (regrets, strategies, CFVs) is f32, but every running sum is accumulated in f64 so that rounding error and cancellation don't build up over thousands of additions. That gives half the memory traffic of f64 storage with far less numerical drift than pure f32 arithmetic.
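The effect of the wider accumulator is easy to reproduce in isolation. The two helpers below are a generic illustration with made-up names, not functions from the library: both read the same f32 data, and only the type of the running sum differs.

```rust
/// Sums f32 values with an f32 accumulator; rounding error builds up
/// as the running total grows much larger than each addend.
fn sum_in_f32(values: &[f32]) -> f32 {
    let mut acc = 0.0f32;
    for &v in values {
        acc += v;
    }
    acc
}

/// Sums the same f32 values with an f64 accumulator and rounds once at the end.
fn sum_in_f64(values: &[f32]) -> f32 {
    let mut acc = 0.0f64;
    for &v in values {
        acc += v as f64;
    }
    acc as f32
}
```

Summing one million copies of `0.1f32` is a convenient test case: the f64-accumulated result lands much closer to the exact total than the f32-accumulated one, even though the inputs and the final output are f32 in both cases.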

Compression

Optional 16-bit storage

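Per-node regret/strategy can be stored as i16 + one f32 scaling factor per array. That is ~2× memory savings over f32 (2 bytes per value instead of 4), paid for in precision: about 15 bits of magnitude per value instead of f32's 24-bit significand. A 5 GB turn solve shrinks to roughly 2.5 GB.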
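One plausible way to implement "i16 plus one f32 scale per array" is sketched below. The scheme (scale chosen so the largest magnitude maps to `i16::MAX`) and the names `compress` and `decompress` are assumptions for illustration; the library's actual encoding may differ in its details.

```rust
/// Quantizes a slice to i16 values plus one shared f32 scale factor.
/// Assumed scheme: the largest magnitude in the slice maps to i16::MAX.
fn compress(values: &[f32]) -> (Vec<i16>, f32) {
    let max_abs = values.iter().fold(0.0f32, |m, &v| m.max(v.abs()));
    if max_abs == 0.0 {
        // All zeros: any scale works, so pick 1.0.
        return (vec![0; values.len()], 1.0);
    }
    let scale = max_abs / i16::MAX as f32;
    // `as i16` saturates on overflow in Rust, so stray rounding cannot wrap around.
    let quantized = values.iter().map(|&v| (v / scale).round() as i16).collect();
    (quantized, scale)
}

/// Restores approximate f32 values from the quantized form.
fn decompress(quantized: &[i16], scale: f32) -> Vec<f32> {
    quantized.iter().map(|&q| q as f32 * scale).collect()
}
```

Each stored value takes 2 bytes instead of 4, and the round-trip error per value is bounded by about half of `scale`, so arrays with one huge outlier lose the most resolution on their small entries.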

Custom alloc

Stack-based scratch allocator

The optional custom-alloc feature (nightly Rust) routes the temporary vectors created on each recursive call through a stack-style bump allocator instead of the global allocator. A solve allocates millions of tiny temporary vectors; this avoids hammering jemalloc or the system malloc.

Bunching

Correct multi-fold modeling

Optionally accounts for the bunching effect — that folded preflop players removed cards non-uniformly, biasing the postflop deck. Most commercial solvers either ignore this or use heuristics; this one counts combinations correctly. Slow when enabled.

Engine

No card abstraction

Every hand combo is tracked individually. The library refuses to bucket hands — it just makes the underlying combo-level computation fast enough that you don't need to. This is the cleanest possible solver semantics: no information lost to abstraction.

Section 6 · Bonus use-case

It quietly emits perfect value-network training data

Every solved node persists the per-hand counterfactual values used during the DCFR update. The public API expected_values_detail(player) exposes them, post-normalized to per-hand EVs in chips. This is exactly the target signal that DeepStack and ReBeL train neural value networks against.

Input : (board, OOP range, IP range, pot, stacks)
Output: oop_cfv [1326-vector], ip_cfv [1326-vector]

So a dataset generator built on this library looks like: sample a public state → solve to low exploitability → walk to nodes of interest → call expected_values_detail(0) and expected_values_detail(1) → write tensors to disk. Repeat at scale, train a value net, you have a depth-limited re-solver. (See Building a Custom Bet-Size Solver for the surrounding architecture.)

Section 7 · So why is this impressive

One person, one repo, one strong claim — and the code earns it

Section 8 · References

Where to read more

01

b-inary / postflop-solver

The repository itself. README documents the algorithm choices; src/solver.rs is the entry point for the DCFR loop.

02

Brown & Sandholm 2019 — Solving Imperfect-Information Games via Discounted Regret Minimization

The DCFR paper. Provides the α/β/γ scheme this solver builds on. Recommends γ=2; this codebase uses γ=3.

03

WASM Postflop

The in-browser front-end. Compiles this library to WebAssembly + WASM SIMD. Live at wasm-postflop.pages.dev.

04

Desktop Postflop

Native (Tauri) front-end on top of the same engine. Demonstrates the same code path used in apps with end users.

05

Repo issue #46 — Suspension announcement (Oct 2023)

The author's own statement about pausing open-source development to build a commercial product. Useful context for why this repo is frozen as of late 2023.

06

DCFR (this site) · CFR+ · Vanilla CFR

Toy-scale walkthroughs of the algorithm family this solver lives in. Start here if "discounted regret matching" doesn't yet read as English.