04. Data Plane
The deterministic hot path: lorenz-amm constant-product math, lorenz-graph negative-cycle detection, lorenz-dex pool model and account decoders, and lorenz-stream transport-agnostic ingest.
The data plane is the deterministic hot path:
ingest → detect → size → build → submit → execute. This chapter covers the four
host crates that implement the analysis half of it: lorenz-amm (pricing),
lorenz-graph (detection), lorenz-dex (pool model + decoders), and
lorenz-stream (ingest abstraction). Building/submitting the transaction is
roadmap; executing is the on-chain program (chapter 06).
A recurring theme: the graph finds candidates using f64 edge rates; exact,
size-aware profit is always recomputed with integer math before anything is
acted on. The detector narrows the search space; it does not certify profit.
lorenz-amm: constant-product math ✅
The smallest fully-real, fully-tested unit of the system. Given reserves and a fee it returns exactly how much comes out of a swap, in integer arithmetic that matches what the chain computes at settlement.
CpmmReserves
CpmmReserves { reserve_in: u128, reserve_out: u128, fee: Bps }
One direction of a constant-product (x·y=k) pool.
amount_out(amount_in: u128) -> Option<u128>
Uniswap-v2-style formula with the fee applied to the input:
in_after_fee = amount_in * (10_000 - fee_bps) / 10_000
out = reserve_out * in_after_fee / (reserve_in + in_after_fee)
- All multiplications are checked_* against u128 overflow (returns None).
- Degenerate input (reserve_in == 0 || reserve_out == 0 || amount_in == 0) returns Some(0) rather than a misleading panic or None.
- Integer floor division means the realized output is exactly the on-chain value.
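The formula and edge cases above can be sketched in Rust as follows. This is a minimal illustration rather than the crate's source: the `Bps` newtype is assumed to wrap a `u16`, and returning `None` for a fee above 10_000 bps is an assumption of the sketch.

```rust
/// Hypothetical stand-in for the crate's `Bps` newtype (assumed here to wrap a u16).
#[derive(Clone, Copy, Debug)]
pub struct Bps(pub u16);

/// One direction of a constant-product pool.
#[derive(Clone, Copy, Debug)]
pub struct CpmmReserves {
    pub reserve_in: u128,
    pub reserve_out: u128,
    pub fee: Bps,
}

impl CpmmReserves {
    /// Swap output with the fee applied to the input, in checked integer math.
    pub fn amount_out(&self, amount_in: u128) -> Option<u128> {
        // Degenerate input yields zero output instead of a panic or None.
        if self.reserve_in == 0 || self.reserve_out == 0 || amount_in == 0 {
            return Some(0);
        }
        // Assumption: a fee above 100% is treated as an error (None).
        let keep_bps = 10_000u128.checked_sub(u128::from(self.fee.0))?;
        let in_after_fee = amount_in.checked_mul(keep_bps)? / 10_000;
        let numerator = self.reserve_out.checked_mul(in_after_fee)?;
        let denominator = self.reserve_in.checked_add(in_after_fee)?;
        // Floor division: the result never rounds in the trader's favour.
        Some(numerator / denominator)
    }
}
```

With equal reserves of 1_000_000 and zero fee, 1000 in gives 999 out, which matches the known-value test described in this section.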
Analytics helpers (f64, never settlement)
- spot_price() -> f64: marginal price reserve_out / reserve_in, ignoring fee and impact. Analytics only.
- effective_rate(amount_in) -> Option<f64>: out/in actually realized for a size, including fee + impact. This is what weights graph edges.
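A small sketch of the two helpers, written as free functions so the block stands alone. The function shapes and the choice to return `None` for a zero-size trade are assumptions of this sketch; the crate exposes these as methods on `CpmmReserves`.

```rust
/// Integer swap output with the fee applied to the input (same formula as amount_out).
fn amount_out(reserve_in: u128, reserve_out: u128, fee_bps: u128, amount_in: u128) -> Option<u128> {
    if reserve_in == 0 || reserve_out == 0 || amount_in == 0 {
        return Some(0);
    }
    let in_after_fee = amount_in.checked_mul(10_000u128.checked_sub(fee_bps)?)? / 10_000;
    Some(reserve_out.checked_mul(in_after_fee)? / reserve_in.checked_add(in_after_fee)?)
}

/// Marginal price, ignoring fee and price impact. Analytics only.
fn spot_price(reserve_in: u128, reserve_out: u128) -> f64 {
    reserve_out as f64 / reserve_in as f64
}

/// Realized out/in for a given size, including fee and impact.
/// Assumption: a zero-size trade has no meaningful rate, so it yields None.
fn effective_rate(reserve_in: u128, reserve_out: u128, fee_bps: u128, amount_in: u128) -> Option<f64> {
    if amount_in == 0 {
        return None;
    }
    let out = amount_out(reserve_in, reserve_out, fee_bps, amount_in)?;
    Some(out as f64 / amount_in as f64)
}
```

The effective rate falls below the spot price as the trade size grows, which is why edge weights depend on the probe size.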
Property tests (the invariants any correct AMM must obey)
proptest pins down, over wide random inputs:
| Property | What it guarantees |
|---|---|
| output_bounded_by_reserve | Output can never exceed the output reserve (out < r_out). |
| monotonic_in_input | More input never yields strictly less output. |
| no_free_money_single_hop | With symmetric reserves, out ≤ in for any positive fee/impact: no free money from one hop. |
Plus known-value tests (1000 in on equal 1_000_000 reserves, zero fee →
999), a fee-reduces-output test, and an empty-reserves test.
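The three properties in the table can be exercised without any dependency, as in the sketch below. It is not the crate's test suite: the crate uses proptest, while this version substitutes a hand-rolled xorshift generator (hypothetical `XorShift` type) so it runs on the standard library alone, and the input ranges are chosen arbitrarily to stay far from overflow.

```rust
/// Integer swap output with the fee applied to the input.
fn amount_out(reserve_in: u128, reserve_out: u128, fee_bps: u128, amount_in: u128) -> Option<u128> {
    if reserve_in == 0 || reserve_out == 0 || amount_in == 0 {
        return Some(0);
    }
    let in_after_fee = amount_in.checked_mul(10_000u128.checked_sub(fee_bps)?)? / 10_000;
    Some(reserve_out.checked_mul(in_after_fee)? / reserve_in.checked_add(in_after_fee)?)
}

/// Tiny deterministic generator; a std-only stand-in for proptest's input generation.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Checks the three invariants on `cases` random inputs; panics on the first violation.
fn check_properties(cases: u32) -> u32 {
    let mut rng = XorShift(0x9E37_79B9_7F4A_7C15);
    for _ in 0..cases {
        let r_in = u128::from(rng.next_u64() % 1_000_000_000_000) + 1;
        let r_out = u128::from(rng.next_u64() % 1_000_000_000_000) + 1;
        let fee = u128::from(rng.next_u64() % 1_000); // 0 to 9.99%
        let x = u128::from(rng.next_u64() % 1_000_000_000_000) + 1;

        let out = amount_out(r_in, r_out, fee, x).expect("no overflow at these magnitudes");
        // output_bounded_by_reserve
        assert!(out < r_out);
        // monotonic_in_input
        let out_more = amount_out(r_in, r_out, fee, x + 1).expect("no overflow");
        assert!(out_more >= out);
        // no_free_money_single_hop (symmetric reserves)
        let sym = amount_out(r_in, r_in, fee, x).expect("no overflow");
        assert!(sym <= x);
    }
    cases
}
```

A fixed seed keeps the run reproducible; proptest adds shrinking and wider input strategies on top of this basic idea.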
This crate is invariant O1 (integer settlement math) in Invariants.
lorenz-graph: negative-cycle arbitrage detection ✅
Models the market as a directed graph: nodes are tokens, an edge A → B means
"you can swap A into B" at effective rate r (out per in). Chaining swaps
multiplies rates, so a cycle is profitable exactly when the product of its rates
exceeds 1.
The transform that makes it tractable
Take -ln(r) as the edge weight. The product condition becomes additive: a
profitable cycle is a cycle whose summed weights are negative. That is the
classic Bellman-Ford negative-cycle problem.
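As a small worked illustration of the transform, the hypothetical helper below sums -ln(r) over a list of edge rates. A negative sum corresponds to a rate product above 1, and exponentiating the negated sum recovers the product.

```rust
/// Sum of -ln(r) over a path's edge rates.
/// Negative result <=> product of the rates is greater than 1.
fn cycle_weight(rates: &[f64]) -> f64 {
    rates.iter().map(|r| -r.ln()).sum()
}

/// Recovers the rate product from the summed weight: exp(-sum) = product of rates.
fn product_from_weight(weight: f64) -> f64 {
    (-weight).exp()
}
```

For rates 1, 1 and 1.1 the weight is -ln(1.1), roughly -0.0953, so the cycle is negative and therefore gross-profitable.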
Types
| Type | Fields |
|---|---|
| Edge | from: TokenId, to: TokenId, pool: PoolId, rate: f64 |
| Cycle | edges: Vec<Edge> (traversal order, closed loop), product: f64 (>1.0 ⇒ gross-profitable) |
| ArbitrageGraph | interned tokens, index: HashMap<TokenId,usize>, edges |
add_edge
Interns the endpoints and pushes the edge, but ignores edges with
non-positive or non-finite rate (rate <= 0.0 || !rate.is_finite()), since
those represent "no path" and would break the -ln(r) transform.
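A minimal sketch of interning plus the rate filter. `TokenId` and `PoolId` are modelled as plain `String` aliases here, which is an assumption; the sketch also rejects a bad edge before interning its endpoints, which the text above does not specify.

```rust
use std::collections::HashMap;

// Assumption: string-like ids. The crate's real newtypes may differ.
type TokenId = String;
type PoolId = String;

#[derive(Clone, Debug)]
struct Edge {
    from: TokenId,
    to: TokenId,
    pool: PoolId,
    rate: f64,
}

#[derive(Default)]
struct ArbitrageGraph {
    tokens: Vec<TokenId>,
    index: HashMap<TokenId, usize>,
    edges: Vec<Edge>,
}

impl ArbitrageGraph {
    /// Returns the dense index for a token, assigning a new one on first sight.
    fn intern(&mut self, token: &TokenId) -> usize {
        if let Some(&i) = self.index.get(token) {
            return i;
        }
        let i = self.tokens.len();
        self.tokens.push(token.clone());
        self.index.insert(token.clone(), i);
        i
    }

    /// Adds an edge unless its rate would break the -ln(r) transform.
    fn add_edge(&mut self, edge: Edge) {
        // NaN and infinities fail is_finite; zero and negatives fail the first test.
        if edge.rate <= 0.0 || !edge.rate.is_finite() {
            return;
        }
        self.intern(&edge.from);
        self.intern(&edge.to);
        self.edges.push(edge);
    }
}
```

Dense indices let the detector keep distances and predecessors in flat vectors instead of hash maps.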
find_arbitrage() -> Option<Cycle>
- Compute weights -rate.ln() for every edge.
- Initialize all distances to 0.0. This is equivalent to attaching a virtual source with zero-weight edges to every node, so a negative cycle anywhere in the graph is detectable regardless of connectivity.
- Relax all edges n times (n = token count), tracking pred_edge[v] and the last node updated. An epsilon of 1e-12 guards the float comparison. If a round relaxes nothing, the graph has converged with no negative cycle → early return None.
- A node still relaxing after n rounds sits on a negative cycle or is reachable from one. Walk predecessors n times to step into the cycle proper.
- Reconstruct the cycle by following predecessor edges back to the start, with a safety valve (len > n+1 ⇒ None) against pathological reconstruction.
Returns the cycle with its rate product.
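The steps above can be sketched over index-based edges as follows. This is an illustration of the described procedure, not the crate's code: `IdxEdge` is a hypothetical simplification that drops token and pool ids, the return type is a tuple instead of `Cycle`, and the sketch assumes every rate is positive and finite and every endpoint is below `n`.

```rust
/// Index-based edge; a simplified stand-in for the crate's `Edge`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct IdxEdge {
    pub from: usize,
    pub to: usize,
    pub rate: f64,
}

/// Returns one negative cycle (edges in traversal order) and its rate product.
pub fn find_arbitrage(n: usize, edges: &[IdxEdge]) -> Option<(Vec<IdxEdge>, f64)> {
    const EPS: f64 = 1e-12;
    if n == 0 {
        return None;
    }
    // Step 1: additive weights.
    let weights: Vec<f64> = edges.iter().map(|e| -e.rate.ln()).collect();
    // Step 2: all-zero distances act as a virtual source connected to every node.
    let mut dist = vec![0.0_f64; n];
    let mut pred_edge: Vec<Option<usize>> = vec![None; n];
    let mut last_updated = None;

    // Step 3: n rounds of relaxation with early exit on convergence.
    for _ in 0..n {
        last_updated = None;
        for (i, e) in edges.iter().enumerate() {
            let candidate = dist[e.from] + weights[i];
            if candidate < dist[e.to] - EPS {
                dist[e.to] = candidate;
                pred_edge[e.to] = Some(i);
                last_updated = Some(e.to);
            }
        }
        if last_updated.is_none() {
            return None;
        }
    }

    // Step 4: walk back n predecessors to land on the cycle itself.
    let mut node = last_updated?;
    for _ in 0..n {
        node = edges[pred_edge[node]?].from;
    }

    // Step 5: collect edges backwards until the walk returns to its start.
    let start = node;
    let mut cycle = Vec::new();
    loop {
        let e = edges[pred_edge[node]?];
        cycle.push(e);
        node = e.from;
        if node == start {
            break;
        }
        if cycle.len() > n + 1 {
            return None; // safety valve
        }
    }
    cycle.reverse();
    let product: f64 = cycle.iter().map(|e| e.rate).product();
    Some((cycle, product))
}
```

Using `?` on every predecessor lookup means a float-induced inconsistency ends in `None` rather than a panic, in the same spirit as the safety valve.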
Honest caveat (stated in the crate)
Edge rates are taken at a single quote size, so this finds candidate
cycles. Exact, size-aware profit is recomputed by the executor/backtester against
real pool math (lorenz-amm) before anything is acted on. This module narrows the
search space; it does not certify profit.
Tests cover: balanced rates (product 1) ⇒ none; a triangular arb
(1·1·1.1 = 1.1) found and reconstructed as a closed loop; non-positive rates
ignored; empty graph ⇒ none.
lorenz-dex: pool model, quoting, decoders ✅ / 🔭
This crate is the convergence point of the data plane. It does three things:
- Defines the Pool model and the Quoter trait.
- Prices both constant-product (CpmmPool) and single-tick concentrated-liquidity (ClmmPool) pools.
- Decodes raw on-chain account bytes into priceable pool snapshots (decoder module).
The Quoter trait and CpmmPool
trait Quoter {
fn quote(&self, token_in: &TokenId, amount_in: u128) -> Option<u128>;
fn dex(&self) -> Dex;
fn pool_id(&self) -> &PoolId;
}
CpmmPool { id, dex, token_a, token_b, reserve_a, reserve_b, fee } orients its
reserves for whichever side is being spent (reserves_for) and delegates to
lorenz-amm. Its edges(probe_size) -> Vec<Edge> emits both directed edges
(A→B and B→A) at a probe size, so the detector can consider swapping either way.
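A rough sketch of reserve orientation and two-way edge emission. Ids are plain strings, the fee is a bare `u128` in bps, the `dex` field is omitted, and the pricing formula is inlined instead of delegating to lorenz-amm; all of these are simplifications assumed for the example.

```rust
#[derive(Clone, Debug)]
struct Edge {
    from: String,
    to: String,
    pool: String,
    rate: f64,
}

struct CpmmPool {
    id: String,
    token_a: String,
    token_b: String,
    reserve_a: u128,
    reserve_b: u128,
    fee_bps: u128,
}

impl CpmmPool {
    /// (reserve_in, reserve_out) for the token being spent; None if the token is not in the pool.
    fn reserves_for(&self, token_in: &str) -> Option<(u128, u128)> {
        if token_in == self.token_a {
            Some((self.reserve_a, self.reserve_b))
        } else if token_in == self.token_b {
            Some((self.reserve_b, self.reserve_a))
        } else {
            None
        }
    }

    /// Integer quote; the real crate delegates this step to lorenz-amm.
    fn quote(&self, token_in: &str, amount_in: u128) -> Option<u128> {
        let (r_in, r_out) = self.reserves_for(token_in)?;
        if r_in == 0 || r_out == 0 || amount_in == 0 {
            return Some(0);
        }
        let in_after_fee = amount_in.checked_mul(10_000u128.checked_sub(self.fee_bps)?)? / 10_000;
        Some(r_out.checked_mul(in_after_fee)? / r_in.checked_add(in_after_fee)?)
    }

    /// Both directed edges at the probe size, weighted by the effective rate.
    fn edges(&self, probe_size: u128) -> Vec<Edge> {
        let mut out = Vec::new();
        if probe_size == 0 {
            return out;
        }
        for (from, to) in [(&self.token_a, &self.token_b), (&self.token_b, &self.token_a)] {
            if let Some(q) = self.quote(from, probe_size) {
                out.push(Edge {
                    from: from.clone(),
                    to: to.clone(),
                    pool: self.id.clone(),
                    rate: q as f64 / probe_size as f64,
                });
            }
        }
        out
    }
}
```

Because of fee and impact, the two rates multiply to less than 1, so a single pool can never form a profitable two-edge cycle with itself.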
ClmmPool: single-tick concentrated liquidity ✅ (f64)
A CLMM pool stores a current sqrt_price (Q64.64 fixed point) and active
liquidity, not vault reserves. Within the current tick the math is exact and
simple. Let s = sqrt_price, L = liquidity:
token0 (A) in, dx: 1/s' = 1/s + dx/L ; out_B = L * (s - s') (price down)
token1 (B) in, dy: s' = s + dy/L ; out_A = L * (1/s - 1/s') (price up)
with the fee applied to the input first (dx = amount_in * (1 - fee_fraction)).
Q64 = 2^64 converts the fixed-point sqrt_price_x64 to a real.
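The two formulas translate almost directly into f64 code. The sketch below assumes a hypothetical function shape (`clmm_quote` with a boolean direction flag and the fee passed as a plain fraction); it covers only the single-tick case and returns an estimate, not a settlement value.

```rust
/// 2^64, used to convert a Q64.64 fixed-point sqrt price to a real number.
const Q64: f64 = (1u128 << 64) as f64;

/// Single-tick output estimate in f64.
/// `a_in` is true when spending token0 (A); `fee_fraction` is e.g. 0.003 for 30 bps.
fn clmm_quote(
    sqrt_price_x64: u128,
    liquidity: u128,
    fee_fraction: f64,
    amount_in: u128,
    a_in: bool,
) -> Option<f64> {
    let s = sqrt_price_x64 as f64 / Q64;
    let l = liquidity as f64;
    if s <= 0.0 || l <= 0.0 || amount_in == 0 {
        return None;
    }
    // Fee is taken from the input before the swap math.
    let d = amount_in as f64 * (1.0 - fee_fraction);
    let out = if a_in {
        // token0 in: 1/s' = 1/s + dx/L, price moves down.
        let s_next = 1.0 / (1.0 / s + d / l);
        l * (s - s_next)
    } else {
        // token1 in: s' = s + dy/L, price moves up.
        let s_next = s + d / l;
        l * (1.0 / s - 1.0 / s_next)
    };
    (out.is_finite() && out >= 0.0).then_some(out)
}
```

At s = 1 and L = 1_000_000 the pool behaves like a constant-product pool with reserves of 1_000_000 on each side, so a zero-fee input of 1000 yields about 999.001 in either direction.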
Honest scope: this is the single-tick swap, accurate only while the trade
does not cross into a neighbouring initialized tick (where L changes). That is
exactly the right tool for the detector (it only needs effective rates to surface
candidates). The math here uses f64; full Q64.64 integer math with tick
crossing is roadmap. The executor/backtester recompute exact output before
acting.
The unifying Pool enum
enum Pool { Cpmm(CpmmPool), Clmm(ClmmPool) }
Pool forwards edges, quote, id, dex to the inner variant, so the
streaming and detection pipeline treats both kinds uniformly.
The decoder module: raw bytes → pool snapshot
This is where the deterministic core meets untrusted external data, behind traits so the live engine, replay, and tests share one definition of "what a pool is".
How reserves actually work (CPMM): the pool account does not store reserves. It stores the two vault token-account addresses; the reserves are the SPL balances in those vaults. Decoding is therefore two steps:
- PoolAccountDecoder::decode_accounts(bytes) -> PoolAccounts reads the pool account (mints, vaults, fee).
- The caller fetches the two vault token accounts, reads each balance with spl::token_account_amount, and calls PoolAccounts::assemble(id, reserve_a, reserve_b) -> CpmmPool.
Low-level readers are all bounds-checked (read_pubkey, read_u16, read_u128)
and return DecodeError::TooShort { expected, got } on truncation.
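A sketch of what bounds-checked readers of this kind can look like. The function names and the `TooShort` variant come from the text above; the exact signatures, the private `take` helper, and the `RawPubkey` alias as a 32-byte array are assumptions. Values are read little-endian, as Solana account data is laid out.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    TooShort { expected: usize, got: usize },
}

/// Assumed representation of a raw 32-byte key.
pub type RawPubkey = [u8; 32];

/// Hypothetical helper: returns `len` bytes at `offset`, or TooShort with the length required.
fn take(data: &[u8], offset: usize, len: usize) -> Result<&[u8], DecodeError> {
    let end = offset.checked_add(len).ok_or(DecodeError::TooShort {
        expected: usize::MAX,
        got: data.len(),
    })?;
    data.get(offset..end).ok_or(DecodeError::TooShort {
        expected: end,
        got: data.len(),
    })
}

pub fn read_u16(data: &[u8], offset: usize) -> Result<u16, DecodeError> {
    let mut buf = [0u8; 2];
    buf.copy_from_slice(take(data, offset, 2)?);
    Ok(u16::from_le_bytes(buf))
}

pub fn read_u128(data: &[u8], offset: usize) -> Result<u128, DecodeError> {
    let mut buf = [0u8; 16];
    buf.copy_from_slice(take(data, offset, 16)?);
    Ok(u128::from_le_bytes(buf))
}

pub fn read_pubkey(data: &[u8], offset: usize) -> Result<RawPubkey, DecodeError> {
    let mut buf = [0u8; 32];
    buf.copy_from_slice(take(data, offset, 32)?);
    Ok(buf)
}
```

Routing every read through one slicing helper keeps the truncation check in a single place, so no decoder can index past the end of untrusted input.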
Implemented decoders (real offsets, fixture-tested) ✅
| Decoder | Kind | Offsets | Fee |
|---|---|---|---|
| spl::token_account_amount | SPL token account | 165-byte account, amount u64 at offset 64, mint at 0 | n/a |
| RaydiumAmmV4Decoder | CPMM | coin_vault 336, pc_vault 368, coin_mint 400, pc_mint 432 | default 25 bps |
| RaydiumCpmmDecoder | CPMM | token0_vault 72, token1_vault 104, token0_mint 168, token1_mint 200 | default 25 bps (real fee lives in the separate amm_config account; caller refines) |
| WhirlpoolDecoder | CLMM | (incl. 8-byte Anchor discriminator) fee_rate 45 (u16), liquidity 49 (u128), sqrt_price 65 (u128), mint_a 101, mint_b 181; len 653 | fee_rate / 100 (Whirlpool fee is hundredths of a bp → bps) |
PoolAccounts carries { dex, mint_a, mint_b, vault_a, vault_b, fee };
ClmmPoolAccounts carries { dex, mint_a, mint_b, sqrt_price_x64, liquidity, fee }
and assembles directly (no vault fetch needed).
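To make the Whirlpool row of the table concrete, here is a sketch that reads the listed offsets from a 653-byte account. `WhirlpoolFields`, `decode_whirlpool`, and `array_at` are hypothetical names for this example; the real decoder returns `ClmmPoolAccounts` and may also validate the Anchor discriminator, which this sketch skips.

```rust
#[derive(Debug, PartialEq, Eq)]
enum DecodeError {
    TooShort { expected: usize, got: usize },
}

/// Account length from the table, including the 8-byte Anchor discriminator.
const WHIRLPOOL_LEN: usize = 653;

/// Hypothetical result type; a simplified stand-in for ClmmPoolAccounts.
#[derive(Debug)]
struct WhirlpoolFields {
    fee_bps: u16,
    liquidity: u128,
    sqrt_price_x64: u128,
    mint_a: [u8; 32],
    mint_b: [u8; 32],
}

/// Copies N bytes at `offset`. Callers must have checked the total length first.
fn array_at<const N: usize>(data: &[u8], offset: usize) -> [u8; N] {
    let mut out = [0u8; N];
    out.copy_from_slice(&data[offset..offset + N]);
    out
}

fn decode_whirlpool(data: &[u8]) -> Result<WhirlpoolFields, DecodeError> {
    // One upfront length check makes every fixed-offset read below in-bounds.
    if data.len() < WHIRLPOOL_LEN {
        return Err(DecodeError::TooShort { expected: WHIRLPOOL_LEN, got: data.len() });
    }
    // fee_rate is in hundredths of a basis point; divide by 100 to get bps.
    let fee_rate = u16::from_le_bytes(array_at::<2>(data, 45));
    Ok(WhirlpoolFields {
        fee_bps: fee_rate / 100,
        liquidity: u128::from_le_bytes(array_at::<16>(data, 49)),
        sqrt_price_x64: u128::from_le_bytes(array_at::<16>(data, 65)),
        mint_a: array_at::<32>(data, 101),
        mint_b: array_at::<32>(data, 181),
    })
}
```

A `fee_rate` of 3000 decodes to 30 bps. No vault fetch is involved, since the price and liquidity live in the pool account itself.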
Roadmap decoders (explicit NotImplemented) 🔭
Generated by the roadmap_decoder! / roadmap_clmm_decoder! macros, real types
that openly report DecodeError::NotImplemented(dex) rather than misrepresenting
non-CPMM math as constant-product:
- CPMM-shaped seams:
- CPMM-shaped seams: MeteoraDammDecoder, PumpAmmDecoder, SolfiDecoder, VertigoDecoder.
- CLMM seams: RaydiumClmmDecoder, MeteoraDlmmDecoder (need sqrt-price + tick arrays / bin layouts).
b58(&RawPubkey) -> String base58-encodes a raw 32-byte key into a TokenId
string, keeping the crate dependency-light (no Solana SDK).
lorenz-stream: transport-agnostic ingest ✅ / 🟡
Turns a sequence of pool-state updates into detected candidates, independently of where the bytes come from.
Core abstraction
struct PoolSnapshot { slot: u64, pools: Vec<Pool> } // all watched pools at a slot
trait PoolSnapshotSource {
fn next_snapshot(&mut self) -> Option<PoolSnapshot>;
}
Naming note: this PoolSnapshot (runtime: a slot + many pools) is distinct
from lorenz-backtest::PoolSnapshot (a single serializable pool row). See chapter
07.
detect_stream: the shared analysis step
fn detect_stream<S: PoolSnapshotSource, F: FnMut(u64, Cycle)>(
source: S, probe_size: u128, on_cycle: F
) -> usize
For every snapshot it: builds a fresh ArbitrageGraph, adds pool.edges(probe_size)
for each pool, runs find_arbitrage, and invokes on_cycle(slot, cycle) on a
hit, returning the count detected. This is the exact analysis step the live
engine runs, just fed from an abstract source.
Two sources
- replay::ReplaySource ✅: a real, deterministic source over recorded snapshots (a VecDeque popped front-to-back). Used in tests and conceptually by the backtester; genuinely drives the detector end-to-end.
- geyser::GeyserSource 🟡: the production transport seam for a live Yellowstone/Geyser gRPC subscription. GeyserConfig { endpoint, x_token, commitment }; connect() returns StreamError::NotImplemented(...). The module documents the exact wiring (connect → subscribe to pool + vault accounts → decode with lorenz-dex → assemble → emit snapshot). Only the gRPC transport is environment-specific; the decode/assemble steps already exist and are tested.
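The source abstraction and the replay source can be sketched as below. To keep the block self-contained, the pool type is a generic parameter and the detection step is passed in as a closure; both are departures from the real `detect_stream` signature, which takes a `probe_size` and runs `ArbitrageGraph::find_arbitrage` internally.

```rust
use std::collections::VecDeque;

/// All watched pools at one slot. Generic over the pool type for this sketch only.
pub struct PoolSnapshot<P> {
    pub slot: u64,
    pub pools: Vec<P>,
}

pub trait PoolSnapshotSource<P> {
    fn next_snapshot(&mut self) -> Option<PoolSnapshot<P>>;
}

/// Deterministic source over recorded snapshots, popped front-to-back.
pub struct ReplaySource<P> {
    queue: VecDeque<PoolSnapshot<P>>,
}

impl<P> ReplaySource<P> {
    pub fn new(snapshots: Vec<PoolSnapshot<P>>) -> Self {
        Self { queue: snapshots.into() }
    }
}

impl<P> PoolSnapshotSource<P> for ReplaySource<P> {
    fn next_snapshot(&mut self) -> Option<PoolSnapshot<P>> {
        self.queue.pop_front()
    }
}

/// Drains the source, runs `detect` on each snapshot, and reports hits via `on_cycle`.
/// Assumption: `detect` stands in for "build graph, add edges, find_arbitrage".
pub fn detect_stream<P, C, S, D, F>(mut source: S, detect: D, mut on_cycle: F) -> usize
where
    S: PoolSnapshotSource<P>,
    D: Fn(&[P]) -> Option<C>,
    F: FnMut(u64, C),
{
    let mut found = 0;
    while let Some(snapshot) = source.next_snapshot() {
        if let Some(cycle) = detect(&snapshot.pools) {
            on_cycle(snapshot.slot, cycle);
            found += 1;
        }
    }
    found
}
```

Because the loop only depends on the trait, swapping the replay source for a live transport would not change the analysis step.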
StreamError has NotImplemented(&'static str) and Transport(String) variants.
The heavyweight yellowstone-grpc-client (and its solana-sdk transitive) is
deliberately kept out of the core workspace so the deterministic crates stay
light and fast to build; it belongs in a deployment build.
Continue to 05: Control Plane.