# Why you can trust pomata Most libraries ask you to trust their output. `pomata` proves it. The premise is simple and a little obsessive: **every function is written twice.** The shipped version is tuned to vectorize in Polars; a second, deliberately naive *oracle* re-derives the same published formula and shares no code with it. The two must agree — on a fixed series, on frozen golden-master numbers, and on thousands of randomly fuzzed inputs — or the build is red. ## What every function survives The same four-tier ladder runs over all three families, under **100% branch coverage**: ```{list-table} :header-rows: 1 :widths: 20 80 * - Tier - What it proves * - **Contract** - The output's shape, dtype (`Float64`), and length; that it is a lazy `pl.Expr` (eager and lazy agree to the bit); and that under `.over(...)` each group is computed independently, never spanning a boundary. * - **Edge** - The exact warm-up length, and that a `null` and a `NaN` propagate exactly as documented — across the boundaries: an empty series, a single row, an all-`null` column, a window longer than the data, an interior gap. * - **Correctness** - Bit-level agreement with the independent oracle on a fixed, realistic series, plus frozen, hand-checked golden-master values. * - **Properties** - The same oracle agreement *and* the mathematical invariants (bounds, scale behavior, monotonicity) over the full `[-1e6, 1e6]` fuzz domain, with missing data freely interleaved — driven by Hypothesis. ``` ## The precision pomata guarantees The headline is **ten significant figures**: every indicator reproduces its oracle to a relative **`1e-10`** on any finite input within a sane dynamic range. That number is chosen, not guessed — and enforced: - **It dwarfs the data.** Market feeds carry four to eight significant figures. Ten figures means `pomata` never adds error you could observe — lossless against its own input, with orders of headroom to spare. - **It sits far above the noise floor.** A `float64` holds ~16 figures; honest rounding between the streaming implementation and the two-pass oracle lands around `1e-15`. `1e-10` is five orders above that — tight enough to reject any real coding error, loose enough that a last-bit difference never flakes the suite. - **It is verified, not asserted.** The property tier holds every indicator to `1e-10` across the full random domain. In practice the agreement is *far* tighter: about **half the outputs match the oracle to the last bit** (a relative difference of exactly zero), and the rest land at the noise floor — typically thirteen to fifteen figures, never fewer than the promised ten. :::{admonition} What pomata does *not* claim :class: warning Not the absence of all bugs, and not correctness outside the documented input domain. One known limit: a rolling-sum statistic loses precision only when an entire window collapses onto the float-precision floor of a much larger value that recently passed through it — it takes a deliberately adversarial series (a price dropping seven orders of magnitude bar-to-bar) to build that pattern. It is documented on the affected indicators, and the oscillators whose bound is unconditional — Chande Momentum Oscillator, Money Flow Index, Chaikin Money Flow — are clamped to that bound rather than papered over. ::: ## The receipt: indicators vs the public reference For indicators there is also a public yardstick — **TA-Lib**, the de-facto industry C implementation, a genuinely different computation path. Here is `rsi(14)`, the last value of a deterministic 400-bar series, to every digit a `float64` holds: ```text pomata 85.20908701341023 reference 85.20908701341023 ← independent reimplementation: identical, to the last bit TA-Lib 85.20908701341024 ← fifteen figures identical; differs only at the float64 floor ``` The same five indicators on the same series — most reproduce the reference *exactly*, the rest sit at the noise floor: ```{list-table} :header-rows: 1 :widths: 28 38 17 17 * - indicator - pomata - vs reimpl. - vs TA-Lib * - `sma(20)` - `105.15146076264764` - exact - `1e-13` * - `ema(20)` - `107.7299930892346` - `1e-13` - `1e-14` * - `rsi(14)` - `85.20908701341023` - exact - `1e-14` * - `atr(14)` - `1.904174462198776` - `9e-16` - `4e-15` * - `macd(12,26,9)` - `2.523444380829531` - `1e-13` - `1e-14` ``` The differential tier (non-gating) compares most indicators against TA-Lib **bar-for-bar, from the first defined value**. A documented minority is compared only on the converged tail — always with a reason its warm-up differs from TA-Lib's (Wilder's first True Range, the independent MACD/Chaikin EMAs, the Ehlers Hilbert pipeline's long warm-up, the Parabolic SAR cold start) — never a steady-state disagreement. ## PnL and metrics: proven at the edges, not in the digits Their math is simple; their correctness lives where the bad things happen, not in the fifteenth digit. The same machine applies — independent oracle, four-tier ladder, golden masters — but the characteristic invariants change: - **PnL** — cost monotonicity (more cost, less PnL); the additive-vs-compounded split (`cumulative_pnl` sums currency P&L, `equity_curve` compounds returns); no look-ahead (every bar uses only past data); and a defined, tested behavior for **every** degenerate input — `null`, `NaN`, `0`, `±inf`, warm-up. - **Metrics** — the annualization identities; scale-equivariance (a Sharpe Ratio is invariant to leverage); closed-form checks (the Sharpe Ratio of constant returns, the drawdown of a monotone series); and a defined behavior for every degenerate input — a `null` is skipped, a non-null `NaN` poisons the result, a degenerate denominator is reported as `±inf`/`NaN`, never silently clipped. ## The tolerance ladder Every band is named and lives in one file (`tests/support/tolerances.py`), sized to the worst residual the statistic's conditioning predicts — not a round number: ```{list-table} :header-rows: 1 :widths: 32 18 50 * - Band - Value - Where it applies * - exact - `1e-12` - recursive / stateless kernels that match the oracle to a few ULP * - reference / property - `1e-10` - oracle agreement on fixed and fuzzed inputs — the enforced guarantee * - scale - `1e-6` - degree-1 / degree-2 homogeneity under rescaling * - streaming - `1e-3` - a streaming statistic evaluated at an extreme magnitude ``` ## Re-run it yourself None of this is something you have to take on faith. The full gate runs from a clean clone in one command; the published figures regenerate from `scripts/precision_table.py`, and the realized headroom under the guarantee (it lands around `1e-14`) from `scripts/calibrate_tolerances.py`. The complete method — the derivations, the test sizing, exactly what is and is not claimed — is in [`CORRECTNESS.md`](https://github.com/ilpomo/pomata/blob/main/CORRECTNESS.md).