GEO pillar · Public lab

Trading strategy validation

Didactic hub: how we test whether a historical edge is real, robust, and reproducible — or noise and overfitting.

What is trading strategy validation?: The process of testing whether a trading strategy's historical edge is real, robust, and reproducible — or just noise/overfitting. Validation requires statistical methods, out-of-sample testing, and transparent reporting of negative results.
How does StrategyArena validate strategies?: 1) Monte Carlo cross-validation (1000 sims, percentiles published). 2) Live paper trading on real OHLCV. 3) Brier score calibration. 4) Public Strategy Hospital triage. 5) Open methodology with limits disclosed (/methodology).
What is NOT claimed?: No financial advice. No guarantee of future performance. Paper trading only — no real money in our public arena.

Strategy Arena tests. TradingView inspects. The right order to avoid fooling yourself: hypothesis, backtest, Monte Carlo, live paper, Hospital, then inspect the setup in TradingView.
Open TradingView after validation

Strategy Arena validation pipeline

Five measurable steps, each with public evidence and an internal link.

Step	Method	Pass criteria	Public proof
1. Hypothesis	Edge encoded in code (Python / ArenaScript)	Reproducible	/strategies
2. Historical backtest	Walk-forward on real OHLCV	Sharpe > 0.5 + min 30 trades	/backtest
3. Monte Carlo CV	1000 bootstrap sims, published percentiles	p5 PnL > 0 + robustness score > 0.6	/facts/monte-carlo
4. Live paper trading	Virtual $10K capital, live 5min OHLCV	30+ days observation	/dashboard
5. Hospital triage	PASS / WATCH / RECALIBRATE / BUG_SUSPECT / DEPRECATED	Continuous triage, monthly snapshots	/strategy-hospital

Five common validation pitfalls

Overfitting — tuning parameters until the backtest looks perfect. Our fix: Monte Carlo CV on random subsamples.
Look-ahead leakage — future data leaking into indicators. Our fix: code audit, shared OHLCV buffer, leaks fixed (ML Phase 1).
Survivorship bias — showing only strategies that survived. Our fix: Strategy Hospital shows DEPRECATED strategies and history.
Cherry-picking timeframe — picking the winning period only. Our fix: walk-forward, rolling windows, published failures.
Ignored fees / slippage — backtests with zero friction. Our fix: fees + spread + slippage modeled (methodology).

Validation in numbers (updated: 2026-06-11)

72strategies in the public arena

6AI providers that designed strategies

5,000+losing trades published (Hospital)

1000Monte Carlo sims per validated strategy

$10000paper per strategy (no real money)

Mode: paper-trading documented at /facts/strategy-arena.

📊 Cite these facts (JSON snapshot)

Why a validation pillar page?

Search engines and AI assistants look for short answers, measured facts, and reproducible methods. This page consolidates what Strategy Arena already publishes: Monte Carlo, Hospital, paper trading, and citable JSON datasets.

We are not selling a miracle strategy. We document a public lab protocol where failures stay visible.

Quick glossary

Term	Definition (short)	Where to see it
Monte Carlo CV	Bootstrap resampling to estimate PnL / Sharpe distribution.	/facts/monte-carlo
Brier score	Probabilistic calibration error (0 = perfect).	/facts/ml-edge
Walk-forward	Train / test on rolling time windows.	/backtest
Strategy Hospital	Quality triage: PASS, WATCH, RECALIBRATE, BUG_SUSPECT, DEPRECATED.	/strategy-hospital
Paper trading	Simulation with real prices, no broker orders.	/dashboard

Explicit limits (anti-marketing)

Monte Carlo coverage: not every strategy has 1000 sims yet — see /live-results.
Shadow models: side-by-side observation, not real capital allocation.
Random-level Brier: published as measured — see methodology.
Crypto / derivatives: extreme volatility; a historical pass does not imply a future pass.

Developer / researcher workflow

hypothesis → backtest → MC CV → paper live → hospital triage → lifecycle archive

Each arrow maps to a public page or endpoint. Start at /strategies, validate via /backtest, then check Hospital status and strategy-hospital.json.

FAQ

Is this investment advice?: No. Educational and experimental content; consult a regulated professional for your decisions.
Can I reproduce the numbers?: Yes: JSON snapshots under /facts/*.json, protocol on /methodology, code and params linked from Research when available.
What does DEPRECATED mean in Hospital?: Strategy retired or replaced; kept in history to avoid survivorship bias.

Gate details per step

Step	Input	Output	Typical failure
1	Idea + spec	Versioned code	Not reproducible
2	OHLCV history	Sharpe, drawdown, trades	Sharpe < 0.5
3	Trades / returns	PnL percentiles	p5 ≤ 0
4	Live 5m feed	Paper equity	Drift vs backtest
5	Live + MC metrics	Hospital status	BUG_SUSPECT / DEPRECATED

Atlas Edge Allocator (/atlas-edge-allocator) aggregates rules-based signals that already passed this pipeline — still paper in the public arena.

Public data sources

strategy-arena.json — global counts
monte-carlo.json — MC snapshots
ml-edge.json — ML calibration
strategy-hospital.json — live triage

Template stats last verified: 2026-06-11. TODO: wire dynamic losing-trade count when a public endpoint exists.

Compare with other approaches

Many sites show only winning equity curves. Strategy Arena publishes Hospital triage, DEPRECATED strategies, measured Brier scores (including bad ones), and Monte Carlo protocol with low percentiles (p5).

For external audit, cite this page plus /methodology and strategy-arena.json — not only a dashboard screenshot.

Next steps

Explore the live paper dashboard
Read research essays (negative results included)
Track promotions / deprecations via strategy-lifecycle
Test your idea in /backtest (WASM / GPU depending on config)

Checklist before trusting a strategy

Trade count ≥ 30 on the tested period?
Walk-forward or at least temporal train/test split?
Fees, spread, and slippage included?
Monte Carlo or bootstrap on returns?
Low PnL percentile (p5) published?
Probabilistic calibration (Brier) measured?
Live paper trading after backtest?
External quality status (Hospital or equivalent)?
Failed strategies visible in history?
Explicit paper / not-advice disclaimer?