Methodology & Transparency - Strategy Arena
Public research

Methodology & Transparency

How Strategy Arena's ML and statistical layers actually work. We measured every Brier. We fixed every leak. Here's the real architecture.

Anti-marketing: if a layer is analytics, we call it analytics. If it is rules-based, we call it rules-based. If it is ML, we publish the real measured Brier.
New: Strategy Hospital publishes live strategy triage: healthy, drift, bug suspect, idle, or deprecated.

The 5 monsters: what they actually are

| Monster | Architecture | Real metric | Status |
|---|---|---|---|
| Invictus ML Ultimate | LightGBM with isotonic calibration, OOF validation and monotonic constraints | Brier OOS expected ~0.22 (calibrated) | Real ML. Audited 8/10 by DeepSeek |
| Chimera Scanner + CNN | 17 statistical patterns + PyTorch CNN, 108 OHLCV/pattern channels | Brier OOS 0.2512 (9,356 samples) | Hybrid: rules + real ML |
| Leviathan 9-Layer Ensemble | 8 heuristic layers + 1 PyTorch MLP as Layer 9 | Brier OOS 0.2589 (10,758 samples, post-leak-fix) | Hybrid: heuristics + real ML |
| Hydra ML V5 + LSTM | XGBoost ranking for PnL + PyTorch LSTM for direction | Brier OOS 0.2480 (51,718 samples) | Real dual ML |
| Meta Intelligence v3 | Strategy analytics: bootstrap CI, Bonferroni multi-compare, performance snapshots | No prediction | Analytics engine. Honest dashboard |
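Brier > 0.25 = barely usable. Lower is better, and 0.25 is exactly what a constant 50% forecast scores on a binary outcome. For 5-minute crypto direction prediction, that level is close to the practical ceiling on skill. Our 3 measured ML models (Chimera 0.2512, Leviathan 0.2589, Hydra 0.2480) sit right around it: publishable, but not magic.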
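To make the 0.25 reference point concrete, here is a minimal sketch of the Brier score for binary outcomes. The function name `brier_score` and the toy data are illustrative assumptions, not Strategy Arena's code.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Toy, balanced up/down outcomes (illustrative data only).
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]

# A constant 50% forecast always scores 0.25, whatever the outcomes are.
coin_flip = brier_score([0.5] * len(outcomes), outcomes)

# A constant, overconfident 70% "up" forecast scores worse on balanced data.
overconfident = brier_score([0.7] * len(outcomes), outcomes)

print(f"coin flip: {coin_flip:.4f}, overconfident: {overconfident:.4f}")
```

A model whose out-of-sample Brier is above 0.25 is therefore doing worse, in squared-error terms, than always answering 50%.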

The methodology we use to validate strategies

Monte Carlo CV: 30 random temporal splits, anchor between 20% and 70%.
Robustness gate: Sharpe_p5 > 0.5, where Sharpe_p5 is the 5th percentile of the Sharpe ratio across the 30 splits.
Trade count: n_trades_mean > 20 per OOS window, with at least 10 valid splits.
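The three gates above can be sketched as follows. This is a simplified illustration under stated assumptions, not the production validator: the input is a chronological list of per-trade returns, the "anchor" is taken to be the fraction of the history where the out-of-sample (OOS) window starts, a split counts as "valid" when its OOS Sharpe is defined, and the Sharpe is a plain per-trade mean/stdev with no annualisation. All function and parameter names are hypothetical.

```python
import random
import statistics


def percentile(values, q):
    """Linearly interpolated percentile, q in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def monte_carlo_gate(trade_returns, n_splits=30, anchor_range=(0.20, 0.70),
                     min_sharpe_p5=0.5, min_trades_mean=20,
                     min_valid_splits=10, seed=0):
    rng = random.Random(seed)
    sharpes, counts = [], []
    for _ in range(n_splits):
        # Assumption: the anchor is where the OOS window begins.
        anchor = rng.uniform(*anchor_range)
        oos = trade_returns[int(len(trade_returns) * anchor):]
        # Assumption: a split is valid only if its Sharpe is defined.
        if len(oos) < 2:
            continue
        sd = statistics.stdev(oos)
        if sd == 0:
            continue
        sharpes.append(statistics.mean(oos) / sd)
        counts.append(len(oos))

    result = {"valid_splits": len(sharpes), "sharpe_p5": None,
              "n_trades_mean": None, "passed": False}
    if len(sharpes) < min_valid_splits:
        return result
    result["sharpe_p5"] = percentile(sharpes, 5)
    result["n_trades_mean"] = statistics.mean(counts)
    result["passed"] = (result["sharpe_p5"] > min_sharpe_p5
                        and result["n_trades_mean"] > min_trades_mean)
    return result


# Toy return series (illustrative data only).
steady = monte_carlo_gate([0.02, 0.01] * 100)
noise = monte_carlo_gate([0.01, -0.01] * 100)
print(steady["passed"], noise["passed"])
```

Gating on the 5th percentile rather than the mean means a strategy passes only if nearly all of the random splits look acceptable, not just the average one. A fuller version would also refit the strategy on the in-sample part of each split, which this sketch omits.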

Strategies validated by Monte Carlo

| Strategy | Validated assets | Best Sharpe_p5 | Rejected on |
|---|---|---|---|
| Smart Money Evolved | BTC, ETH, SOL, BNB | 1.22 (BTC) | - |
| Mean Rev Pro Evolved | NEAR, SNX, CHZ, TIA | 1.189 (SNX) | TRB |
| Capitulation Rebound Evolved | BTC, SOL, BNB, NEAR, SNX, CHZ, TIA | 1.526 (SNX) | - |
| Deep Freeze Evolved | SNX, CHZ | 0.884 (CHZ) | BTC, ETH, SOL, BNB, NEAR, TIA, AVAX |
| Sly Fox Evolved | BNB | 0.599 | 8 others |
| Deep Shadow Evolved | BTC | 0.851 | 8 others |
| Wyckoff Evolved | none | - | PUMP, INJ, COMP, FLOKI |
| Darvas | none | - | BTC, ETH, SOL, BNB, TRB |
MC validations are now tracked live, cell by cell, to measure drift between theoretical Sharpe_p5 and real performance.

Data leaks we fixed

chimera_ml.py

Target leakage: avg_pnl was used both as a feature and as the label source. The feature was deleted on 2026-05-15.
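A guard against this class of bug can be as simple as refusing to train when a label-source column also appears in the feature list. The sketch below is a generic illustration, not the actual chimera_ml.py code. The function name and every column name other than `avg_pnl` are hypothetical.

```python
def assert_no_target_leak(feature_columns, label_source_columns):
    """Raise if any column used to build the label is also fed in as a feature."""
    leaked = sorted(set(feature_columns) & set(label_source_columns))
    if leaked:
        raise ValueError(f"label-source columns used as features: {leaked}")


# Hypothetical feature list after the fix: avg_pnl is no longer a feature.
features = ["rsi_14", "volume_z", "pattern_score"]
assert_no_target_leak(features, label_source_columns=["avg_pnl"])

# The pre-fix situation would be rejected before any training starts.
try:
    assert_no_target_leak(features + ["avg_pnl"], ["avg_pnl"])
except ValueError as err:
    print(err)
```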

leviathan_data_merger.py

3 look-ahead bugs: future news, regime using current bar, future one-hot. Fixed on 2026-05-15.

Honest consequence: Leviathan NN's Brier moved from 0.244 with leakage to 0.2589 without leakage. We publish the real number.
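The "regime using current bar" bug is a typical look-ahead pattern: a rolling feature whose window includes the bar being predicted. The sketch below is a generic illustration, not the leviathan_data_merger.py code. It assumes the label for row t depends on the close of bar t, and it uses a rolling mean as a stand-in for a regime feature. Both function names are hypothetical.

```python
def leaky_rolling_mean(closes, window):
    """BUGGY on purpose: the window ends at bar t, so it reads closes[t]."""
    return [
        sum(closes[t - window + 1:t + 1]) / window if t >= window - 1 else None
        for t in range(len(closes))
    ]


def lagged_rolling_mean(closes, window):
    """Look-ahead-safe: the window ends at bar t-1 and never reads closes[t]."""
    return [
        sum(closes[t - window:t]) / window if t >= window else None
        for t in range(len(closes))
    ]


# Toy closes (illustrative data only); `shocked` changes only the last bar.
closes = [100.0, 101.0, 102.0, 103.0, 104.0]
shocked = closes[:-1] + [150.0]

# The safe feature at the last row is unaffected by the last bar's close.
safe_same = lagged_rolling_mean(closes, 3)[-1] == lagged_rolling_mean(shocked, 3)[-1]

# The leaky feature moves with the very value the model is asked to predict.
leak_differs = leaky_rolling_mean(closes, 3)[-1] != leaky_rolling_mean(shocked, 3)[-1]

print(safe_same, leak_differs)
```

Perturbing the current bar and checking that the feature at that row does not move is a cheap regression test for this kind of leak.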

Why some "AI strategies" are not real ML

What we are not claiming

We are not claiming to reliably predict crypto direction.
We are not claiming Brier < 0.20. That would be suspicious for 5-minute direction prediction.
We are not claiming Sharpe ratios above the 1-3 range without long validation.
We are not claiming a single magic unified AI brain.
What we claim: a transparent lab that measures everything, publicly fixes leaks, and refuses to publish as "edge" what does not survive strict Monte Carlo CV validation.