Deep Q-Network — deep reinforcement learning. Learns to trade by trial and error, like a video game player.
DQN (Deep Q-Network) is a reinforcement learning (RL) agent that learns to trade cryptocurrencies by interacting directly with the market environment, much as a video game player learns through trial and error. The DQN agent follows no predefined rules: it discovers its own strategy by maximizing a reward (risk-adjusted profit). It is the only ML arena model that learns a complete action policy (when to buy, when to sell, and how much) rather than just price direction.
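As a concrete illustration of a risk-adjusted reward, here is a minimal sketch of the incremental Sharpe-ratio reward described for this model. The function name, the unannualized Sharpe definition, and the epsilon guard are illustrative assumptions, not the model's actual implementation:

```python
import numpy as np

def incremental_sharpe_reward(returns_so_far, new_return, eps=1e-8):
    """Reward = change in the Sharpe ratio after appending the newest
    per-step return. Illustrative: unannualized, population std, eps
    guards against division by zero."""
    before = np.asarray(returns_so_far, dtype=float)
    after = np.append(before, new_return)

    def sharpe(r):
        # Undefined for fewer than two observations; treat as zero.
        return r.mean() / (r.std() + eps) if len(r) > 1 else 0.0

    return sharpe(after) - sharpe(before)
```

A step that improves the risk-adjusted return stream yields a positive reward; a profitable but volatility-increasing step can still be penalized, which is the point of rewarding Sharpe rather than raw PnL.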
The agent observes a market state of 20 features (price, RSI, MACD, volume, current position, running PnL) and chooses among 5 actions: buy 25%, buy 50%, hold, sell 25%, sell 50%. The reward is the incremental change in the Sharpe ratio. A neural network (3 layers: 256-128-64) learns to predict the Q-value (expected future value) of each action, trained from an experience replay buffer of 100K transitions with epsilon-greedy exploration (5%).
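The decision loop described above can be sketched in a few lines: epsilon-greedy action selection over the network's Q-values, plus a bounded replay buffer that evicts the oldest transitions. The constants mirror the figures quoted in the text; the function and variable names are illustrative:

```python
import random
from collections import deque
import numpy as np

STATE_DIM = 20          # price, RSI, MACD, volume, position, running PnL, ...
ACTIONS = ["buy_25", "buy_50", "hold", "sell_25", "sell_50"]
EPSILON = 0.05          # 5% of steps are random exploration
REPLAY_CAPACITY = 100_000

replay_buffer = deque(maxlen=REPLAY_CAPACITY)  # oldest transitions drop off

def select_action(q_values: np.ndarray, epsilon: float = EPSILON) -> int:
    """Epsilon-greedy: with prob. epsilon pick a random action,
    otherwise the action with the highest predicted Q-value."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    return int(np.argmax(q_values))

def store_transition(state, action, reward, next_state, done):
    """Append one (s, a, r, s', done) tuple; training later samples
    random minibatches from this buffer to de-correlate updates."""
    replay_buffer.append((state, action, reward, next_state, done))
```

Sampling minibatches from the replay buffer, rather than learning from consecutive market ticks, is what breaks the temporal correlation that would otherwise destabilize Q-learning.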
Q-values for each action (buy/sell/hold). Policy learned through market interaction. 20 state features (price, indicators, position). Experience replay (100K-transition memory). Epsilon-greedy exploration (5% random trades to discover new strategies).
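How the network learns those Q-values can be sketched via the standard Bellman target used in Q-learning: each predicted Q-value is regressed toward the observed reward plus the discounted best Q-value of the next state, with the bootstrap term zeroed at episode end. The discount factor and array shapes here are illustrative assumptions:

```python
import numpy as np

GAMMA = 0.99  # discount factor (illustrative choice)

def q_learning_targets(rewards, next_q_values, dones, gamma=GAMMA):
    """Bellman targets for a batch: r + gamma * max_a' Q(s', a').
    rewards: shape (B,); next_q_values: shape (B, n_actions);
    dones: shape (B,), 1.0 where the episode ended (no bootstrap)."""
    return rewards + gamma * (1.0 - dones) * next_q_values.max(axis=1)
```

The network is then trained to minimize the squared difference between its predicted Q-value for the taken action and this target, typically against a slowly updated copy of the network to keep the moving target stable.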
High
Learns its own optimal strategy (no human-imposed rules). Directly optimizes trading actions, not just price prediction. Adapts to changing market conditions. Can discover action patterns that humans miss.
RL training instability (convergence is not guaranteed). Occasionally erratic behavior (black box). Requires enormous amounts of interaction data. The non-stationary market environment complicates learning. Random exploration (5%) costs real money.
Explore all 74 trading strategies across 4 arenas
🏟️ View all strategies