DiffQuant optimizes the Sharpe ratio directly through a differentiable trading simulator
AI-processed from Habr AI; edited by Hamidun News
DiffQuant is an open-source prototype that targets a fundamental mismatch in ML trading: models are trained to minimize mean squared error but are evaluated by the Sharpe ratio. The authors close this gap by making the entire trading pipeline, from features to PnL and commissions, a single differentiable computational graph.
The Surrogate Objective Problem
In most ML systems for quantitative trading, the workflow looks like this: train a neural network to predict returns or price direction by minimizing MSE or binary cross-entropy (BCE). Then build a trading strategy on top of these predictions and evaluate it by the Sharpe ratio, the ratio of average return to the volatility of returns. The problem: the two objectives are not equivalent, and improving one does not necessarily improve the other.
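For reference, the per-period Sharpe ratio described above can be written as below. The symbols r_t (strategy return in period t) and T (number of periods) are notation introduced here; the textbook definition uses returns in excess of a risk-free rate and is often annualized, neither of which the article specifies.

```latex
\[
\widehat{S} = \frac{\bar{r}}{\hat{\sigma}_r},
\qquad
\bar{r} = \frac{1}{T}\sum_{t=1}^{T} r_t,
\qquad
\hat{\sigma}_r = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(r_t - \bar{r}\right)^2}
\]
```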
Better MSE does not guarantee better Sharpe. In practice, the neural network spends capacity on reducing prediction error in market regimes where this has no impact on the final trading result. A 15% improvement in forecast accuracy may yield no Sharpe gain at all. This is a documented problem, both in academic papers and among practitioners in the quant industry.
Partial solutions (ranking loss functions, custom proxy metrics, post-hoc weighting) do not address the core issue: the gradient during training does not see the actual trading mechanics.
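A toy example makes the mismatch concrete. The numbers below are invented for illustration and do not come from DiffQuant; the strategy (go long or short by the sign of the forecast) is also an assumption. Forecast A has the lower MSE but gets every direction wrong, while forecast B overshoots badly yet trades profitably.

```python
from statistics import mean, pstdev

# Toy data, invented for illustration only.
returns = [0.01, -0.02, 0.015, -0.005, 0.02, -0.01]
forecast_a = [-0.001, 0.001, -0.001, 0.001, -0.001, 0.001]  # tiny errors, wrong sign
forecast_b = [0.05, -0.05, 0.05, -0.05, 0.05, -0.05]        # right sign, overshoots


def mse(forecast, actual):
    """Mean squared error between forecast and realized returns."""
    return mean((f - a) ** 2 for f, a in zip(forecast, actual))


def sign_strategy_pnl(forecast, actual):
    """Hold +1 when the forecast is positive, -1 otherwise."""
    return [(1.0 if f > 0 else -1.0) * a for f, a in zip(forecast, actual)]


def sharpe(pnl):
    """Per-period Sharpe ratio: mean PnL divided by its standard deviation."""
    return mean(pnl) / pstdev(pnl)


mse_a, mse_b = mse(forecast_a, returns), mse(forecast_b, returns)
sharpe_a = sharpe(sign_strategy_pnl(forecast_a, returns))
sharpe_b = sharpe(sign_strategy_pnl(forecast_b, returns))

print(f"A: MSE={mse_a:.6f}  Sharpe={sharpe_a:+.2f}")
print(f"B: MSE={mse_b:.6f}  Sharpe={sharpe_b:+.2f}")
```

A model selected by MSE would pick forecast A here, even though only forecast B makes money.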
How the Differentiable Simulator Works
DiffQuant solves the problem directly: the entire trading pipeline is implemented as a single computational graph with continuous operations:
- Market features → neural network signal prediction block
- Signal → target position accounting for size and direction constraints
- Position → step-by-step PnL with explicit modeling of slippage and commissions
- Accumulated PnL → Sharpe ratio as a differentiable scalar loss function
The key technical challenge is making position sizing and costs differentiable, since real trading operations are discrete. The authors use soft approximations: sharp transitions between positions are replaced with smooth functions that stay close to the discrete behavior while still letting gradients flow.
"This is not a ready-made trading system — this is a different problem formulation," the authors emphasize.
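The pipeline and its soft approximations can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the use of `tanh` for position sizing, the `sqrt(x*x + eps)` smoothing of turnover, the `fee` and `sharpness` values, the sample data, and all function names are assumptions made here.

```python
import math
from statistics import mean, pstdev


def soft_position(signal, sharpness=5.0):
    """Smooth stand-in for sign(signal): tanh keeps the position in (-1, 1)."""
    return math.tanh(sharpness * signal)


def smooth_abs(x, eps=1e-8):
    """Smooth stand-in for |x| that stays differentiable at x = 0."""
    return math.sqrt(x * x + eps)


def simulate(signals, returns, fee=0.001):
    """Step-by-step PnL: the position held from the previous step earns the
    current return, and a proportional fee is charged on the position change."""
    pnl, prev_pos = [], 0.0
    for signal, ret in zip(signals, returns):
        pos = soft_position(signal)
        cost = fee * smooth_abs(pos - prev_pos)
        pnl.append(prev_pos * ret - cost)
        prev_pos = pos
    return pnl


def sharpe(pnl, eps=1e-12):
    """Per-period Sharpe ratio; eps guards against division by zero."""
    return mean(pnl) / (pstdev(pnl) + eps)


# Hypothetical signals and returns, for illustration only.
signals = [0.5, 0.4, -0.3, -0.6, 0.2]
returns = [0.0, 0.01, 0.02, -0.015, -0.01]

pnl = simulate(signals, returns)
print("per-step PnL:", [round(p, 5) for p in pnl])
print("Sharpe:", sharpe(pnl))
```

Every operation from signal to Sharpe is a smooth function, which is what allows an autodiff framework to push gradients through the whole simulation.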
As a result, the gradient of the Sharpe ratio propagates backward through the entire pipeline to the neural network weights. The model trains directly on the criterion by which it will be evaluated in production.
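To show what "the gradient of Sharpe with respect to the weights" means in practice, here is a self-contained sketch that differentiates a one-parameter simulator exactly. It uses forward-mode automatic differentiation with dual numbers as a dependency-free stand-in for the backpropagation an autodiff framework would perform; the single weight `w`, the `tanh` and `sqrt` smoothings, the fee, and the sample data are all assumptions of this sketch, not details from DiffQuant.

```python
import math


class Dual:
    """A value together with its derivative with respect to one parameter."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    (self.dot * o.val - self.val * o.dot) / (o.val * o.val))


def d_tanh(x):
    t = math.tanh(x.val)
    return Dual(t, (1.0 - t * t) * x.dot)


def d_sqrt(x):
    s = math.sqrt(x.val)
    return Dual(s, x.dot / (2.0 * s))


def sharpe_of_weight(w, features, returns, fee=0.001, sharpness=5.0):
    """Feature -> signal -> soft position -> PnL net of fees -> Sharpe."""
    w = Dual.lift(w)
    pnl, prev_pos = [], Dual(0.0)
    for x, r in zip(features, returns):
        pos = d_tanh(w * x * sharpness)            # soft position in (-1, 1)
        change = pos - prev_pos
        turnover = d_sqrt(change * change + 1e-8)  # smooth |change|
        pnl.append(prev_pos * r - turnover * fee)
        prev_pos = pos
    n = len(pnl)
    mu = sum(pnl, Dual(0.0)) / n
    var = sum(((p - mu) * (p - mu) for p in pnl), Dual(0.0)) / n
    return mu / d_sqrt(var + 1e-12)


# Hypothetical features and returns, for illustration only.
features = [0.2, -0.1, 0.3, -0.4, 0.1, 0.25]
returns = [0.0, 0.012, -0.007, 0.015, -0.02, 0.004]

# Seeding dot=1.0 makes out.dot equal to dSharpe/dw at w = 0.5.
out = sharpe_of_weight(Dual(0.5, 1.0), features, returns)
print("Sharpe:", out.val)
print("dSharpe/dw:", out.dot)
```

A gradient-ascent step would then be `w += learning_rate * out.dot`. With many weights, reverse-mode differentiation (backpropagation) computes the same quantities far more efficiently, which is why a framework would be used in practice.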
Sharpe +1.73 and +1.15 After Commissions
The prototype was tested on two consecutive held-out quarters, periods the model had not seen during training or hyperparameter tuning. After accounting for real commissions, it achieved a Sharpe ratio of +1.73 in the first quarter and +1.15 in the second. Both values exceed 1, a commonly used rule-of-thumb threshold for algorithmic strategies.
The code, data, and full experiment protocol have been published as open source, so anyone with access to similar market data can reproduce the results. The authors deliberately kept the setup simple: no exotic architectures or non-standard features, just a change of loss function.
What This Means
DiffQuant suggests that getting the problem formulation right can matter more than the choice of architecture. If a strategy is evaluated by Sharpe in production, training should optimize precisely that, not a surrogate. For quant funds and independent researchers, this is a practical signal: the gap between the training objective and the real metric can be closed technically, and doing so changes not only the result but also what the model actually learns.