DiffQuant optimizes the Sharpe ratio directly through a differentiable trading simulator

Q: What is the source?

Originally published on Habr AI. Hamidun News processes and adapts the material with AI.

Q: When was it published?

May 2, 2026. Reading time: 3 min.

Hamidun News Editorial

DiffQuant is an open-source prototype that addresses a fundamental contradiction in ML trading: models learn to minimize mean squared error but are evaluated by the Sharpe ratio. The authors closed this gap by making the entire trading pipeline, from features to PnL and commissions, a single differentiable computational graph.

The Surrogate Objective Problem

In most ML systems for quantitative trading, the scheme looks like this: train a neural network to predict returns or price direction by minimizing MSE or BCE (binary cross-entropy), then build a trading strategy on top of these predictions and evaluate it by the Sharpe ratio, the ratio of average return to its volatility. The problem is that the two objectives are mathematically different: nothing in the prediction loss ties it to the Sharpe ratio.

Better MSE does not guarantee better Sharpe. In practice, the neural network spends capacity on reducing prediction error in market regimes where it has no impact on the final trading result. A 15% improvement in forecast accuracy may provide no Sharpe gain. The article describes this as a documented problem, both in academic papers and among practitioners in the quant industry.
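A toy illustration of this mismatch, using only the Python standard library. The return series, both forecasts, and the sign-following strategy are made up for this example and do not come from DiffQuant. Forecast A has the lower MSE but always gets the direction wrong. Forecast B overshoots badly but always gets the direction right.

```python
from statistics import mean, pstdev


def mse(pred, actual):
    """Mean squared error between forecasts and realized returns."""
    return mean([(p - a) ** 2 for p, a in zip(pred, actual)])


def sharpe(pnl):
    """Per-period Sharpe ratio: mean PnL over its standard deviation."""
    return mean(pnl) / pstdev(pnl)


def sign(x):
    """Return +1, 0 or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)


# Made-up per-period returns, for illustration only.
returns = [0.01, -0.01, 0.02, -0.02, 0.01, -0.03]

# Forecast A: small errors, but the sign is always wrong.
forecast_a = [-0.1 * r for r in returns]
# Forecast B: large errors (3x overshoot), but the sign is always right.
forecast_b = [3.0 * r for r in returns]

# Toy strategy: hold +1 or -1 unit according to the forecast sign.
pnl_a = [sign(f) * r for f, r in zip(forecast_a, returns)]
pnl_b = [sign(f) * r for f, r in zip(forecast_b, returns)]

mse_a, mse_b = mse(forecast_a, returns), mse(forecast_b, returns)
sharpe_a, sharpe_b = sharpe(pnl_a), sharpe(pnl_b)

print(f"A: MSE={mse_a:.6f}, Sharpe={sharpe_a:.2f}")
print(f"B: MSE={mse_b:.6f}, Sharpe={sharpe_b:.2f}")
```

The example is deliberately extreme. It only shows that ranking models by MSE and ranking them by Sharpe can disagree.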

Partial solutions such as ranking loss functions, custom proxy metrics, and post-hoc weighting do not address the core issue: the gradient during training does not see the actual trading mechanics.

How the Differentiable Simulator Works

DiffQuant solves the problem directly. The entire trading pipeline is implemented as a single computational graph built from continuous operations:

Market features → neural network signal prediction block
Signal → target position accounting for size and direction constraints
Position → step-by-step PnL with explicit modeling of slippage and commissions
Accumulated PnL → Sharpe ratio as a differentiable scalar loss function
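The four stages above can be sketched as a single forward pass. This is a minimal illustration under stated assumptions, not DiffQuant's actual code: a linear model stands in for the neural network, and the function name `simulate_sharpe`, the `tanh` position rule, the proportional `fee`, and the smoothed turnover are all hypothetical choices. It also assumes `returns[t]` is the return realized after `features[t]` is observed.

```python
import math
from statistics import mean, pstdev


def simulate_sharpe(weights, features, returns, max_pos=1.0, fee=0.0005, eps=1e-8):
    """Forward pass: features -> signal -> position -> net PnL -> Sharpe."""
    prev_pos = 0.0
    pnl = []
    for x, r in zip(features, returns):
        # Stage 1: signal (a linear model stands in for the network here).
        signal = sum(w * xi for w, xi in zip(weights, x))
        # Stage 2: target position, bounded to [-max_pos, +max_pos].
        pos = max_pos * math.tanh(signal)
        # Stage 3: step PnL minus a cost proportional to a smoothed |position change|.
        turnover = math.sqrt((pos - prev_pos) ** 2 + eps)
        pnl.append(pos * r - fee * turnover)
        prev_pos = pos
    # Stage 4: per-period Sharpe ratio as a single scalar.
    return mean(pnl) / (pstdev(pnl) + eps)


# Made-up data: one feature whose sign happens to match the next return.
features = [[1.0], [-1.0], [0.5], [-0.5], [1.0], [-2.0]]
returns = [0.01, -0.01, 0.02, -0.02, 0.01, -0.03]

print(simulate_sharpe([1.0], features, returns))   # weight follows the feature
print(simulate_sharpe([-1.0], features, returns))  # weight fades the feature
```

Every operation in the loop is continuous in `weights`, which is what would let an autodiff framework backpropagate from the final scalar to the model parameters.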

The key technical challenge is making position sizing and costs differentiable, since real trading operations are discrete. The authors use soft approximations: instead of sharp transitions between positions, they use continuous functions that are smooth enough for gradients to flow through.

"This is not a ready-made trading system; this is a different problem formulation," the authors emphasize.
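To make the idea of soft approximations concrete, here is a small sketch of two common smoothing tricks. The article does not say which functions DiffQuant uses, so `tanh` as a soft long/short switch, `sqrt(x^2 + eps)` as a smooth absolute value, and the `sharpness` and `eps` parameters are assumptions chosen for illustration.

```python
import math


def hard_position(signal):
    """Discrete long/flat/short decision: piecewise constant, zero gradient almost everywhere."""
    return float((signal > 0) - (signal < 0))


def soft_position(signal, sharpness=5.0):
    """Smooth stand-in for the hard decision; approaches +/-1 as |signal| grows."""
    return math.tanh(sharpness * signal)


def soft_position_grad(signal, sharpness=5.0):
    """Derivative of soft_position with respect to signal; nonzero for every finite signal."""
    return sharpness * (1.0 - math.tanh(sharpness * signal) ** 2)


def smooth_abs(x, eps=1e-6):
    """Smooth stand-in for |x|, e.g. for turnover-based costs; no kink at zero."""
    return math.sqrt(x * x + eps)


def smooth_abs_grad(x, eps=1e-6):
    """Derivative of smooth_abs; defined everywhere, including x = 0."""
    return x / math.sqrt(x * x + eps)


for s in (-1.0, -0.1, 0.0, 0.1, 1.0):
    print(s, hard_position(s), round(soft_position(s), 4), round(soft_position_grad(s), 4))
```

A larger `sharpness` tracks the discrete decision more closely but shrinks the gradient away from zero, so the choice is a trade-off between fidelity and trainability.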

As a result, the gradient of the Sharpe ratio propagates backward through the entire pipeline to the neural network weights. The model trains directly on the criterion by which it will be evaluated in production.
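A minimal sketch of what training on the Sharpe ratio itself could look like. A real implementation would rely on an autodiff framework for backpropagation. Here central finite differences stand in for it, so the example runs on the standard library alone. The data, the two-weight linear model, the fee, and the rule of keeping a step only when it improves the objective are all illustrative assumptions, not details from DiffQuant.

```python
import math
from statistics import mean, pstdev


def sharpe_of(weights, features, returns, fee=0.0005, eps=1e-8):
    """Smooth simulator: linear signal -> tanh position -> net PnL -> Sharpe."""
    prev_pos, pnl = 0.0, []
    for x, r in zip(features, returns):
        signal = sum(w * xi for w, xi in zip(weights, x))
        pos = math.tanh(signal)
        cost = fee * math.sqrt((pos - prev_pos) ** 2 + eps)
        pnl.append(pos * r - cost)
        prev_pos = pos
    return mean(pnl) / (pstdev(pnl) + eps)


def numeric_grad(f, w, h=1e-5):
    """Central finite-difference gradient (a stand-in for backpropagation)."""
    grad = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + h] + w[i + 1:]
        down = w[:i] + [w[i] - h] + w[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


# Made-up data: the first feature matches the sign of the return, the second is noise.
features = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0],
            [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
returns = [0.010, -0.012, 0.008, -0.011, 0.009, -0.010, 0.012, -0.009]


def objective(w):
    return sharpe_of(w, features, returns)


weights = [0.2, 0.8]  # start with most of the weight on the noise feature
initial_sharpe = objective(weights)
step = 0.05
for _ in range(100):
    grad = numeric_grad(objective, weights)
    candidate = [w + step * g for w, g in zip(weights, grad)]
    if objective(candidate) > objective(weights):
        weights = candidate   # gradient ascent step accepted
    else:
        step *= 0.5           # overshoot: retry with a smaller step
final_sharpe = objective(weights)

print(initial_sharpe, final_sharpe, weights)
```

This optimizes in-sample on eight points, so it says nothing about out-of-sample performance. It only shows the weights being updated by the gradient of the evaluation metric rather than a prediction loss.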

Sharpe +1.73 and +1.15 After Commissions

The prototype was tested on two consecutive held-out quarters, periods the model had not seen during training or hyperparameter tuning. After accounting for real commissions, it reached a Sharpe ratio of +1.73 in the first quarter and +1.15 in the second. Both values exceed one, a commonly used baseline benchmark for algorithmic strategies. The code, data, and full experiment protocol have been published as open source, so anyone with access to similar market data can reproduce the results. The authors deliberately avoided complexity: there are no exotic architectures or non-standard features, just a change of loss function.

What This Means

DiffQuant makes the case that correct problem formulation matters more than architecture choice. If a strategy in production is evaluated by Sharpe, optimization during training should target precisely that, not surrogates. For quant funds and independent researchers, this is a practical signal: the gap between the training objective and the real metric can be closed technically, and doing so changes not only the result but also what the model actually learns.
