Wildberries: how ML demand forecasting saved 12 billion rubles a year on returns and dead stock
Context
Wildberries is Russia's largest e-commerce platform: 280M SKUs in the catalog, more than 30,000 pickup points, and 500,000 sellers. The logistics network includes 35 fulfillment centers totaling 4.7M square meters. Peak-day turnover (Black Friday, 11/11) reaches up to 22 billion rubles in 24 hours. Each return costs the platform 187 rubles on average, covering picking, return transport, inspection, and restocking.
Problem
The legacy forecasting system used simple exponential regression on historical sales. It worked in 2020, when growth was linear. By 2024, explosive growth in the electronics and home categories could make the model stale within a week: an SKU that had been niche three weeks earlier could become a top seller by Monday because of a viral Telegram video. Stockouts on popular items meant 8% of sessions ended with an empty cart, while overstocking on slow movers had frozen 23 billion rubles of operating capital as dead stock by the end of 2023.
Returns were the other problem: 31% of clothing orders came back (no fitting room, wrong size). Each return crossed the logistics network twice, from the fulfillment center to the pickup point (PUDO) and back. If a model could forecast return probability, inventory could be pre-positioned for the return journey.
Solution
The Wildberries ML team built a three-model ensemble. The first model is gradient boosting (CatBoost) on 217 features: sales history, seasonality, weather in the delivery region, Telegram channel trends (a crawler monitors 1,200 channels), exchange rates, competitor prices, promotions, and reviews. The second is a graph neural network on the product graph: "what's bought together" relationships let it forecast items with no sales history, which addresses cold start. The third is a temporal fusion transformer for long horizons (90 days ahead).
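The text does not say how the three models are combined. As a minimal sketch, assuming the simplest possible scheme (routing each request to one model by sales history and horizon), the split of responsibilities could look like this. The thresholds, the `ForecastRequest` type, and the model names are hypothetical and are not taken from the Wildberries system.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ForecastRequest:
    sku_id: str
    horizon_days: int
    days_of_sales_history: int


# Hypothetical thresholds, chosen only for illustration.
COLD_START_MAX_HISTORY_DAYS = 14
LONG_HORIZON_MIN_DAYS = 30


def pick_model(req: ForecastRequest) -> str:
    """Route a request to one of the three model families named in the text."""
    if req.days_of_sales_history < COLD_START_MAX_HISTORY_DAYS:
        return "graph"      # product-graph model handles SKUs without history
    if req.horizon_days >= LONG_HORIZON_MIN_DAYS:
        return "tft"        # temporal fusion transformer for long horizons
    return "boosting"       # CatBoost-style model for the short-horizon bulk


def forecast(req: ForecastRequest,
             models: Dict[str, Callable[[ForecastRequest], float]]) -> float:
    """Call whichever model pick_model selects; models are plain callables here."""
    return models[pick_model(req)](req)
```

A real ensemble might instead blend the three outputs with learned weights; routing is used here only because it makes each model's role explicit.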
Key insight: a single "virality" feature, mentions in Telegram, VK, and TikTok over the last 72 hours, added 7pp of accuracy in the home and electronics categories. The system runs on an in-house Kubernetes cluster with 4,800 vCPUs; inference covers 12M SKUs × 35 centers (about 420M SKU-center pairs) every 4 hours.
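A 72-hour virality feature of the kind described above can be sketched as a windowed count over timestamped mentions. This is an illustration under assumptions: the `(sku_id, timestamp)` input shape and the function name are hypothetical, and the real pipeline presumably also handles matching mentions to SKUs, which is omitted here.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

WINDOW = timedelta(hours=72)  # window length taken from the text


def virality_72h(mentions: Iterable[Tuple[str, datetime]],
                 sku_id: str,
                 as_of: datetime) -> int:
    """Count mentions of sku_id in the 72 hours up to and including as_of.

    Mentions after as_of are ignored so the feature cannot leak future data
    into a forecast made at as_of.
    """
    start = as_of - WINDOW
    return sum(1 for sku, ts in mentions
               if sku == sku_id and start < ts <= as_of)


now = datetime(2024, 3, 4, 12, 0)
sample_mentions = [
    ("sku-1", now - timedelta(hours=1)),    # inside the window
    ("sku-1", now - timedelta(hours=71)),   # inside the window
    ("sku-1", now - timedelta(hours=73)),   # too old
    ("sku-1", now + timedelta(hours=1)),    # in the future, ignored
    ("sku-2", now - timedelta(hours=2)),    # different SKU
]
count = virality_72h(sample_mentions, "sku-1", now)  # 2
```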
The second layer is an allocation model: after each forecast, it decides where to physically store inventory. The logic: if demand is forecast in Krasnodar, ship stock there; if return probability is 40% or higher, hold a buffer closer to the center. It uses an ABM-Q2 algorithm with a constraint solver over 38,000 warehouse-to-pickup-point edges.
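The rule of thumb in the allocation logic can be written down as a toy function. This is one possible reading of that rule and not the production allocator, which the text describes as a constraint solver: the 40% threshold comes from the text, while the 20% buffer share, the `Placement` type, and the `place` function are hypothetical.

```python
from dataclasses import dataclass

RETURN_BUFFER_THRESHOLD = 0.40  # threshold stated in the text
BUFFER_SHARE_PERCENT = 20       # hypothetical: share of units held back


@dataclass
class Placement:
    ship_to_region: int     # units sent toward the region with forecast demand
    hold_near_center: int   # units kept as a buffer closer to the center


def place(forecast_units: int, return_probability: float) -> Placement:
    """Split forecast demand between the demand region and a central buffer."""
    if forecast_units < 0:
        raise ValueError("forecast_units must be non-negative")
    if not 0.0 <= return_probability <= 1.0:
        raise ValueError("return_probability must be in [0, 1]")
    if return_probability >= RETURN_BUFFER_THRESHOLD:
        buffer_units = forecast_units * BUFFER_SHARE_PERCENT // 100
    else:
        buffer_units = 0
    return Placement(forecast_units - buffer_units, buffer_units)
```

A per-SKU rule like this ignores warehouse capacity and transport cost, which is why a solver over the full warehouse-to-pickup-point graph is needed in practice.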
Results
7-day forecast accuracy rose from 68% to 91% (accuracy derived from MAPE, presumably as 100% minus MAPE). For electronics it rose from 54% to 89%. Returns dropped 23%: customers get what they ordered because the item is in stock, and delivery takes 1.4 days instead of 3.7. Warehouse turnover improved 35%, and dead-stock capital fell from 23B to 14B rubles.
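MAPE is an error measure, so an "accuracy" figure like the ones above is usually obtained as 100% minus MAPE. The text does not confirm that Wildberries computes it this way; the sketch below assumes that convention, and the sample numbers are made up for illustration.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Periods with zero actual sales are skipped, since the percentage error
    is undefined there (a common but not universal convention).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no periods with non-zero actual sales")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def accuracy_from_mape(actual: Sequence[float],
                       forecast: Sequence[float]) -> float:
    """Accuracy under the assumed convention: 100% minus MAPE."""
    return 100.0 - mape(actual, forecast)


# Made-up daily unit sales for one SKU and the matching forecasts.
actual_units = [100, 200, 50]
forecast_units = [90, 220, 50]
error_pct = mape(actual_units, forecast_units)  # about 6.67
```

Under this convention, moving from 68% to 91% accuracy means the average percentage error fell from 32% to 9%.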
Direct savings: 12 billion rubles annually, of which 7.3B comes from reduced return logistics and 4.7B from freed capital. A side effect: GMV grew 4.1% from availability alone, since conversion is higher when items are in stock. The key takeaway for the WB team: feature engineering matters more than architecture. The same CatBoost with the right features beat a custom transformer by 12pp of accuracy.
Lessons learned
- Feature engineering > architecture. One right feature (72-hour virality) did more than switching from boosting to a transformer.
- Cold start is solved by the graph, not by classification. New SKUs look like old ones in the "bought together" network.
- Return prediction is a separate model, not a byproduct of the demand forecast. Returns have different triggers than demand.
- Continuous training every 4 hours beats a weekly recompute with a bigger model.
- Allocation is a constraint solver, not ML. ML gives the forecast; OR-Tools gives the plan.
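The cold-start lesson above can be illustrated without a neural network. The sketch below is a deliberately simple stand-in for the graph model: it estimates demand for a new SKU as the edge-weighted average of demand for its "bought together" neighbors. The function name, the edge weights, and the SKU names are hypothetical, and a real GNN learns far richer representations than this average.

```python
from typing import Dict


def cold_start_forecast(neighbor_weights: Dict[str, float],
                        known_demand: Dict[str, float]) -> float:
    """Edge-weighted average of neighbors' demand in the co-purchase graph.

    neighbor_weights maps neighbor SKU -> co-purchase strength (hypothetical
    units). Neighbors without a demand estimate of their own are skipped.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for sku, weight in neighbor_weights.items():
        if weight > 0 and sku in known_demand:
            weighted_sum += weight * known_demand[sku]
            total_weight += weight
    if total_weight == 0:
        raise ValueError("no neighbors with known demand")
    return weighted_sum / total_weight


# A new phone case linked to two established cases and one other new SKU.
neighbors = {"case-a": 3.0, "case-b": 1.0, "new-x": 5.0}
demand = {"case-a": 40.0, "case-b": 80.0}  # units per week, made up
estimate = cold_start_forecast(neighbors, demand)  # (3*40 + 1*80) / 4 = 50.0
```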