AI Call Theft Is Now a Massive Business: How Vercel Is Stopping It

Attackers steal paid AI API calls, wrap them in OpenAI-compatible APIs, and resell them through proxies. Standard rate limits don't help. Vercel faced an attack of 1,300 requests per minute that, without protection, would have cost $10k+ per day. The company has explained how the scheme works.

Khamidun Zhemal

AI monitoring · Vercel Blog

Jun 1, 2026· 3 min

AI-processed from Vercel Blog; edited by Hamidun News

Hackers have found a new way to monetize other people's AI calls. They steal your paid requests to Claude, GPT, or Gemini, wrap them in a compatible API, and resell them through proxy networks, paying nothing for the inference itself.

Economics of Theft

A single prompt to a frontier model might cost $2, while HTTP requests on Vercel cost $2 per million. Per call, AI inference is about a million times more expensive, which makes it one of the most profitable things to steal. The attacker pays nothing for inference, then resells tokens at a 10–20% discount to the original price, so nearly every dollar of revenue is profit.
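The ratio can be checked with a few lines of arithmetic. This is only a sketch using the article's round figures; the variable names are illustrative, and real prices vary by model and prompt size.

```python
# Figures quoted in the text (round numbers, not a price list).
inference_cost_per_call = 2.00      # USD for one frontier-model prompt
http_cost_per_million = 2.00        # USD for one million HTTP requests

# Cost of a million inference calls divided by the cost of a million HTTP requests.
ratio = inference_cost_per_call * 1_000_000 / http_cost_per_million
print(f"Inference costs about {ratio:,.0f}x a plain HTTP request")

# The reseller pays nothing for inference, so the discounted price is all margin.
resale_prices = {}
for discount in (0.10, 0.20):
    resale_prices[discount] = inference_cost_per_call * (1 - discount)
    print(f"{discount:.0%} discount: resale price ${resale_prices[discount]:.2f} per call")
```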

Typical scenario: the attacker creates an OpenAI-compatible adapter that wraps your AI endpoint. Then they fan out requests through hundreds of residential proxy IPs and either release the finished SDK publicly or sell a subscription.

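Examples already exist. Chipotlai Max wraps the Chipotle chatbot, and the project openly asks for help porting the approach to Home Depot, Lowe's, and Target. In this setup the adapter acts as a session boundary for the attacker's downstream users: they talk to the adapter, and only the adapter talks to the victim's endpoint.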

Why Rate Limits Don't Save You

Defenses like rate limits and auth walls were designed for attacks with entirely different economics — when the cost of bypass exceeded the profit. Here the profit is colossal: attackers buy residential proxies by the thousands and create fake accounts as needed. Rate limits get diluted across hundreds of IPs.
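The dilution effect is easy to see in a toy model. The sketch below assumes a simple per-IP limiter with a hypothetical limit of 10 requests per minute and a hypothetical pool of 500 proxy IPs; only the 1,300 requests per minute comes from the article.

```python
from collections import Counter


def allowed_requests(source_ips, per_ip_limit):
    """Count requests a per-IP limiter lets through in one time window.

    `source_ips` holds one entry per request: the IP it came from.
    """
    seen = Counter()
    allowed = 0
    for ip in source_ips:
        if seen[ip] < per_ip_limit:
            seen[ip] += 1
            allowed += 1
    return allowed


PER_IP_LIMIT = 10        # hypothetical limit: 10 requests per minute per IP
ATTACK_VOLUME = 1_300    # requests per minute, the rate cited in the article
PROXY_POOL_SIZE = 500    # hypothetical number of residential proxy IPs

# Same volume, sent from one address versus spread round-robin over the pool.
single_ip = ["attacker"] * ATTACK_VOLUME
proxied = [f"proxy-{i % PROXY_POOL_SIZE}" for i in range(ATTACK_VOLUME)]

allowed_single = allowed_requests(single_ip, PER_IP_LIMIT)
allowed_proxied = allowed_requests(proxied, PER_IP_LIMIT)
print(f"single IP: {allowed_single} of {ATTACK_VOLUME} requests pass")
print(f"proxy pool: {allowed_proxied} of {ATTACK_VOLUME} requests pass")
```

With the traffic spread out, each proxy sends only two or three requests per minute, so no individual address ever comes near the limit.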

The classic vulnerability: you verify the user once per session, then pass every request in that session to the model. The attacker takes over the session and pushes thousands of stolen calls through it. By the time a request hits your API, it has already crossed your security boundary. Verification should run on every call, not once per session.
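The difference between the two designs can be sketched as follows. This is not Vercel's BotID or any real library's API: the `looks_automated` field is a hypothetical stand-in for whatever verdict a real per-request bot check would return, and `call_model` is a placeholder for the paid inference call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    session_id: str
    prompt: str
    # Hypothetical stand-in for the verdict of a real per-request bot check.
    looks_automated: bool


def call_model(prompt: str) -> str:
    """Placeholder for the paid inference call."""
    return f"response to: {prompt}"


class PerSessionGate:
    """Checks the first request of a session, then trusts the rest."""

    def __init__(self) -> None:
        self.verified_sessions = set()

    def handle(self, req: Request) -> Optional[str]:
        if req.session_id not in self.verified_sessions:
            if req.looks_automated:
                return None
            self.verified_sessions.add(req.session_id)
        return call_model(req.prompt)


class PerRequestGate:
    """Runs the check on every call, so a verified session grants nothing."""

    def handle(self, req: Request) -> Optional[str]:
        if req.looks_automated:
            return None
        return call_model(req.prompt)


# One legitimate-looking request opens the session; 1,000 automated calls reuse it.
traffic = [Request("s1", "hello", looks_automated=False)]
traffic += [Request("s1", f"stolen call {i}", looks_automated=True) for i in range(1_000)]

per_session, per_request = PerSessionGate(), PerRequestGate()
served_per_session = sum(per_session.handle(r) is not None for r in traffic)
served_per_request = sum(per_request.handle(r) is not None for r in traffic)
print(f"per-session gate served {served_per_session} calls")
print(f"per-request gate served {served_per_request} calls")
```

The per-session gate pays for every stolen call once the session is open, while the per-request gate serves only the one request that passed the check.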

Real Attack on Vercel

On April 29, 2026, traffic to the AI chat in Vercel's documentation jumped 10x, reaching 1,300 requests per minute to the Claude Haiku 4.5 model. Without protection, this would have cost $10k+ per day. The company detected the mass theft through pattern monitoring and stopped the attack with deep, request-level BotID analysis.

"If you have an AI endpoint on the internet, the risk of abuse is enormous and can easily lead to bills in the tens of thousands of dollars," says Vercel.
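For a sense of scale, the reported figures can be turned into a daily request count and an implied cost per request. The per-request cost and the baseline rate below are derived from the article's numbers, not values reported by Vercel, and the constant names are illustrative.

```python
REQUESTS_PER_MINUTE = 1_300   # attack rate cited in the article
DAILY_COST_USD = 10_000       # lower bound of the "$10k+ per day" estimate
SPIKE_FACTOR = 10             # traffic "jumped 10x"

requests_per_day = REQUESTS_PER_MINUTE * 60 * 24
implied_cost_per_request = DAILY_COST_USD / requests_per_day
baseline_rpm = REQUESTS_PER_MINUTE / SPIKE_FACTOR

print(f"requests per day at the attack rate: {requests_per_day:,}")
print(f"implied cost per request: ${implied_cost_per_request:.4f} or more")
print(f"implied normal traffic: about {baseline_rpm:.0f} requests per minute")
```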

What This Means

Inference theft is now a real threat to any company that exposes an AI endpoint on the internet. Rate limits and basic auth are not enough: verification has to run as a deep check on every request, not once per session. For startups and SaaS companies, that means moving to per-request verification now rather than later.
