AI Call Theft Is Now a Massive Business: How Vercel Is Stopping It
Hackers steal paid AI API calls, wrap them in OpenAI-compatible APIs, and resell them through proxies. Standard rate limits don't help.
AI-processed from Vercel Blog; edited by Hamidun News
Hackers have found a new way to monetize others' AI calls. They steal your paid requests to Claude, GPT, or Gemini, wrap them in a compatible API, and resell them through proxy networks — with zero costs for the inference itself.
Economics of Theft
A single prompt to a frontier model might cost $2, while HTTP requests on Vercel cost $2 per million. Per request, AI inference is a million times more expensive, which makes it one of the most profitable things to steal. The attacker pays nothing for inference and resells the tokens at a 10–20% discount off the original price, so nearly all of the revenue is profit.
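The arithmetic behind that ratio can be checked in a few lines. This is a rough sketch that uses only the article's illustrative figures ($2 per model call, $2 per million HTTP requests, a 10–20% resale discount). The variable and function names are made up for illustration, and the attacker's proxy and account costs are ignored.

```python
# Illustrative figures from the paragraph above; not measured prices.
MODEL_CALL_COST = 2.00         # dollars per frontier-model call
HTTP_COST_PER_MILLION = 2.00   # dollars per million HTTP requests

http_cost_per_request = HTTP_COST_PER_MILLION / 1_000_000
cost_ratio = MODEL_CALL_COST / http_cost_per_request


def resale_revenue(list_price: float, discount: float) -> float:
    """Revenue per stolen call when resold at `discount` off the list price."""
    return list_price * (1 - discount)


# The attacker pays nothing for inference, so in this simplified model
# the resale revenue is also the gross margin.
low = resale_revenue(MODEL_CALL_COST, 0.20)
high = resale_revenue(MODEL_CALL_COST, 0.10)

print(f"inference costs about {cost_ratio:,.0f}x an HTTP request")
print(f"resale revenue per stolen call: ${low:.2f} to ${high:.2f}")
```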
Typical scenario: the attacker creates an OpenAI-compatible adapter that wraps your AI endpoint. Then they fan out requests through hundreds of residential proxy IPs and either release the finished SDK publicly or sell a subscription.
- One public example: Chipotlai Max, which wraps the Chipotle chatbot
- The project openly asks for help porting the adapter to Home Depot, Lowe's, and Target
- The adapter acts as a session boundary for the attacker's downstream users
Why Rate Limits Don't Save You
Defenses like rate limits and auth walls were designed for attacks with entirely different economics — when the cost of bypass exceeded the profit. Here the profit is colossal: attackers buy residential proxies by the thousands and create fake accounts as needed. Rate limits get diluted across hundreds of IPs.
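To see why a per-IP limit gets diluted, here is a toy simulation. It is a sketch under stated assumptions: the limit of 10 requests per IP per minute and the pool of 500 proxy IPs are hypothetical numbers chosen for illustration, and the limiter is a simplified counter rather than any real product's implementation.

```python
from collections import Counter

PER_IP_LIMIT = 10       # hypothetical: requests allowed per IP per minute
TOTAL_REQUESTS = 1_300  # requests the attacker sends within one minute


def allowed_requests(total: int, ip_count: int, per_ip_limit: int) -> int:
    """Count requests that pass a per-IP limit when the attacker
    spreads them round-robin across `ip_count` addresses."""
    seen = Counter()
    allowed = 0
    for i in range(total):
        ip = i % ip_count  # round-robin over the proxy pool
        if seen[ip] < per_ip_limit:
            seen[ip] += 1
            allowed += 1
    return allowed


single_ip = allowed_requests(TOTAL_REQUESTS, 1, PER_IP_LIMIT)
many_ips = allowed_requests(TOTAL_REQUESTS, 500, PER_IP_LIMIT)

print(f"one IP: {single_ip} of {TOTAL_REQUESTS} requests get through")
print(f"500 IPs: {many_ips} of {TOTAL_REQUESTS} requests get through")
```

From a single address, almost everything is rejected. Spread across the proxy pool, each address sends only two or three requests and never comes near the limit.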
Classic vulnerability: you verify the user once per session, then send all requests to AI. The attacker intercepts the session and pushes thousands of stolen calls through it. By the time the request hits your API, it has already crossed your security boundary. Verification should work on every call, not per session.
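The difference between the two patterns can be sketched as follows. Everything here is hypothetical: `Request`, `SessionGate`, `PerRequestGate`, and the `looks_automated` flag are illustrative stand-ins, not Vercel's or BotID's API. A real detector would compute its verdict from request-level signals instead of reading a boolean.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Request:
    session_id: str
    prompt: str
    looks_automated: bool  # stand-in for a real per-request bot signal


class SessionGate:
    """Vulnerable pattern: verify only the first request of a session."""

    def __init__(self, verify: Callable[[Request], bool],
                 call_model: Callable[[str], str]) -> None:
        self.verify = verify
        self.call_model = call_model
        self.verified_sessions = set()

    def handle(self, req: Request) -> str:
        if req.session_id not in self.verified_sessions:
            if not self.verify(req):
                return "blocked"
            self.verified_sessions.add(req.session_id)
        return self.call_model(req.prompt)


class PerRequestGate:
    """Safer pattern: verify every request before it reaches the model."""

    def __init__(self, verify: Callable[[Request], bool],
                 call_model: Callable[[str], str]) -> None:
        self.verify = verify
        self.call_model = call_model

    def handle(self, req: Request) -> str:
        if not self.verify(req):
            return "blocked"
        return self.call_model(req.prompt)


def run(gate_cls) -> int:
    """One legitimate request, then three automated ones on the same
    session. Returns how many paid model calls were made."""
    calls: List[str] = []

    def fake_model(prompt: str) -> str:
        calls.append(prompt)
        return "ok"

    gate = gate_cls(verify=lambda r: not r.looks_automated,
                    call_model=fake_model)
    gate.handle(Request("s1", "hello", looks_automated=False))
    for i in range(3):
        gate.handle(Request("s1", f"stolen {i}", looks_automated=True))
    return len(calls)


session_calls = run(SessionGate)
per_request_calls = run(PerRequestGate)
print(session_calls, per_request_calls)
```

In this toy run the session-level gate pays for all four calls, while the per-request gate pays only for the legitimate one.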
Real Attack on Vercel
On April 29, 2026, traffic to Vercel's AI chat documentation jumped 10x, to 1,300 requests per minute to the Claude Haiku 4.5 model. Without protection, this would have cost $10k+ per day. The company detected the mass theft through pattern monitoring and stopped the attack with deep BotID analysis at the request level.
"If you have an AI endpoint on the internet, the risk of abuse is enormous and can easily lead to bills in the tens of thousands of dollars," says Vercel.
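As a back-of-the-envelope check on the attack figures above, the reported rate can be converted to a daily volume. This sketch assumes the 1,300 requests per minute would have held for a full 24 hours, which the article does not state, and treats $10k as a lower bound, so the per-request figure is only what those two numbers imply together.

```python
REQUESTS_PER_MINUTE = 1_300    # reported attack rate
DAILY_COST_ESTIMATE = 10_000   # dollars; the article says "$10k+ per day"

# Assumption: the rate stays constant for a full day.
requests_per_day = REQUESTS_PER_MINUTE * 60 * 24
implied_cost_per_request = DAILY_COST_ESTIMATE / requests_per_day

print(f"{requests_per_day:,} requests per day")
print(f"implied cost of at least ${implied_cost_per_request:.4f} per request")
```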
What This Means
Inference theft is now a real threat to any company that exposes an AI endpoint on the internet. Rate limits and basic auth are not enough: verification has to run as deep analysis on every request, not once per session. For startups and SaaS companies, the move to per-request verification should happen now, not later.