
AI Call Theft Is Now a Massive Business: How Vercel Is Stopping It

Hackers steal paid AI API calls, wrap them in OpenAI-compatible APIs, and resell them through proxies. Standard rate limits don't help.

AI-processed from Vercel Blog; edited by Hamidun News
Source: Vercel Blog. Collage: Hamidun News.

Hackers have found a new way to monetize other people's AI calls. They steal your paid requests to Claude, GPT, or Gemini, wrap them in a compatible API, and resell them through proxy networks, paying nothing for the inference itself.

Economics of Theft

A single prompt call to a frontier model might cost $2, while an HTTP request on Vercel costs $2 per million. Per request, AI inference is about a million times more expensive, which makes it one of the most profitable goods to steal. The attacker pays nothing for inference, so reselling tokens at a 10–20% discount to the original price still leaves a massive profit.
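The arithmetic behind that comparison can be checked in a few lines. This is a minimal sketch using the two prices quoted above; the 15% discount is an assumed midpoint of the 10–20% range, and the function names are illustrative only.

```python
# Prices quoted in the article; treat them as rough orders of magnitude.
PROMPT_COST_USD = 2.00              # one frontier-model prompt call
HTTP_COST_USD = 2.00 / 1_000_000    # one HTTP request at $2 per million


def cost_ratio(prompt_cost: float, http_cost: float) -> float:
    """How many times more expensive one prompt call is than one HTTP request."""
    return prompt_cost / http_cost


def resale_revenue(list_price: float, discount: float) -> float:
    """Revenue per resold call. The attacker's own inference cost is zero,
    so this is also the gross margin before proxy and account costs."""
    return list_price * (1.0 - discount)


ratio = cost_ratio(PROMPT_COST_USD, HTTP_COST_USD)
revenue = resale_revenue(PROMPT_COST_USD, discount=0.15)  # assumed midpoint
print(f"inference vs. HTTP cost ratio: {ratio:,.0f}x")
print(f"revenue per resold $2 call at 15% off: ${revenue:.2f}")
```

The sketch ignores what the attacker spends on proxies and fake accounts, which the article treats as small next to the resale revenue.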

Typical scenario: the attacker creates an OpenAI-compatible adapter that wraps your AI endpoint. Then they fan out requests through hundreds of residential proxy IPs and either release the finished SDK publicly or sell a subscription.

  • Examples already exist: Chipotlai Max wraps the Chipotle chatbot
  • The project openly asks for help porting to Home Depot, Lowe's, and Target
  • The adapter acts as a session boundary for the attacker's downstream users

Why Rate Limits Don't Save You

Defenses like rate limits and auth walls were designed for attacks with entirely different economics — when the cost of bypass exceeded the profit. Here the profit is colossal: attackers buy residential proxies by the thousands and create fake accounts as needed. Rate limits get diluted across hundreds of IPs.
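A rough way to see the dilution: a per-IP limiter only caps each address, so total throughput scales with the number of proxies. The sketch below assumes traffic is spread evenly across IPs; the limit of 10 requests per minute is a hypothetical value, not one from the article.

```python
def max_throughput(per_ip_limit: int, proxy_count: int) -> int:
    """Requests per minute that pass a per-IP limiter when load is spread evenly."""
    return per_ip_limit * proxy_count


def proxies_needed(target_rpm: int, per_ip_limit: int) -> int:
    """Smallest number of IPs that keeps every IP at or under the limit."""
    return -(-target_rpm // per_ip_limit)  # ceiling division


PER_IP_LIMIT = 10  # hypothetical: 10 requests per minute per IP

# Proxies required to sustain the 1,300 requests per minute seen in the Vercel incident.
print(proxies_needed(1_300, PER_IP_LIMIT))

# What 500 residential proxies could push through the same limiter.
print(max_throughput(PER_IP_LIMIT, 500))
```

Under these assumptions every individual IP looks well-behaved, which is why the limiter never fires.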

Classic vulnerability: you verify the user once per session, then forward every later request to the model. The attacker takes over the session and pushes thousands of stolen calls through it. By the time a request hits your API, it has already crossed your security boundary. Verification should run on every call, not once per session.
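The difference between the two models can be sketched as two gates in front of the model call. Everything here is hypothetical: `Request`, the `signals` field, and `toy_verifier` stand in for whatever bot-detection check a real deployment would use, and none of it represents Vercel's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Request:
    session_id: str
    prompt: str
    signals: dict  # hypothetical per-request client signals


Verifier = Callable[[Request], bool]


@dataclass
class PerSessionGate:
    """Verifies the first request of a session, then trusts the session."""
    verify: Verifier
    trusted: set = field(default_factory=set)

    def allow(self, req: Request) -> bool:
        if req.session_id not in self.trusted:
            if not self.verify(req):
                return False
            self.trusted.add(req.session_id)
        return True  # later requests are never re-checked


@dataclass
class PerRequestGate:
    """Runs the verifier on every request, regardless of session state."""
    verify: Verifier

    def allow(self, req: Request) -> bool:
        return self.verify(req)


def toy_verifier(req: Request) -> bool:
    # Placeholder check; a real verifier would be a bot-detection service.
    return bool(req.signals.get("human", False))


human = Request("s1", "How do I deploy?", {"human": True})
replayed = Request("s1", "resold prompt", {"human": False})  # same session, automated

session_gate = PerSessionGate(toy_verifier)
request_gate = PerRequestGate(toy_verifier)

print(session_gate.allow(human), session_gate.allow(replayed))
print(request_gate.allow(human), request_gate.allow(replayed))
```

The per-session gate admits the replayed call because the session was already trusted; the per-request gate rejects it.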

Real Attack on Vercel

On April 29, 2026, traffic to the AI chat in Vercel's documentation jumped 10x, reaching 1,300 requests per minute against the Claude Haiku 4.5 model. Without protection, this would have cost $10k+ per day. The company detected the mass theft through pattern monitoring and stopped the attack with BotID deep analysis applied at the request level.
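The article does not describe how Vercel's pattern monitoring works. As an illustration only, a spike like this one could be flagged by comparing each minute's request count against a rolling median. The window size, the 10x factor, and the baseline of 130 requests per minute (one tenth of the reported figure) are assumptions.

```python
from collections import deque
from statistics import median


class SpikeDetector:
    """Flags a minute whose request count is >= factor x the rolling median."""

    def __init__(self, window: int = 60, factor: float = 10.0):
        self.history = deque(maxlen=window)  # recent per-minute counts
        self.factor = factor

    def observe(self, rpm: int) -> bool:
        if self.history:
            baseline = max(median(self.history), 1)  # avoid a zero baseline
            is_spike = rpm >= self.factor * baseline
        else:
            is_spike = False  # nothing to compare against yet
        if not is_spike:
            self.history.append(rpm)  # keep attack traffic out of the baseline
        return is_spike


def requests_per_day(rpm: int) -> int:
    """Daily volume if a per-minute rate were sustained for 24 hours."""
    return rpm * 60 * 24


detector = SpikeDetector()
for _ in range(30):
    detector.observe(130)           # assumed normal traffic
print(detector.observe(1_300))      # the reported attack rate
print(requests_per_day(1_300))      # sustained daily volume at that rate
```

Sustained for a full day, 1,300 requests per minute is close to 1.9 million model calls, which is how a small per-call price turns into a five-figure daily bill.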

"If you have an AI endpoint on the internet, the risk of abuse is enormous and can easily lead to bills in the tens of thousands of dollars," says Vercel.

What This Means

Inference theft is now a real threat to any company that exposes an AI endpoint on the internet. Rate limits and basic auth are not enough: verification backed by deep analysis has to run on every request, not once per session. For startups and SaaS companies, the move to per-request verification should happen now, not later.

