
OpenAI: how large companies scale AI through trust, control, and quality

Source: OpenAI Blog. Collage: Hamidun News.

OpenAI released a brief guide on how large companies transition AI from pilot mode to operational infrastructure. The key takeaway: scaling depends not on access to powerful models, but on trust, governance, process redesign, and clear quality standards.

Not About Launching a Model

In their new material, OpenAI gathered interviews with executives from Philips, BBVA, Mirakl, Scout24, JetBrains, and Scania. Despite differences in industries — from healthcare and banking to e-commerce and software development — they faced nearly identical challenges. The most notable: the gap between what modern models can already do and what companies can realistically and safely deploy to production.

OpenAI calls this the capability gap: pilots don't scale, solutions layer on top of old processes, and experiments don't convert into operational impact. Hence the key shift in approach: leading companies don't view AI as just another software upgrade or a centralized, top-down rollout.

They treat it as a new operational layer: first build trust, bring in security, legal, and IT at the design stage, then expand scope. What matters isn't speed for speed's sake, but the ability to implement AI so employees understand boundaries, see daily value, and don't lose quality where errors carry high costs.

Five Conditions for Scale

OpenAI identifies five recurring patterns that helped companies move from experiments to sustained impact. This isn't a feature list or a choice of specific model, but rather a governance framework that converges across different cases. At the center: culture, process ownership, and willingness to delay launch if quality hasn't reached the threshold.

  • Culture before tools: training and safe experiments matter more than mass license purchases.
  • Governance as an accelerator: security, legal, compliance, and IT join at the start, not at the end.
  • Ownership over consumption: teams don't just use AI, they redesign their workflows around it.
  • Quality before scale: success criteria are defined upfront and tested before launch.
  • Protecting expert judgment: AI amplifies review, reasoning, and decision-making, not just throughput.

Clearly, this isn't about automation at any cost. Scout24, for example, when launching conversational real estate search, relied on its own test frameworks inspired by OpenAI Evals and delayed releases whenever the system fell short of its quality thresholds. For companies in regulated and sensitive sectors, this is almost mandatory: trust can't be "added later" after a failed launch.

"Defining what 'good' means before scaling AI is critical: it's quality that turns an experiment into something users actually trust."
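The eval-gated release process described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not Scout24's or OpenAI's actual implementation: all names, the toy classifier, and the 95% threshold are assumptions.

```python
# Hypothetical sketch of an evals-style release gate: run a fixed
# evaluation set against the system under test and block the release
# if the pass rate falls below a predefined quality threshold.

def run_eval(predict, eval_set, threshold=0.95):
    """Return (pass_rate, release_ok) for a system under test."""
    passed = sum(1 for case in eval_set
                 if predict(case["input"]) == case["expected"])
    pass_rate = passed / len(eval_set)
    return pass_rate, pass_rate >= threshold

# Toy eval set and "model" purely for illustration.
eval_set = [
    {"input": "2 rooms, balcony",    "expected": "apartment"},
    {"input": "detached, garden",    "expected": "house"},
    {"input": "studio, city center", "expected": "apartment"},
]

predict = lambda text: "house" if "detached" in text else "apartment"

rate, ok = run_eval(predict, eval_set)
print(f"pass rate: {rate:.0%}, release allowed: {ok}")
```

The essential design choice is that the threshold is defined before launch, so "good enough to ship" is a measured property rather than a judgment made under release pressure.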

Company Cases

The most instructive example of organizational approach is Philips. This wasn't about a niche team of enthusiasts: the company was trying to embed AI into the daily work of roughly 70,000 employees across healthcare and technology. Rather than positioning AI as a specialized skill, leadership began with AI literacy and user confidence, training senior leaders first.

At BBVA, a similar logic played out through governance: an internal assistant in Peru, used by over 3,000 employees, cut average request processing time from 7.5 minutes to roughly one. The approach then scaled further, and ChatGPT Enterprise is now deployed across more than 120,000 of the bank's employees worldwide. Mirakl went further still, giving teams the ability to assemble agents themselves and reshape processes.

The result: 70% faster internal technical documentation creation, 37% improvement in support efficiency while maintaining 96% customer satisfaction, and 91% faster catalog onboarding with roughly half the error rate. JetBrains, meanwhile, focuses not on code generation volume, but on hybrid scenarios where AI helps developers analyze, review, and design. And Scania embeds AI directly into operational workflows — from engineering tasks to service — instead of keeping it as a separate tool "on the side."

What This Means

For the market, this is an important signal: the era of personal productivity experiments is giving way to AI embedded within end-to-end business processes and agent scenarios under human control. Winners won't be those who bought access to a model earliest, but those who learned to design workflows, measure quality, and distribute responsibility for results.

Hamidun News
AI news without the noise. A daily editorial selection from 400+ sources. A product of Jemal Hamidun, Head of AI at Alpina Digital.