Parloa launched voice AI agents to support large companies using OpenAI models

AI-processed from OpenAI Blog; edited by Hamidun News
Source: OpenAI Blog. Collage: Hamidun News.

Parloa, a Berlin-based developer of customer service platforms, shared how it uses OpenAI models to launch voice AI agents at large enterprises. Its AMP platform does more than answer calls: it helps teams design, test, and deploy systems that must work reliably in real time.

How AMP Works

Parloa's story began with a practical challenge. One of the company's co-founders, Stefan Ostwald, spent a day at an insurance call center and saw how employees repeatedly handled identical requests: password resets, policy questions, routine account changes. At first, the company built rule-based voice bots, but with the advent of ChatGPT and new OpenAI models, it shifted to building an AI Agent Management Platform, or AMP. The focus is no longer on rigid predefined scenarios but on a platform where companies can build, test, and deploy LLM-based voice services.

The key idea of AMP is that it is not only for developers. Business teams or subject matter experts define the agent's role, instructions, constraints, and connected tools in plain language, without intent trees and without manually describing each step. The configured agent can then be run through a simulation: one model plays the customer, another plays the agent. Teams see how the agent responds, whether it calls APIs correctly, and whether it stays within the bounds of the scenario. They can quickly adjust the configuration before any real calls take place.
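The simulation loop described above can be sketched in a few lines. This is a minimal illustration, not Parloa's implementation: the names `scripted_customer`, `stub_agent`, and `simulate`, as well as the `TOOL:` prefix used to mark a tool call, are hypothetical, and both "models" are replaced by simple stand-in functions so the example runs without any API access.

```python
from typing import Callable, List, Tuple

Transcript = List[Tuple[str, str]]        # (role, utterance) pairs
Speaker = Callable[[Transcript], str]     # sees the transcript, returns next utterance


def scripted_customer(transcript: Transcript) -> str:
    """Stand-in for the customer-playing model: walks through a fixed script."""
    script = ["I forgot my password.", "My account ID is 12345.", "Thanks, bye."]
    turns_taken = sum(1 for role, _ in transcript if role == "customer")
    # An empty string signals that the simulated customer has hung up.
    return script[turns_taken] if turns_taken < len(script) else ""


def stub_agent(transcript: Transcript) -> str:
    """Stand-in for the configured agent: replies based on simple keywords."""
    last = transcript[-1][1].lower()
    if "password" in last:
        return "I can help with that. What is your account ID?"
    if "account id" in last:
        # Hypothetical convention: a tool call is marked with a TOOL: prefix.
        return "TOOL:reset_password Done, a reset link is on its way."
    return "Goodbye!"


def simulate(customer: Speaker, agent: Speaker, max_turns: int = 10) -> Transcript:
    """Alternate customer and agent turns until the customer stops talking."""
    transcript: Transcript = []
    for _ in range(max_turns):
        utterance = customer(transcript)
        if not utterance:
            break
        transcript.append(("customer", utterance))
        transcript.append(("agent", agent(transcript)))
    return transcript


transcript = simulate(scripted_customer, stub_agent)
for role, text in transcript:
    print(f"{role}: {text}")
```

In a real setup, each stand-in would be replaced by a call to a language model, and the resulting transcript would be handed to automated checks rather than printed.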

Betting on Evaluation

Parloa bets heavily on an evaluation-first approach. For enterprise customers, polished demos aren't enough: they need predictability in production, because switching to a new model always involves costs and risks. So the company doesn't take abstract benchmarks at face value. Instead, it builds its own test sets that mirror real customer support scenarios. These measure how well the model follows instructions, how reliably it calls tools, how long responses take, and how the system handles edge cases.

"Models only matter when they work in production," is how Parloa explains its approach to real-time voice systems.

If a model shows good results on paper, that's not enough. Only configurations that consistently pass simulations and automated checks are sent to production. The platform combines LLM-as-a-judge with deterministic rules: some evaluations check response quality and instruction adherence, while others ensure that critical steps happen in the right order. This approach is already delivering business results: in one deployment, a global travel company reduced the number of escalations to live operators by 80%.
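The combination of a deterministic rule with an LLM-as-a-judge score can be illustrated roughly as follows. This is a sketch under stated assumptions, not Parloa's code: the tool names, the `run` dictionary layout, the `0.8` threshold, and `stub_judge` (a keyword check standing in for a real model call) are all hypothetical.

```python
from typing import Callable, Dict, List


def steps_in_order(tool_calls: List[str], required: List[str]) -> bool:
    """Deterministic rule: `required` must appear in `tool_calls` in this order.

    Other calls may occur in between; only the relative order is checked.
    """
    remaining = iter(tool_calls)
    # `step in remaining` advances the iterator, so each match must come
    # after the previous one.
    return all(step in remaining for step in required)


def stub_judge(reply: str) -> float:
    """Placeholder for an LLM-as-a-judge call; returns a quality score in [0, 1]."""
    return 1.0 if "refund" in reply.lower() else 0.0


def passes_gate(run: Dict, required: List[str],
                judge: Callable[[str], float], min_score: float = 0.8) -> bool:
    """A run passes only if the ordering rule holds and the judge score is high enough."""
    return steps_in_order(run["tool_calls"], required) and judge(run["reply"]) >= min_score


required = ["verify_identity", "issue_refund"]
good_run = {"tool_calls": ["verify_identity", "lookup_booking", "issue_refund"],
            "reply": "Your refund has been issued."}
bad_run = {"tool_calls": ["issue_refund", "verify_identity"],
           "reply": "Your refund has been issued."}

print(passes_gate(good_run, required, stub_judge))  # True
print(passes_gate(bad_run, required, stub_judge))   # False: refund before identity check
```

The second run reads well and would satisfy a quality judge, which is the reason for keeping a separate deterministic check on the order of critical steps.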

Voice Without Pauses

For Parloa, the voice interface is a distinct engineering challenge. Unlike text chat, every second is directly felt by the user. The entire pipeline must work with minimal latency: the system first recognizes speech, then the model generates the response, then voice synthesis kicks in. Even a small pause at the model layer becomes noticeable silence on the call, so Parloa works with OpenAI to optimize not just response quality but also speed, robustness, and instruction adherence.
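Because the three stages run one after another, their delays add up, and a per-stage timing makes it visible where the silence on the call comes from. The sketch below is illustrative only: the stage functions are trivial placeholders rather than real speech or language models, and the 800 ms budget is an assumed figure, not one reported by Parloa.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(audio: Any, stages: List[Stage]) -> Tuple[Any, Dict[str, float]]:
    """Run stages sequentially and record each stage's wall-clock time in ms."""
    timings: Dict[str, float] = {}
    value = audio
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)
        timings[name] = (time.perf_counter() - start) * 1000.0
    return value, timings


def over_budget(timings: Dict[str, float], budget_ms: float) -> bool:
    """Sequential stages add up, so the total is compared against the budget."""
    return sum(timings.values()) > budget_ms


# Placeholder stages (hypothetical): real ones would call ASR, LLM, and TTS services.
stages: List[Stage] = [
    ("asr", lambda audio: "reset my password"),
    ("llm", lambda text: f"Sure, I can help you {text}."),
    ("tts", lambda text: text.encode("utf-8")),
]

output, timings = run_pipeline(b"\x00\x01", stages)
print(output)
print({name: round(ms, 3) for name, ms in timings.items()})
print(over_budget({"asr": 200.0, "llm": 500.0, "tts": 200.0}, budget_ms=800.0))  # True
```

In the last line, no single stage exceeds the assumed budget, yet the call as a whole does, which is why the total rather than any one stage is the figure to watch.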

  • Speech recognition is checked by word error rate, especially on sensitive data like policy numbers and account identifiers.
  • Speech synthesis is evaluated through blind listening tests to understand how natural the voice sounds to real people.
  • Speech-to-speech models are separately tested for production readiness in terms of latency, accuracy, and cost.
  • Multilingual benchmarks are run across different markets, because enterprise customers need the same reliability globally, not just in one country.
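Word error rate, the first metric in the list above, is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognized text into the reference, divided by the number of words in the reference. A minimal sketch of that standard calculation follows; the function name and the sample policy number are illustrative, and real evaluations usually normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One misheard digit out of six words.
wer = word_error_rate("policy number four seven two nine",
                      "policy number four seven two five")
print(round(wer, 3))  # 0.167
```

A single wrong digit yields a low overall error rate here even though the policy number is now unusable, which suggests why sensitive identifiers get separate attention.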

Today, Parloa's agents handle millions of conversations in retail, travel, and insurance. The company looks beyond phone calls: a single support scenario can start on the phone, continue in chat, and include links or interactive elements as the conversation unfolds. In this approach, channels no longer operate in isolation. For the customer, this should be one seamless dialogue, not a collection of fragmented touchpoints, and that's the model Parloa is building its platform around.

What This Means

Parloa's story shows that the enterprise support market is shifting away from simple IVR trees toward full-fledged AI agent management platforms. The winners here won't be those with the most-hyped model, but those who can validate performance against real scenarios, maintain low latency, and safely integrate with internal business systems.

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.

