
OpenAI introduces GPT-5.4 mini and nano for coding, sub-agents, and large-scale AI workloads


AI-processed from 3DNews AI; edited by Hamidun News
Source: 3DNews AI. Collage: Hamidun News.

OpenAI has released GPT-5.4 mini and GPT-5.4 nano, two compact versions of its flagship model built for fast, high-volume workloads rather than record-breaking reasoning. The company is betting on models that are cheaper and faster yet retain much of the capability of the full-size GPT-5.4.

Strengths of mini

GPT-5.4 mini is the larger of the two new models. According to OpenAI, it improves notably on GPT-5 mini in programming, logic, tool use, and multimodal analysis, and runs more than twice as fast.

On the SWE-Bench Pro benchmark, the model scored 54.4% versus 45.7% for GPT-5 mini. On OSWorld-Verified, which tests the ability to operate interfaces from screenshots, it scored 72.1% versus 42.0% for the previous mini version. These gains matter for more than benchmark charts.

OpenAI explicitly positions GPT-5.4 mini as a working model for tasks where latency is noticeable to the user: code autocomplete and fixes, fast debugging cycles, subagents for auxiliary operations, and systems that read screenshots and interact with interfaces. The idea is simple: not every task needs to go to the most expensive model if a smaller version can handle it almost as well and significantly faster.

Where nano comes in handy

GPT-5.4 nano is the smallest and cheapest model in the new lineup. OpenAI recommends it not as a universal chat engine, but as a utility model for simple yet frequent operations.

These are scenarios where throughput matters more than reasoning depth: parsing document streams, classification, field extraction, result ranking, and support for simple code subtasks. Even nano scored 52.4% on SWE-Bench Pro, ahead of GPT-5 mini's 45.7%.
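The division of labor described above can be pictured as a simple routing table. The following is a minimal sketch, not OpenAI's method: the task categories are taken from the article, while the function name `pick_model` and the model identifier strings are assumptions made for illustration and may not match the actual API names.

```python
# Hypothetical routing table. Task kinds come from the article;
# the model identifier strings below are ASSUMED, not confirmed API names.
NANO_TASKS = {
    "document_parsing",
    "classification",
    "field_extraction",
    "ranking",
    "simple_code_subtask",
}
MINI_TASKS = {
    "code_completion",
    "code_fix",
    "debugging",
    "screenshot_analysis",
    "subagent",
}


def pick_model(task_kind: str) -> str:
    """Return the cheapest model tier the article associates with a task kind."""
    if task_kind in NANO_TASKS:
        return "gpt-5.4-nano"   # high-throughput, shallow-reasoning work
    if task_kind in MINI_TASKS:
        return "gpt-5.4-mini"   # latency-sensitive coding and UI work
    return "gpt-5.4"            # everything else falls back to the full model
```

Unknown task kinds fall through to the full model, so a routing mistake costs money rather than quality.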

Together, mini and nano fit well into an architecture where one large model plans the work, and several small ones execute it in parallel. In Codex, this is exactly the scenario OpenAI is pushing: GPT-5.4 can coordinate the process, while GPT-5.4 mini takes on narrow tasks like searching a code base, reading large files, and processing documentation. This split approach helps keep both latency and budget under control.

  • Fast AI assistants for writing and editing code
  • Subagents for repository search and large file analysis
  • Tools that understand screenshots and manage interfaces
  • Classification, data extraction, and ranking in large-scale pipelines
  • More cost-effective auxiliary task execution without compromising overall system quality
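The planner-plus-subagents pattern described above can be sketched with plain `asyncio`. This is an illustrative skeleton only: `plan` and `run_subagent` are hypothetical stand-ins that make no real model calls, and the model name string is an assumption. The point is the shape of the fan-out, where one planning step produces subtasks that run concurrently.

```python
import asyncio


def plan(goal: str) -> list[str]:
    """Stand-in for the large model's planning step (no real model call)."""
    return [
        f"search the code base for '{goal}'",
        f"read large files related to '{goal}'",
        f"process documentation on '{goal}'",
    ]


async def run_subagent(model: str, subtask: str) -> str:
    """Stand-in for a small-model call; the sleep only simulates latency."""
    await asyncio.sleep(0.01)
    return f"[{model}] done: {subtask}"


async def orchestrate(goal: str) -> list[str]:
    """Plan once, then run all subtasks concurrently on the small model."""
    subtasks = plan(goal)
    # "gpt-5.4-mini" is an assumed identifier used only as a label here.
    jobs = (run_subagent("gpt-5.4-mini", s) for s in subtasks)
    return await asyncio.gather(*jobs)  # results keep the order of subtasks


results = asyncio.run(orchestrate("retry logic"))
for line in results:
    print(line)
```

Because the subtasks run concurrently, total wall-clock time is close to the slowest subtask rather than the sum of all of them, which is where the latency benefit of the split comes from.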

Access and pricing

As of March 17, 2026, GPT-5.4 mini is available in the API, Codex, and ChatGPT. In the API, the model supports text, images, function calling, web search, file search, computer use, and skills.

The context window is 400,000 tokens, and the price is $0.75 per million input tokens and $4.50 per million output tokens.

For services with many parallel requests, this pricing is one of the main selling points. In Codex, mini uses only 30% of the full GPT-5.4 quota, so it can be used for cheap parallel subtasks.

In ChatGPT, the model has a more limited role: for Free and Go users, it's available through Thinking mode, and for others, it serves as a fallback when the GPT-5.4 Thinking limit is reached. GPT-5.4 nano, meanwhile, is only available through the API and costs $0.20 per million input tokens and $1.25 per million output tokens.
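To make the per-token prices concrete, here is a small cost estimator. The prices are the ones quoted in this article; the function name, the dictionary keys, and the example workload (one million requests of 2,000 input and 200 output tokens each) are assumptions chosen purely for illustration.

```python
# (input, output) price in USD per million tokens, as quoted in the article.
# The dictionary keys are illustrative labels, not confirmed API model names.
PRICES_PER_MILLION = {
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}


def batch_cost(model: str, requests: int, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of `requests` calls with the given per-call token counts."""
    in_price, out_price = PRICES_PER_MILLION[model]
    total_in = requests * input_tokens
    total_out = requests * output_tokens
    return (total_in * in_price + total_out * out_price) / 1_000_000


# Hypothetical workload: 1,000,000 requests, 2,000 input + 200 output tokens each.
for model in PRICES_PER_MILLION:
    print(f"{model}: ${batch_cost(model, 1_000_000, 2_000, 200):,.2f}")
# gpt-5.4-mini: $2,400.00
# gpt-5.4-nano: $650.00
```

For this assumed workload, nano comes out at a bit over a quarter of mini's cost. The ratio shifts with the input-to-output mix, since output tokens are priced several times higher than input tokens on both models.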

What this means

OpenAI is increasingly building a lineup not around one "best" model, but around a set of roles: a large model thinks and coordinates, small ones quickly handle routine work. For developers and AI products, this is good news: the cost of agent systems can be reduced without a sharp drop in quality, especially where speed, parallelism, and high query volume matter.
