OpenAI adds sandbox execution to Agents SDK for control and secure AI agent execution
OpenAI updated Agents SDK and added a native sandbox for isolated execution of agent tasks. Enterprise teams can run workflows with files, shell commands…
AI-processed from AI News; edited by Hamidun News
OpenAI is trying to close one of the most painful gaps in corporate AI agents: how to give a model access to files, code, and the work environment without turning it into an uncontrolled risk for the company. A new update to the Agents SDK adds native sandbox execution — an isolated execution layer where an agent can read and write files, run commands, and work with dependencies without getting direct access to sensitive infrastructure. For teams that have already moved beyond demos and are trying to run agentic workflows in production, this is not a cosmetic update, but an attempt to solve three problems at once: governance, reliability, and cost of long runs.
Until now, developers faced an unpleasant choice. Model-agnostic frameworks gave freedom and allowed not to be tied to a single vendor, but often did not fully unlock the capabilities of frontier models. SDKs from the providers themselves were closer to the model, but did not always provide the necessary transparency over the control loop.
And managed agent APIs simplified deployment, but rigidly limited where exactly the agent works and how it gets access to corporate data. OpenAI is trying to remove this compromise through a model-native harness — a control layer that is better aligned with how models actually work in long multi-step tasks. The updated SDK includes configurable memory, orchestration with sandbox awareness, file system tools in the spirit of Codex, MCP support, AGENTS.
md, shell and apply patch. A practical case study already exists at Oscar Health. The company tested the new infrastructure on a clinical workflow where it needs to parse long medical records, extract correct metadata, and more importantly, correctly identify the boundaries of individual treatment episodes within complex documents.
According to the Oscar team, previous approaches did not provide sufficient reliability for production. The new Agents SDK made such a scenario viable: the system parses patient history faster, and staff get a clearer picture of a specific visit. To integrate with the corporate environment, OpenAI also added the Manifest abstraction.
It describes the agent's workspace: which local files can be mounted, where to write results, and from which storage systems to pull data. AWS S3, Azure Blob Storage, Google Cloud Storage, and Cloudflare R2 are supported. The main focus of the update is security.
OpenAI assumes that any agent that reads external content or executes generated code will eventually encounter prompt injection and data exfiltration attempts. Therefore, the company separates the control harness and compute layer. Credentials and the main control loop remain outside the environment where code created by the model is executed.
This reduces the chance that a malicious instruction inside a document or request can reach keys, internal APIs, or neighboring systems. The problem of reliability in long runs is also addressed separately: if a container crashes on the nineteenth step out of twenty, the run doesn't need to start from zero. The SDK saves state in an external layer, can create snapshots and perform rehydration, and then continue the task from the last checkpoint in a new container.
This architecture simultaneously reduces cloud costs and simplifies scaling: runs can be distributed across multiple sandboxes, subagents can be isolated, and workload can be parallelized. For the market, this is an important shift: OpenAI is promoting not just another SDK for agents, but a more complete infrastructure standard for corporate deployment of agentic systems. If the approach takes hold, the company will strengthen its position not only at the model level, but also at the operational layer, where decisions are made about security, audit, and implementation costs.
The new capabilities are already available through the API under the standard pricing model, first for Python, and TypeScript support is announced for later. The next logical step looks like code mode, subagents, and tighter integration with companies' internal tools — that is, everything needed to turn an agent from an experiment into a managed workflow.
Want to stop reading about AI and start using it?
AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.