LangSmith Releases Sandboxes for Secure Code Execution by Coding Agents
LangSmith launched Sandboxes in general availability: an isolated environment based on kernel-isolated micro-VMs for securely executing code generated by AI agents.

LangSmith, LangChain's platform for developing and debugging LLM applications, announced the general availability of Sandboxes, an isolated execution environment for AI agents that generate and execute code.
How Sandboxes Work
Sandboxes are kernel-isolated micro-virtual machines that separate agent code execution from the main system. Each Sandbox runs in its own virtual machine, which prevents unauthorized access to sensitive data, host resources, and other processes.
The main idea: agents often generate and execute code that you don't fully control. An LLM can write anything, from simple math to code that deletes files, leaks data, or loops forever. Sandboxes address this by giving agents a fenced-off area to experiment in without putting the main system at risk. It's like giving a child a sandbox: they can dig, build, and experiment, but they can't damage the house.
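To make the risk concrete, here is a minimal Python sketch of the general idea of running generated code outside the main process with a time limit. It is not the LangSmith Sandboxes API: the function name run_untrusted and its return format are assumptions made for illustration, and a child process offers far weaker isolation than a micro-VM.

```python
import subprocess
import sys


def run_untrusted(code: str, timeout_s: float = 5.0) -> dict:
    """Run a code string in a separate Python process with a time limit.

    Illustration only: a child process is NOT a micro-VM. It shares the
    host kernel, file system, and network with the caller.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,  # collect stdout/stderr instead of printing them
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return {"status": "timeout", "stdout": "", "stderr": ""}
    status = "ok" if proc.returncode == 0 else "error"
    return {"status": status, "stdout": proc.stdout, "stderr": proc.stderr}


if __name__ == "__main__":
    print(run_untrusted("print(2 + 2)"))                     # simple math finishes normally
    print(run_untrusted("while True: pass", timeout_s=1.0))  # infinite loop is cut off
```

The time limit stops the infinite loop, but nothing here stops the code from deleting files or reading data, which is the gap that VM-level isolation is meant to close.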
What They Can Do
LangSmith Sandboxes offer the following features:
- Snapshots: save the state of the environment at a specific point in time and restore it instantly to re-run or roll back
- Parallel forks: run multiple independent sandbox instances simultaneously for parallel processing or A/B testing of agent logic
- Service URLs: expose web interfaces or API endpoints running inside the Sandbox so that the agent and external services can interact with them
- Auth proxies: manage access, authentication, and authorization for external services called by the agent without exposing real credentials
All these features allow developers to safely run potentially dangerous code without worrying about the stability and security of the production environment. This is especially important for teams using agents to automate critical operations where failure could have serious consequences.
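The snapshot and fork semantics described above can be sketched with a toy in-memory model. The ToySandbox class and its snapshot, restore, and fork methods are hypothetical names invented for this illustration; they are not the LangSmith API and only mimic the behavior on a dictionary of files.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToySandbox:
    """Hypothetical in-memory stand-in for a sandbox; not the LangSmith API."""
    files: Dict[str, str] = field(default_factory=dict)

    def snapshot(self) -> Dict[str, str]:
        # Capture an independent copy of the current state.
        return copy.deepcopy(self.files)

    def restore(self, snap: Dict[str, str]) -> None:
        # Roll back to a previously captured state.
        self.files = copy.deepcopy(snap)

    def fork(self) -> "ToySandbox":
        # Create an independent instance that starts from the current state.
        return ToySandbox(files=copy.deepcopy(self.files))


base = ToySandbox()
base.files["data.csv"] = "a,b\n1,2\n"
snap = base.snapshot()

# Two forks try different strategies without affecting each other or the base.
fork_a, fork_b = base.fork(), base.fork()
fork_a.files["report.txt"] = "variant A"
fork_b.files["report.txt"] = "variant B"

# A destructive step in the base can be rolled back from the snapshot.
base.files.clear()
base.restore(snap)
print(sorted(base.files), fork_a.files["report.txt"], fork_b.files["report.txt"])
```

The point of the model is the independence of copies: each fork diverges on its own, and the snapshot lets the base return to a known state after a bad step.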
Who This Is Useful For
Sandboxes are intended for three main use cases. The first is AI coding agents that write and execute code for data analysis, task automation, or report generation. The second is CI agents integrated into continuous integration and deployment pipelines for automatic testing and deployment. The third is complex data processing pipelines where code is generated dynamically and executed without direct human control.
Example: an agent receives the task "analyze this data and create a chart". It can write a Python script using pandas and matplotlib. LangSmith Sandbox then executes this script safely, without giving it access to the host file system or network unless that access is explicitly allowed. The results are returned in a safe format.
Another example: a CI agent can automatically run tests, deploy, and validate code in an isolated environment, ensuring that no generated code damages production or steals secrets (API keys, passwords, etc.).
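One small piece of the secret-protection idea can be shown in plain Python: launching generated code with an allowlisted environment so that variables such as API keys are never passed down. This is a sketch under stated assumptions, not how LangSmith Sandboxes are implemented. The names SAFE_VARS, GENERATED_CODE, and run_without_secrets are hypothetical, and the API_KEY value is a placeholder.

```python
import os
import subprocess
import sys

# Variables the child process is allowed to inherit (assumed allowlist).
SAFE_VARS = ("PATH", "SYSTEMROOT", "LD_LIBRARY_PATH")

# Code an agent might generate, by mistake or through manipulation.
GENERATED_CODE = 'import os; print(os.environ.get("API_KEY", "not visible"))'


def run_without_secrets(code: str) -> str:
    """Run code in a child process that inherits only allowlisted variables."""
    clean_env = {k: os.environ[k] for k in SAFE_VARS if k in os.environ}
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=clean_env,  # everything outside the allowlist is dropped
        capture_output=True,
        text=True,
        timeout=10,
    )
    return proc.stdout.strip()


os.environ["API_KEY"] = "sk-demo-not-a-real-key"  # placeholder secret for the demo
print(run_without_secrets(GENERATED_CODE))
```

An environment allowlist only hides variables. The child process can still read files and open network connections, so it complements VM-level isolation rather than replacing it.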
Why This Is Needed Now
As AI agents become more autonomous and take on more responsibility, the risk of unforeseen behavior increases. LLMs sometimes "hallucinate" and generate incorrect code. For example, a model might accidentally write code that tries to read environment variables containing secrets or calls the wrong service. Concerns about the security of running LLM-generated code are therefore well justified.
Security becomes critical as AI agents move from labs and hackathons to production systems that serve real users and handle real data. Running potentially hostile or unpredictable code is one of the main challenges in deploying autonomous systems. LangSmith Sandboxes are a sign that the ecosystem of AI development tools is maturing and preparing for enterprise security requirements.