Run Untrusted AI Agent Code Safely with Azure Container Apps Sandboxes

Microsoft has announced the public preview of Azure Container Apps Sandboxes. The new ARM resource type, Microsoft.App/SandboxGroups, runs untrusted agent-generated code in hardware-isolated environments. Each sandbox starts from an OCI disk image in less than a second. The service can scale to thousands of instances at once and incurs no cost when idle. This billing model suits the short, bursty tasks typical of agentic workloads.

The risk is not theoretical. When an LLM generates code and an agent executes it in-process, the execution surface becomes the attack surface. A Python snippet produced by a planner might look harmless while it fetches a remote URL, reads environment variables, or calls exec(). With nothing but the standard library, that same snippet can steal API keys or load an arbitrary payload. Without a hard boundary between the agent's generated code and the host environment, any sufficiently capable model is one prompt injection away from a postmortem. Teams building multi-tenant platforms, CI/CD automation, or LLM-backed code interpreters have often had to build custom isolation setups, usually by layering container runtimes with restrictive seccomp profiles or by running dedicated Kubernetes clusters with Kata Containers. These solutions need ongoing operational investment.
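To make the in-process risk concrete, here is a minimal sketch of an agent that runs generated code with exec() in its own process. The environment variable name, its value, and the generated snippet are all made up for illustration. The point is only that the snippet sees everything the host process sees.

```python
import os

# Stand-in for a secret held by the host process (hypothetical name and value).
os.environ["DEMO_API_KEY"] = "sk-demo-not-a-real-key"

# Stand-in for a snippet an LLM might emit after a prompt injection.
generated_code = """
import os
leaked = os.environ.get("DEMO_API_KEY")
"""


def run_in_process(snippet: str) -> dict:
    """Execute a snippet with exec() inside the agent's own process (unsafe)."""
    scope: dict = {}
    # No boundary here: the snippet shares the process, its environment,
    # and the full standard library with the agent.
    exec(snippet, scope)
    return scope


result = run_in_process(generated_code)
print("snippet read the host secret:", result["leaked"] == os.environ["DEMO_API_KEY"])
```

Nothing in this example needs a third-party package, which is why in-process execution is hard to make safe by restricting dependencies alone.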

Each sandbox operates in its own microVM, which is isolated from the host, the platform, and other sandboxes on the same infrastructure. Developers can use any OCI-compliant container image. The service provisions sandboxes from pre-warmed pools, enforces multi-tenant isolation, and manages the whole lifecycle, from startup to teardown. The resource model groups sandboxes into Sandbox Groups. These groups serve as the management and configuration boundary for a set of sandboxes. They are similar to Container Apps Environments but are designed for short-lived workloads. Each Sandbox Group holds shared settings that apply to all sandboxes within it, including network egress policy, managed identity assignment, lifecycle rules, and resource tiers.
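The group-as-configuration-boundary idea can be pictured with a small data model. This is an illustrative sketch only: the class names, field names, and default values below are hypothetical and do not reflect the actual Microsoft.App/SandboxGroups schema. It simply shows the four kinds of shared settings living on the group and being inherited by each sandbox.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SandboxGroupSettings:
    """Hypothetical model of the settings shared by every sandbox in a group."""
    allowed_egress_hosts: Tuple[str, ...] = ()  # empty tuple: no outbound traffic
    managed_identity: Optional[str] = None      # illustrative identity label
    max_lifetime_seconds: int = 3600            # illustrative lifecycle rule
    resource_tier: str = "small"                # illustrative tier label


@dataclass
class SandboxGroup:
    """Hypothetical group: the management boundary for a set of sandboxes."""
    name: str
    settings: SandboxGroupSettings
    sandboxes: List[str] = field(default_factory=list)

    def create_sandbox(self, sandbox_id: str) -> dict:
        """Register a sandbox and return the settings it inherits from the group."""
        self.sandboxes.append(sandbox_id)
        return {"id": sandbox_id, "group": self.name, "settings": self.settings}


group = SandboxGroup("agents-dev", SandboxGroupSettings(allowed_egress_hosts=("pypi.org",)))
first = group.create_sandbox("sbx-1")
second = group.create_sandbox("sbx-2")
print(first["settings"] is second["settings"])  # both sandboxes share one settings object
```

Making the settings object immutable mirrors the idea that policy is set once at the group level rather than per sandbox.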

The isolation model comes with key operational capabilities for production. Snapshot-based suspend and resume preserves the full memory and disk state across sessions. An agent can pause a multi-step investigation, or a development environment with installed packages, scale it to zero, and resume later without re-initialisation.

Network egress defaults to deny. Outbound traffic is permitted only to explicitly allowlisted hosts, and the rule is enforced at the proxy layer inside the sandbox. Both system-assigned and user-assigned Entra managed identities are supported, so sandboxes can authenticate with Azure services without embedding credentials in the image or passing secrets through environment variables.
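The default-deny rule described above can be sketched as a simple host check. This is not the platform's proxy implementation. The function name and the exact-match policy are assumptions chosen to illustrate the principle that anything not explicitly listed is refused.

```python
from typing import AbstractSet
from urllib.parse import urlsplit


def is_egress_allowed(url: str, allowed_hosts: AbstractSet[str]) -> bool:
    """Default-deny check: allow only URLs whose host is explicitly listed."""
    host = urlsplit(url).hostname  # lowercased by urlsplit; None if no host is present
    if host is None:
        return False
    # Exact match only (an assumption of this sketch), so a lookalike such as
    # "management.azure.com.evil.example" does not pass.
    return host in allowed_hosts


allowed = frozenset({"management.azure.com"})
print(is_egress_allowed("https://management.azure.com/subscriptions", allowed))
print(is_egress_allowed("https://evil.example/payload", allowed))
```

With an empty allowlist every request is refused, which matches the deny-by-default posture.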

Microsoft has published an Agent Governance Toolkit, which works with ACA Sandboxes through the agt-sandbox Python package. It adds two layers of in-process enforcement, AST scanning and tool allowlists, both applied before a snippet runs. Egress allowlist enforcement happens separately, at the network boundary within the sandbox. The layers operate independently: a denied call never reaches the execution environment, and an outbound call to a non-allowlisted host fails at the proxy no matter what the in-process policy allows.
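As a rough illustration of what pre-execution checks of this kind can look like, the sketch below uses Python's standard ast module. It does not use or reproduce the agt-sandbox API. The deny lists, tool names, and function names are all hypothetical.

```python
import ast
from typing import List

DENIED_CALLS = {"exec", "eval", "compile", "__import__"}  # illustrative deny list
DENIED_MODULES = {"os", "subprocess", "socket"}           # illustrative deny list
ALLOWED_TOOLS = {"search_docs", "run_tests"}              # hypothetical tool names


def scan_snippet(source: str) -> List[str]:
    """Walk the AST and return a list of policy violations (empty if clean)."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DENIED_CALLS:
                violations.append(f"call to {node.func.id}()")
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in DENIED_MODULES:
                    violations.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in DENIED_MODULES:
                violations.append(f"import from {node.module}")
    return violations


def admit(source: str, tool_name: str) -> bool:
    """Admit a snippet only if the tool is allowlisted and the scan is clean."""
    return tool_name in ALLOWED_TOOLS and not scan_snippet(source)


print(scan_snippet("import os\nexec('1')"))
print(admit("print(1 + 1)", "run_tests"))
```

A static scan like this can miss dynamically constructed calls, which is one reason an independent check at the network boundary is useful as a second layer.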

The announcement names the products that already run on this infrastructure: Cloud Sandboxes in GitHub Copilot, Foundry Hosted Agents, and Azure Container Apps Express. This matters architecturally. Instead of creating a new trust-based abstraction, Microsoft is providing access to the same isolation fabric it uses for its own developer products.

The ACA Sandboxes announcement arrives in a crowded market. In April 2026, Cloudflare launched its Sandboxes product, which offers persistent, isolated Linux environments with active CPU pricing and snapshot-based session recovery. It targets teams already using Cloudflare Workers. E2B is a dedicated sandbox platform based on Firecracker microVMs. It has gained traction among many Fortune 500 companies for agent code execution, and it advertises sub-200ms cold starts and BYOC options for teams needing data residency. Fly.io launched Sprites in January 2026. This product features a persistent-by-default model with Firecracker microVMs and 100GB NVMe storage. It challenges the ephemeral approach, arguing that rebuilding state adds avoidable latency.

What sets ACA Sandboxes apart is Azure-native integration, not just sandbox performance. Teams using Azure get Entra identity, ARM-native resource management, and egress control through their existing tools. They benefit from the same infrastructure that supports GitHub Copilot, all without managing the orchestration layer. For teams outside Azure, or those that need GPU-intensive workloads, want BYOC options, or prefer open-source isolation tools, dedicated providers offer more flexibility.

About the Author

Claudio Masolo
