AWS launched Lambda MicroVMs, a new serverless compute primitive that runs each user session or AI agent in its own Firecracker virtual machine with hardware-level isolation, snapshot-based rapid launch, and state preservation for up to eight hours. MicroVMs are available today in five regions on ARM64 with up to 16 vCPUs, 32 GB of memory, and 32 GB of disk per instance.
Lambda MicroVMs is a separate resource from Lambda Functions, with its own API surface. It targets a workload pattern Functions was never designed for: long-running, stateful, multi-tenant applications that execute code the developer did not write.
The AWS News Blog frames the problem bluntly:
Over the past few years, a new class of multi-tenant applications has emerged that all share the need to hand each end user their own dedicated execution environment in which to safely run code that the application developer did not write.
Until now, teams building these applications faced a three-way tradeoff. VMs deliver strong isolation but take minutes to start. Containers launch quickly but share a kernel, which requires significant custom hardening for untrusted code. Functions are optimized for event-driven request-response, not long-running interactive sessions that need to retain state. Lambda MicroVMs eliminates the tradeoff by combining the strengths of all three: VM-level isolation, near-instant launch, and stateful execution in a single managed primitive.
The execution model works differently from Lambda Functions. You start by creating a MicroVM Image: upload a Dockerfile and code artifact to S3, and Lambda runs the Dockerfile, initializes the application, and snapshots the running memory and disk state via Firecracker. Every subsequent MicroVM launched from that image resumes from the pre-initialized snapshot rather than booting cold. Call run-microvm, pass the image ARN and an idle policy, and the service returns a dedicated HTTPS endpoint with the application already running. No load balancers, no networking setup, no infrastructure to manage.
The suspend/resume lifecycle is what makes this practical for interactive use cases. When a user walks away from a coding session, the MicroVM suspends after a configurable idle window, snapshotting memory and disk. When traffic returns, it resumes with everything intact: installed packages, loaded models, working filesets. DevelopersIO tested the full lifecycle with a Flask app in the Tokyo region and confirmed that suspend and resume preserved application state seamlessly. From the client side, the pause never happened.
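The idle-suspend behavior described above can be pictured with a small toy model. This is a sketch for illustration only, not the AWS API: the class ToyMicroVM, its fields, and the 300-second idle window are all hypothetical, and a plain dictionary stands in for the memory and disk state that a real snapshot would capture.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"


@dataclass
class ToyMicroVM:
    """Toy model of an idle-suspend lifecycle (hypothetical, not the AWS API)."""
    idle_timeout_s: float
    state: State = State.RUNNING
    last_request_at: float = 0.0
    memory: dict = field(default_factory=dict)  # stands in for snapshotted state

    def tick(self, now: float) -> None:
        """Suspend once the idle window has elapsed with no traffic."""
        idle_for = now - self.last_request_at
        if self.state is State.RUNNING and idle_for >= self.idle_timeout_s:
            self.state = State.SUSPENDED  # state is kept, as a snapshot would keep it

    def handle_request(self, now: float, key: str, value=None):
        """Resume on incoming traffic if needed, then read or write state."""
        if self.state is State.SUSPENDED:
            self.state = State.RUNNING
        self.last_request_at = now
        if value is not None:
            self.memory[key] = value
        return self.memory.get(key)


vm = ToyMicroVM(idle_timeout_s=300)
vm.handle_request(now=0, key="model", value="loaded")   # user is active
vm.tick(now=301)                                         # user walked away
state_after_idle = vm.state
restored = vm.handle_request(now=900, key="model")       # traffic returns
```

In this model the client that returns at t=900 reads back the value written at t=0, which mirrors the claim that the pause is invisible from the client side.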
The isolation guarantee is Firecracker's, the same lightweight VMM that powers over 15 trillion monthly Lambda Function invocations. Each MicroVM runs in its own dedicated VM with no shared kernel and no shared resources between sessions. Code that escapes a container inside one session is still confined by the VM boundary, so it cannot reach another session or the host through a shared kernel. For teams running AI-generated code at scale, where millions of executions per day come from models that cannot be audited, this is a materially stronger boundary than container-level isolation.
The cross-hyperscaler comparison is now complete for teams evaluating where to run agent-generated code. Cloudflare Sandboxes use container-based isolation distributed across their edge network, with V8 isolates for lighter workloads. Google's GKE Agent Sandbox uses gVisor kernel interception as a Kubernetes-native primitive. Azure Container Apps dynamic sessions use Hyper-V microVMs with hardware-level isolation. AWS Lambda MicroVMs use Firecracker with snapshot-based launch. Each makes a different tradeoff: Cloudflare optimizes for edge latency and global distribution, Google for Kubernetes-native portability, Azure for integration with its Logic Apps and Foundry agent platforms, and AWS for stateful isolation with suspend/resume lifecycle control.
On Reddit, practitioners balanced the launch excitement with pragmatism. One commenter challenged the framing that MicroVMs unlock something fundamentally new:
Lambda MicroVMs don't really enable entirely new workloads that were impossible before. What they change is the cost/performance/security tradeoff. The biggest use cases are untrusted code execution, multi-tenant SaaS, AI agents, and highly isolated serverless workloads where containers weren't considered secure enough but full VMs were too heavy. The unlock is VM-level isolation at near-serverless scale.
Networking supports inbound HTTPS on configurable ports with HTTP/2, gRPC, and WebSocket protocols. Outbound access is configurable for public internet or VPC connectivity. Authentication uses service-provided JWE tokens attached to requests via the X-aws-proxy-auth header.
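As a rough sketch of what attaching the token might look like from a client, the snippet below builds a request with Python's standard library. The header name comes from the description above. Everything else is an assumption: the endpoint URL and token are placeholders, the helper name build_request is hypothetical, and the source does not say whether the token is sent bare or with a scheme prefix, so it is passed through unchanged.

```python
import urllib.request


def build_request(endpoint_url: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying the service-provided token (hypothetical helper)."""
    req = urllib.request.Request(endpoint_url, method="GET")
    # Header name as reported; the value format is an assumption.
    req.add_header("X-aws-proxy-auth", token)
    return req


# Placeholder values; a real endpoint and token would come from the service.
req = build_request("https://example.invalid/health", "PLACEHOLDER_JWE_TOKEN")
```

The request is only constructed here, not sent. Passing it to urllib.request.urlopen would perform the call against a real endpoint.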
Lambda MicroVMs complement rather than replace Lambda Functions. An application using Functions for its event-driven backbone can call into MicroVMs for the steps that need to run untrusted code in isolation. The two share the Lambda console and inherit the same operational model (CloudWatch logs, IAM roles, VPC integration) but serve fundamentally different workload patterns.
Lambda MicroVMs are currently available in US East (N. Virginia, Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo). Pricing follows a baseline-plus-burst model: users pay for baseline compute while the MicroVM is running, and pay for additional resources only when the workload exceeds that baseline. Suspended MicroVMs drop to idle cost while preserving state. Another Reddit commenter ran the numbers and flagged the cost premium:
Min setup of 1 vCPU + 2 GB RAM will run you $3.03/day. This is 9x+ Fargate spot pricing.
The premium buys VM-level isolation and stateful suspend/resume, but teams should model their idle-to-active ratios carefully before committing.
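One way to model the idle-to-active ratio is a small calculation built on the commenter's figure. This is a back-of-the-envelope sketch, not official pricing: the $3.03/day rate and the 9x multiple are taken from the quoted comment, while the function monthly_cost, the 30-day month, and the idle_cost_ratio parameter (the fraction of the running rate billed while suspended) are assumptions to be replaced with real numbers.

```python
DAILY_COST_USD = 3.03   # quoted figure: 1 vCPU + 2 GB RAM, running all day
FARGATE_MULTIPLE = 9    # "9x+" Fargate spot, per the same comment


def monthly_cost(active_fraction: float, idle_cost_ratio: float, days: int = 30) -> float:
    """Estimate monthly cost for a MicroVM that is active only part of the time.

    active_fraction: share of each day spent running (0.0 to 1.0).
    idle_cost_ratio: assumed suspended rate as a fraction of the running rate.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    if not 0.0 <= idle_cost_ratio <= 1.0:
        raise ValueError("idle_cost_ratio must be between 0 and 1")
    blended = active_fraction + (1.0 - active_fraction) * idle_cost_ratio
    return round(DAILY_COST_USD * blended * days, 2)


always_on = monthly_cost(active_fraction=1.0, idle_cost_ratio=0.0)
mostly_idle = monthly_cost(active_fraction=0.25, idle_cost_ratio=0.1)

# Upper bound on the Fargate spot daily rate implied by the "9x+" claim.
implied_fargate_daily_ceiling = round(DAILY_COST_USD / FARGATE_MULTIPLE, 2)
```

Under these assumptions an always-on MicroVM comes to roughly $91 per month, and the gap to Fargate spot narrows only if sessions spend most of their time suspended at a low idle rate.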