Developer platform Unkey has written about rebuilding its API authentication service from the ground up, moving from serverless Cloudflare Workers to stateful Go servers after re-evaluating the constraints of its serverless architecture. The company states that the move resulted in a sixfold performance improvement and eliminated the workarounds that had come to dominate its engineering efforts.
Unkey's co-founder Andreas Thomas said the decision came down to latency: when a service sits in the request path for thousands of applications, "every millisecond matters." Unkey traced the root problem to caching. Cloudflare's cache was taking more than 30 milliseconds at the 99th percentile, far short of the company's goal of responding in under 10 milliseconds total.

Serverless functions are stateless by design: they spin up, handle a request, and disappear. That means any cached data must reside elsewhere and be retrieved over the network. Unkey attempted various mitigations, including building a multi-tier caching system utilising various Cloudflare services. The team optimised cache keys and tuned expiry times, but nothing got around the basic physics of the situation. Thomas put it simply: zero network requests are always faster than one network request. No amount of clever caching could match what a stateful server does by default, which is to keep hot data in memory.
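To make the contrast concrete, here is a minimal sketch of the kind of in-process TTL cache a long-lived Go server gets almost for free. The names and structure are illustrative, not Unkey's actual code: the point is that a hit is a memory read under a read lock, with no network round trip involved.

```go
package cache

import (
	"sync"
	"time"
)

// entry pairs a cached value with its expiry time.
type entry struct {
	value     []byte
	expiresAt time.Time
}

// Cache is a minimal in-process TTL cache. Because the server process
// is long-lived, a hit is a memory read; no network hop is involved.
type Cache struct {
	mu    sync.RWMutex
	items map[string]entry
	ttl   time.Duration
}

func New(ttl time.Duration) *Cache {
	return &Cache{items: make(map[string]entry), ttl: ttl}
}

// Get returns the cached value if present and not yet expired.
func (c *Cache) Get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.items[key]
	if !ok || time.Now().After(e.expiresAt) {
		return nil, false
	}
	return e.value, true
}

// Set stores a value with the cache's fixed TTL.
func (c *Cache) Set(key string, value []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = entry{value: value, expiresAt: time.Now().Add(c.ttl)}
}
```

In a stateless function, the equivalent of every `Get` would be a network call to an external cache, which is precisely the latency floor Unkey could not engineer around.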
Unkey also had to extract event data from its serverless functions for analytics and logging. A common design pattern on a standard server is to batch events in memory and flush them periodically, typically every few seconds. But in serverless, the function might vanish the moment it finishes, so events must be flushed on every invocation.
That led Unkey to build a custom Go proxy called chproxy, designed to buffer analytics events before sending them to its analytics database, ClickHouse, which performs poorly with thousands of small inserts. The team also built a separate pipeline for metrics and logs, routing them through intermediate Cloudflare Workers that parsed and split events before forwarding them on.
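The batching pattern a long-lived process enables is straightforward. The sketch below illustrates the general technique rather than Unkey's actual chproxy code, and the `Event` and `Buffer` types are invented for the example: records accumulate in memory and go out as one bulk write, either when the batch fills or on a timer tick.

```go
package events

import (
	"sync"
	"time"
)

// Event is a placeholder for an analytics record.
type Event struct {
	Name string
	At   time.Time
}

// Buffer accumulates events in memory and hands them to flush in
// batches. This only works because the process outlives any single
// request; a serverless function would lose the buffer on exit.
type Buffer struct {
	mu      sync.Mutex
	pending []Event
	maxSize int
	flush   func([]Event) // e.g. one bulk insert into ClickHouse
}

// New starts a background ticker that flushes every interval.
func New(maxSize int, interval time.Duration, flush func([]Event)) *Buffer {
	b := &Buffer{maxSize: maxSize, flush: flush}
	go func() {
		for range time.Tick(interval) {
			b.Flush()
		}
	}()
	return b
}

// Add queues an event and flushes early if the batch is full.
func (b *Buffer) Add(e Event) {
	b.mu.Lock()
	b.pending = append(b.pending, e)
	full := len(b.pending) >= b.maxSize
	b.mu.Unlock()
	if full {
		b.Flush()
	}
}

// Flush swaps the pending slice out under the lock, then sends the
// whole batch as a single write.
func (b *Buffer) Flush() {
	var batch []Event
	b.mu.Lock()
	batch, b.pending = b.pending, nil
	b.mu.Unlock()
	if len(batch) > 0 {
		b.flush(batch)
	}
}
```

Turning thousands of per-request inserts into a handful of bulk writes is exactly the behaviour ClickHouse favours, and it falls out of the architecture once the process is allowed to live between requests.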
By this point, Unkey found itself using Durable Objects, Logstreams, Queues, Workflows, and several custom stateful services. Thomas said the team spent more time evaluating and integrating new SaaS products than building features; it was solving problems that the serverless architecture had created, not problems its customers had. The new system runs on AWS Fargate with Global Accelerator in front. It still distributes traffic globally, but long-lived Go processes can now hold data in memory and batch events naturally. Thomas explained that the auxiliary services supporting the serverless architecture were no longer needed, so the code became simpler and Unkey's bill decreased.
Zero network requests are always faster than one network request. No amount of stacking external caches could get us around this fundamental limitation.
— Andreas Thomas, Unkey co-founder
Thomas stressed that the Cloudflare Workers service itself was stable, and the platform worked as advertised. The problem was what he called the "complexity tax" of working around statelessness: Unkey had to build and maintain infrastructure that replicated what any stateful application gets for free.
The move also unlocked features that had previously been out of reach. Self-hosting was nearly impossible with Cloudflare Workers: the serverless runtime is open source in theory, but getting it to run locally is non-trivial. With a monolithic Go application, Unkey's customers can spin up the product with a single Docker command, which is not only more straightforward but also flexible enough to meet the strict data residency requirements enterprises might have. Engineers can now run the whole stack on their laptops in seconds, and local development is easier because they no longer have to work around Cloudflare-specific APIs. Unkey plans to launch a deployment platform next year that will let customers run the service wherever they want, adding portability and simplicity to the product's selling points.
The shift comes at a time when other high-profile companies are also questioning whether serverless computing delivers on its promises for demanding production workloads. In a notable migration, Amazon Prime Video consolidated its video quality monitoring from a distributed, serverless system into a single process, reducing its infrastructure costs by more than 90 per cent. That case study caused a stir because it came from Amazon itself, the company that helped popularise serverless computing through AWS Lambda. Prime Video's situation mirrored Unkey's in many ways: a high-volume workload that required tight coupling between components. As the service grew, the pay-per-invocation model stopped making economic sense, and the stateless architecture overcomplicated simple operations.
On LinkedIn, serverless consultant Yan Cui responded to Unkey's announcement by asking when serverless stops helping, positing that an architecture suited to early growth can become a constraint later. In Unkey's case, the company hit limits on caching and latency that were inherent to the Cloudflare Workers platform. Engineer Luca Maraschi was more blunt, suggesting that teams examine their traffic patterns closely rather than assume that edge or serverless functions are the best fit for every workload. Other community commentary has been more sympathetic, with DevOps consultant Max Hayward from Ten10 recalling internal pressure to build a distributed architecture while admitting, in hindsight, that a more straightforward approach would have been better.

In summary, Unkey's story shows that serverless remains a valuable architectural pattern. It works well for workloads that are event-driven or genuinely intermittent: services with spiky traffic that don't need to maintain persistent state can see significant cost savings. However, Unkey's and Amazon Prime Video's experiences suggest that the stateless model may not be the best fit for high-throughput services with strict latency requirements.