Key Takeaways
- Latency is the biggest reason that a general purpose database isn’t a good fit for edge architectures.
- CRDTs, or conflict-free replicated data types, are a relatively novel state primitive that gives us a way to skirt a lot of the complexity around consistency.
- Similar to how CRDTs eliminate entire classes of complexity related to fault handling, using synchronous remote procedure calls for synchronization eliminates entire classes of complexity related to queueing theory.
- Typical RDBMS systems and common distributed systems techniques are incredibly useful and productive up to a certain scale, and after that, the complexity becomes unjustifiable.
At this year’s QCon London, Fastly’s Peter Bourgon gave a well-received talk about the challenges of state management in distributed systems. Specifically, he talked about an architecture and communication model for a global-scale edge platform.
It was a talk that addressed a wide range of topics. Is there a central source of truth for the data? How should data be synchronized across the system? What’s the right data structure to store state? InfoQ reached out to Peter to further explore a handful of areas that we think are interesting to our readers.
InfoQ: Do you think "edge" is the next widely adopted architecture for modern systems? Or is it an important, but small niche suitable for a specific subset of systems? And if so, which ones?
Bourgon: I don’t think that “edge” is an architecture in itself, but rather it’s a component of an architecture. Users expect applications to be fast, and the speed of light imposes some unavoidable physical constraints on how we can meet those expectations. Past a certain point, the only way to decrease latency and improve experiences is to move your application, in whole or in part, physically closer to your users. And edge platforms are the way to do that.
It’s true that not all systems will take the same value from extending their state and logic out of the datacenter. There’s definitely some re-thinking, re-factoring, and re-architecting work involved, and that’s always an engineering decision of costs and benefits. But, looking to the future, I think that the role of the edge in system design is going to get bigger and more important.
InfoQ: You say in your talk that a "general purpose database" isn't a fit for stateful edge systems. Can you explain a bit why someone wouldn't want a single centralized database for an edge system? Is it entirely about latency?
Bourgon: I think latency is the biggest reason. If you’re extending your system out of the datacenter and toward the edge, at least part of the reason is presumably that the round-trip costs are too high otherwise. Users experience those costs with static assets, with application logic, and also with state, so if your transactions always go back to your origin, you’re not taking full advantage of the architecture.
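As a rough back-of-the-envelope illustration of those round-trip costs, the sketch below computes the physical floor on round-trip time imposed by the speed of light in fibre. The distances and the roughly 200 km/ms fibre figure are approximations chosen for illustration, not measurements of any particular network.

```go
package main

import "fmt"

func main() {
	// Speed of light in fibre is roughly 200,000 km/s, i.e. about 200 km per ms.
	const fibreKmPerMs = 200.0

	routes := []struct {
		name string
		km   float64 // rough one-way distance, for illustration only
	}{
		{"Sydney to nearby edge POP", 50},
		{"Sydney to US East origin DB", 16000},
	}

	for _, r := range routes {
		// Physical floor on round-trip time; real RTTs are higher due to
		// indirect routing, queueing, and processing.
		rttMs := 2 * r.km / fibreKmPerMs
		fmt.Printf("%-28s >= %5.1f ms round trip\n", r.name, rttMs)
	}
}
```

Real round trips are higher still, since packets rarely follow the great-circle path and spend time in routers and queues along the way.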
But there’s another part, too, which is related to consistency. With a centralized database, it’s relatively easy to express transactions against a single, logical, coherent global “truth”. Because all the parts of the system aren’t separated by very much distance, you can perform the necessary communication for that transaction quickly, and stay within latency budgets. But if you balkanize your state all over the world, making a traditional transaction becomes cost-prohibitive. You have to allow users to manipulate state locally, without establishing global consensus, and this means opting in to non-traditional, typically eventually consistent, data systems.
InfoQ: In your talk, you go into some depth on conflict-free replicated data types (CRDTs). Can you explain more about these, why they make sense for a stateful edge architecture, and how/what data is stored?
Bourgon: Arguably the hardest part of distributed systems is dealing with faults. Computers are ephemeral, networks are unreliable, topologies change — the fallacies of distributed computing are well-known, and accommodating them tends to dominate the engineering effort of successful systems. And if your system is managing state, things get much more difficult: maintaining a useful consistency model for users requires extremely careful coordination, with stronger consistency typically demanding commensurate effort. This inevitably corresponds to more bugs and less reliability.
CRDTs, or conflict-free replicated data types, are a relatively novel state primitive that gives us a way to skirt a lot of this complexity. I think of them as carefully constructed data types, each combined with a specific set of operations. Over-simplifying, if you make sure the operations are associative, commutative, and idempotent, then CRDTs allow you to apply them in any order, including with duplicates, and get the same, deterministic results at the end. Said another way, CRDTs have built-in conflict resolution, so you don’t have to do that messy work in your application. Formally, they exhibit something called strong eventual consistency.
This property, by itself, means any system built with CRDTs can have absolutely trivial fault management. Try to send the operations, in any order and without any coordination, to the nodes that need them. If there’s a problem, just try again later. That’s it. As long as the operations eventually get where they need to go, the system is guaranteed to be correct. By choosing a smarter state primitive, we can build a much simpler and more reliable system.
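To make those properties concrete, here is a minimal sketch, in Go, of a grow-only set (G-Set), one of the simplest CRDTs. The type and method names are illustrative, not taken from the talk or any particular library; the point is that merge is plain set union, which is associative, commutative, and idempotent, so replicas that accept writes independently converge no matter the order, or how many times, their states are exchanged.

```go
// A minimal sketch of a grow-only set (G-Set), one of the simplest CRDTs.
// The names (GSet, Add, Merge) are illustrative, not from any particular
// library.
package main

import "fmt"

// GSet is a grow-only set of strings. Elements can only be added.
type GSet struct {
	elems map[string]struct{}
}

func NewGSet() *GSet {
	return &GSet{elems: make(map[string]struct{})}
}

// Add records an element locally, with no coordination with other replicas.
func (s *GSet) Add(e string) { s.elems[e] = struct{}{} }

// Merge folds another replica's state into this one. Set union is
// associative, commutative, and idempotent, which is what makes this a CRDT.
func (s *GSet) Merge(other *GSet) {
	for e := range other.elems {
		s.elems[e] = struct{}{}
	}
}

func main() {
	// Two replicas accept writes independently, e.g. at different edge POPs.
	a, b := NewGSet(), NewGSet()
	a.Add("x")
	b.Add("y")

	// Exchange state in either order, even repeatedly: both converge.
	a.Merge(b)
	b.Merge(a)
	a.Merge(b) // duplicate delivery is harmless (idempotent)

	fmt.Println(len(a.elems) == len(b.elems)) // true
}
```

The same shape, some local state plus a merge that behaves like set union, underlies richer CRDTs such as counters, registers, and sequence types.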
CRDTs enable a lot of cool use cases, like offline-first document editing that can sync up automatically. In the context of edge state, they let us keep our transactions local to the point of presence, to meet our latency requirements, while still being able to share state globally.
Of course these benefits don’t come for free. For one thing, it’s not always easy, or even obvious, how to model your data as CRDTs. Simple-seeming operations like delete can have fiendishly complex CRDT equivalents. Also, the sometimes messy details of multiple parallel versions of state tend to surface in the APIs of these systems, and applications have to adapt to deal with them, which isn’t always easy. Most significantly, a single byte of usable, logical state in a CRDT requires many more bytes of actual memory, which can quickly render the economics of these systems infeasible.
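To illustrate why delete in particular is tricky, here is a hedged sketch of a two-phase set (2P-Set), one common way to add removal to a set CRDT; again, the names are illustrative rather than from any specific library. Removal is modeled as a second grow-only set of tombstones, so a removed element can never be re-added, and every deletion costs memory forever, which hints at the per-byte overhead described above.

```go
// An illustrative sketch of a two-phase set (2P-Set): delete is implemented
// as a second grow-only set of tombstones. Names are hypothetical.
package main

import "fmt"

type TwoPhaseSet struct {
	added   map[string]struct{} // everything ever added
	removed map[string]struct{} // tombstones: everything ever removed
}

func NewTwoPhaseSet() *TwoPhaseSet {
	return &TwoPhaseSet{
		added:   make(map[string]struct{}),
		removed: make(map[string]struct{}),
	}
}

func (s *TwoPhaseSet) Add(e string)    { s.added[e] = struct{}{} }
func (s *TwoPhaseSet) Remove(e string) { s.removed[e] = struct{}{} } // tombstone, never freed

// Contains reports an element as present only if it was added and never removed.
func (s *TwoPhaseSet) Contains(e string) bool {
	_, a := s.added[e]
	_, r := s.removed[e]
	return a && !r
}

// Merge is still just set union on both components, so it stays
// associative, commutative, and idempotent.
func (s *TwoPhaseSet) Merge(o *TwoPhaseSet) {
	for e := range o.added {
		s.added[e] = struct{}{}
	}
	for e := range o.removed {
		s.removed[e] = struct{}{}
	}
}

func main() {
	s := NewTwoPhaseSet()
	s.Add("doc-1")
	s.Remove("doc-1")
	s.Add("doc-1") // has no visible effect: the tombstone wins forever
	fmt.Println(s.Contains("doc-1")) // false
}
```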
As with any technology put to productive use, CRDTs require engineering compromises, trade-offs, and optimizations to be viable.
InfoQ: You made the argument that synchronous calls for state synchronization might be a better option than an asynchronous, event-driven approach. Why is that?
Bourgon: This was sort of a minor point, but an important one, related to the implementation of distributed systems. It’s essentially impossible to build confidence in the safety and reliability of large-scale systems like this without being able to run the system under deterministic test, or simulation. And it’s essentially impossible to simulate a system unless each component can be modeled as a plain, deterministic state machine. It’s certainly possible to get these properties using asynchronous events as the messaging pattern, but experience has taught me that it’s significantly easier with a synchronous RPC-style approach. Similar to how CRDTs eliminate entire classes of complexity related to fault handling, synchronous RPCs eliminate entire classes of complexity related to queueing theory. They get you automatic backpressure, they let you take advantage of the queues that already exist at various layers of the operating system and network stack, and, importantly, they make it a lot easier to build deterministic components.
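A minimal sketch of the “plain, deterministic state machine” idea follows, assuming a toy key-value component; the types and names are illustrative and not drawn from any Fastly system. The component’s only entry point is a synchronous Apply call, so a test or simulator can drive it with a scripted sequence of requests and get identical results on every run, and backpressure falls out naturally because a caller simply waits for Apply to return.

```go
// An illustrative sketch of a component modeled as a deterministic state
// machine behind a synchronous, request/response entry point. All names
// here are assumptions for illustration, not from any specific system.
package main

import "fmt"

type Request struct {
	Key, Value string // Value == "" means "read"
}

type Response struct {
	Value string
	OK    bool
}

// Store is a plain state machine: no goroutines, no timers, no hidden I/O.
// Callers (including a simulator) decide ordering and retries, and
// backpressure is simply the caller waiting for Apply to return.
type Store struct {
	data map[string]string
}

func NewStore() *Store { return &Store{data: make(map[string]string)} }

// Apply handles one request synchronously; same inputs, same outputs, always.
func (s *Store) Apply(req Request) Response {
	if req.Value != "" {
		s.data[req.Key] = req.Value
		return Response{Value: req.Value, OK: true}
	}
	v, ok := s.data[req.Key]
	return Response{Value: v, OK: ok}
}

func main() {
	// A "simulation" is just a scripted sequence of synchronous calls.
	s := NewStore()
	script := []Request{
		{Key: "k", Value: "v1"},
		{Key: "k"},
		{Key: "k", Value: "v2"},
		{Key: "k"},
	}
	for _, req := range script {
		fmt.Printf("%+v -> %+v\n", req, s.Apply(req))
	}
}
```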
InfoQ: You concluded your talk by saying that "consensus rounds, or leader election, or distributed locks, or distributed transactions" are dead ends, and that large scale systems are going to use simple communication, and be eventually consistent. Tell us more about why you think that's the case.
Bourgon: I think the whole of human technological achievement has been an exercise in creating, reifying, and extending abstractions. The human capacity for understanding and managing complexity is essentially fixed; in order to make more and greater things possible, we have to use abstractions to “wall off” domains of complexity behind simpler models that can be more easily understood and built upon. For example, application developers today don’t need to know how their network interfaces translate binary data to electrical signals over copper, or Ethernet frames to binary, or datagrams to Ethernet frames, and so on — the OSI model’s abstractions enable the developer to spend their complexity budget on a much higher level. It’s basically stood the test of time, so I think it’s a pretty good set of abstractions.
Developers have also relied on the abstraction of a single, global truth in their data layer since essentially the very first database. It’s understandable and productive, so that makes sense. But I think it’s a bit like the Newtonian model for physics: it works, until it doesn’t. At really large scale, Newtonian physics no longer predicts real-world behavior, so we have to switch to the more complex relativistic model. Similarly, when we start extending our data layer across large physical distances, I believe the abstraction of a single, global truth begins to leak. An incredible amount of engineering effort is required to maintain the illusion of atomicity, using techniques like the ones I listed. At some point, the developers working in and around this layer of abstraction are going to exhaust their complexity budget.
Typical RDBMS systems and common distributed systems techniques are incredibly useful and productive up to a certain scale. But past that scale, the complexity required to prop up the illusion of global atomicity becomes unreliable, unproductive, and ultimately unjustifiable. While there’s an upfront cost to thinking about multiple parallel universes of state in your application, I believe biting that bullet offsets orders of magnitude more hidden complexity in the alternate, leaky abstraction. And I believe that, eventually, we’re going to realize it’s the only way to build reliable, large-scale distributed systems.
For what it’s worth, this isn’t exactly novel thinking. The natural world is full of incredibly complex systems whose behaviors are emergent from simple primitives and rules. My favorite example is probably how groups of fireflies manage to synchronize their lights — no leader election or consensus rounds involved.
About the Interviewee
Peter Bourgon is currently leading research and development on a global infrastructure for state at the edge at Fastly, a CDN and edge cloud platform. He is the author of Go kit, the preeminent toolkit for microservices in Go; and several large-scale coordination-avoiding distributed systems, including Roshi (stream index) and OK Log (log aggregation).