Cloudstate—towards Stateful Serverless



Sean Walsh discusses the challenges, requirements, and introduces us to Cloudstate - an open source project building the next generation Stateful Serverless and leveraging state models such as Event Sourcing, CQRS, and CRDTs, running on Akka, gRPC, Knative, Kubernetes, and GraalVM, in a polyglot fashion with support for Go, JavaScript, Java, Swift, Scala, Python, Kotlin, and more.


Sean Walsh is Field CTO at Lightbend, and has previously held various leadership roles in the tech industry. He is a distributed systems expert and has applied this expertise in the energy, financial and wellness industries, among others. He pioneered patterns for microservice integration that have since become standards, formalizing them in his book Reactive Application Development.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.


Walsh: I'm Sean Walsh. I'm Field CTO with Lightbend. Previously, in my career, I implemented reactive and distributed systems in energy, health and wellness, and finance. For more than three years I've been Field CTO of Lightbend, where I've helped the global 5000 do the same. I saw a lot of their challenges. I've had my own challenges over the years. I'd like to talk to you today about Cloudstate. Berkeley recently predicted that serverless computing is going to grow and dominate the future of cloud computing.

I'd like to talk about function as a service. It was visionary. It paved the way for some greater things, but it was really just the first step. How many of you have actually used function as a service? That's significant. Everybody else, have you heard of it? It's just the first step. It's important to understand that serverless is not function as a service. They're not equivalent. What are some good use cases for function as a service? Where throughput is key: short time windows of operations, embarrassingly parallel operations, low-traffic, typically stateless web applications (I would still argue that they're not a great fit there because the function is a little bit too granular), orchestration, chaining, cron jobs, triggers, job scheduling, things like that.

It's hard to build general-purpose applications. What do I mean by a general-purpose application? One of the things I've practiced in building applications, and I'm sure many of us have, is domain-driven design and the bounded context. The bounded context dictates that there are these bubbles of functionality in systems, so subsystems. Inside the bubble, things are shared and known about, and that's fine, but outside of it they're not really shared. That's a coarser granularity. I'd like to think about something like function as a service with that granularity.

Functions are stateless and short-lived. It's expensive to lose this context and rehydrate. Your state is never in-memory. It's never in your applications. It's never in the function. It's always in a database, or somewhere like that. It needs to be rehydrated before you can even think or do anything. It's always somewhere else. We've been lacking this co-location of the state and the processing. They're always separate, in some effect, each one is hobbled. There's no direct addressability on these functions. There's no need for it, in function as a service stateless, because they're just anonymous functions without any state. You could spin one up, it goes. It does its thing, whatever that is. Then return some data to the user. There's limited options for coordinated distributed state and modeling consistency guarantees. We're missing a really big piece of a puzzle here. It's really the elephant in the room, which is state.

The Need for Serverless Support

We need serverless support for managing sessions, things like shopping carts, IoT devices. It's not reasonable to expect shopping carts, or IoT devices, or anything that you might model as a digital twin of something in the real world, to take that order-of-magnitude hit, that context loss, of having to connect to something, grab its data, and only then do its thinking and its job. It's really got to be co-located with its data. We want to have low latency serving of machine learning models, real-time stream processing, distributed workflows. Interestingly enough, the technologies behind Cloudstate, as well as Cloudstate itself, are going to allow things like ACID again, so ACID 2.0. We can now think about how to do Saga patterns that aren't really observer patterns from the outside but are integral to the application space, to the applications that are deployed. You can have true rollback and compensating actions and commits. Shared collaborative workspaces, shared blackboards, chat rooms, things where a lot of people are touching the same thing at once and it needs to be very quick. We need to avoid impedance mismatch. The pure domain is rarely in a form that's fit for viewing. I don't know if you're familiar, but CQRS, Command Query Responsibility Segregation, solves that problem. Your domain is something that you design. You carefully think it out. You craft it. Then the UI people will tell you that's not what they need. This solves that.

Technical requirements for a Cloudstate-type endeavor: stateful, long-lived virtual components, so actors. We already have actors. Akka is open source and over 10 years old. It was really easy to lean on its capabilities and abstract upon them. Options for coordination and communication patterns: we have point-to-point, broadcast, pub/sub, using whatever you might like under the covers. It could be Kafka today, could be something else later. CRDTs, which are conflict-free replicated data types, an eventually consistent distributed view on things that coordinate amongst themselves. Sagas, which are transactions. Options for managing the state reliably at scale, ranging from strong consistency to strong eventual consistency to eventual consistency. Intelligent placement of these stateful functions, so you get this co-location of state and processing. We need a clustered solution that removes this limitation of not having the processing live where the data is. Then, as a byproduct of this, and as a requirement, you'll also get predictable throughput, latency, and performance: startup times, storage, communication. It's just a matter of figuring out what the constraints are to allow this. We get useful constraints.


Function as a service is great at abstracting over communications. It's always been great at this. Operational concerns. You don't need to worry about it. You have your little function, gets deployed, messages come in. Thinking is done, logic is done, and the user function has probably retrieved some data from somewhere. Then a message goes out.

CRUD, it's a little bit more difficult. If you'd like to create a framework around it, it becomes a little bit more difficult to reason about what actually is going on here. You've got your message in. You've got your user function. Doing something with a database, who knows what? How big is it? How long is it taking? How many joins are there? Are there multiple databases? It becomes really hard to actually abstract any guarantees around this behavior. It's a leaky abstraction.

The problem is that stateless serverless is a big black box. We don't know what's really going on inside. There's the Wild West with the database, really hard to automate. Hard to have any guarantees to be able to do interesting things like being elastic, scaling up and down, provisioning, all that becomes very difficult when you really don't know what's going on inside those black boxes. There's an interesting quote, this actually comes from a theologian. Somebody found that freedom is not so much the absence of restrictions as finding the right ones, the liberating restrictions that will set us free.

Function as a service abstracts communication for us, and it's got a deployment around it that's usually automated. As we know, we get the message in, do some thinking, and a message comes out. What happens when we do something different? What about if we abstract over state? Stateful serverless does the same exact thing. Messages come in, there's a function, and something happens and messages go out. Also, what's happening is state is coming in, asynchronously at different times than the user requests, and state is going out. What is state here? We need to really think about what that is. It's a great idea. We're not going to pass the entire dataset in. We're not going to pass an entire database in. That's not efficient. That's not going to work.

What Is Cloudstate?

Enter Cloudstate. What is it? A lot of times we build stuff and we talk about what it does. We don't really say what it is. I thought it was important to have this clear sentence to say what it is. Cloudstate is a distributed, clustered, and stateful cloud runtime, providing a zero-ops development experience with polyglot client support, essentially serverless 2.0. You can do it in JavaScript. You can do it in Golang. You can do it in Scala, whatever language you choose. If it's not implemented, you can implement it yourself. It's just gRPC under the covers.

Also, it's an open-source project, Apache 2.0. We haven't talked about reactive applications yet. People are familiar with reactive? A lot of people are familiar. Reactive applications mandate what are the four pillars of a distributed application. They are to be resilient and embrace failure; to be elastic, scaling up and down with need and saving costs when there isn't any need; and to be responsive in the face of load, in the face of user requests. Nobody is willing to wait for a page to load or anything like that anymore. That's all made possible by asynchronous message passing. Those are the four pillars of reactive. Building a reactive application up until now has taken some expertise and either requires help from a consulting company, or Lightbend, or me, or consultants, or something. It's not an easy task. We set out to make this easy. We want everybody to be comfortable building an application to house a billion shopping carts without being concerned that they're not going to fit in memory, or that you're limited to an individual node. We've created the reference implementation for a standard. We created a protocol around it to back the standard. All this is at It's also published in GitHub.

We want you to focus on your business problem, just like function as a service. You're concentrating on the function. You're not worried about how it's being deployed, when it's running, how it gets started up. All that stuff is abstracted away from you. You can just be a developer and leave the heavy lifting to your platform.

Don't worry about the complexities of concurrency in distributed systems, all that's abstracted away for you. No more synchronized. No more locks. No more things like that. The distributed state and the replication, the persistence store behind everything is managed for you. We use Kubernetes very heavily for abstracting over persistence, messaging, pods, all that stuff. We use Kubernetes first, and then we integrate it into what Akka already has. We're using Kubernetes for service meshes, Istio, and any databases it provides. We're polyDB here. We don't care which way we go on databases, and other infrastructure. Message routing, scalability, failover, recovery, all of it is part of the framework. Running and operating your application, quickly instantiating new pods, all of that is the application framework, Cloudstate.

It is polyglot. The very first sample application we built was JavaScript, just to show what it can do. To show what you can actually do in JavaScript. We feel that JavaScript is the opposite of Scala, which is a programming language that Lightbend is behind. It's type-safe. It's functional. It's actually deemed a little bit complex by some. A lot of people are out there doing JavaScript. Java, of course, Go, and then upcoming support for Python, .NET, Rust, Swift, and Scala. The sky is the limit. We're also PolyState. We were using these eight years ago. We were building systems using event sourcing, CRDTs, and also CQRS, because that was the way we had to solve our problems. We had IoT devices in energy in the millions and we needed to control them and read them in real-time. You couldn't do that with a stateless application in Spring, or WebLogic, or anything like that. We needed new ways to do things. They've been embraced as part of Akka, and now they're embraced as part of Cloudstate. polyDB, SQL, NoSQL, if it's supported by Kubernetes, we could support it as well.

We're leveraging Akka fully. Other than the abstractions, there are no capabilities in Cloudstate that don't already exist, tried and tested, in Akka. That is the event sourcing, the CRDTs, CQRS, the actor interactions, the clustering, cluster rebalancing, failure handling. All that stuff is handled in Akka already. We just needed a way to put it together in a really nice, packaged way. gRPC is also offered by Akka. That's our language of choice when you're interacting with Cloudstate, with your applications. Cloudstate's interesting because you've got a clustered application with what we call distributed entities across your cluster. You can have billions of these things. Also, your application code that is consuming it and interfacing with it is in the same cluster, in the same Kubernetes pod and sidecar. Therefore, you have ready access to everything. It's really fast because it's deployed in the same environment. gRPC has very little expense associated with it. We're embracing Knative, for nice Kubernetes abstractions as they come out. GraalVM, because it's really fast to provision new instances of things in Graal. Everything is Kubernetes.

The Cloudstate architecture. We utilize Kubernetes pods. Then what we have is user functions that are deployed on each of these pods. You can create them in your language of choice. They are highly prescriptive in how you do these things. This is when I said that sometimes constraints can be helpful, the right constraints. These user functions, for example, a shopping cart, it's very constrained as to how you build it, but you can build it in whatever language you want. There are certain markups. There are certain functions that you need to actually implement to get it to work. Once you get that working, you could freely interact with it with gRPC.

Cloudstate has an Akka sidecar also spread across these pods. In the sidecar, it's actually hosting actors to represent these user functions. It's also hosting your stateless application interaction framework. The things you write to actually call out with gRPC. You might want to subscribe to changes on something. Then when you get the change, you do something. Maybe you pump data back out to a webpage, using WebSockets, or something like that. The user hits the sidecar. The sidecar connects via gRPC to the actual deployed user functions. Then the resulting data goes into the datastore. It could be read from the datastore, which is shared across the left and the right sides.

The user functions, they're spread across these Kubernetes pods, and so are the Akka sidecar. Your Akka cluster is represented on the left. It's the same old Akka cluster we've had for quite a long time, only we are coexisting with Kubernetes pods here. The HTTP or gRPC, if the user wants to do gRPC, that could do that too. Comes directly into the cluster as a whole. There are locator patterns that will actually say, "Where is this user function running? Where is it? Which pod?" We can actually make that sticky, so it's very performant. That'll be translated to gRPC by your user code that's on the left side. It'll interact with a function and then return some data. Then in between, you're getting typical cluster gossip, routing, replication, rebalancing. All this stuff happens in Akka cluster behind the scenes. Of course, your datastore behind that, abstracted by Kubernetes.

When Cloudstate runs as a managed service, there are opportunities to do pay-as-you-go. Cloudstate is a Kubernetes install. I think it's three or four terminal commands, once you've provisioned Google Cloud, or whatever cloud of your choice. After I've done it a few times, it takes 10 minutes for me to deploy Cloudstate with my application. A lot of this stuff is relatively new. We haven't fully developed everything that we want and everything I'm talking about here. It's something that we're rapidly iterating on.

You get on-demand instance creation, passivation, failover, autoscaling up and down. Previously, I was at Weight Watchers. We did the digital transformation and the system before I joined. We went microservices. When bathing suit season approached, their systems went down. When the holiday season approached, their systems went down. Somebody would painstakingly set up their diet across all the holidays. Then the system wasn't working during the holidays. In the times when Weight Watchers was to be most profitable, their systems were failing. This autoscaling is going to prevent things like that. What we may call in retail, Black Friday scenarios.

Then ZeroOps. I don't know about you, I've had to do Ops, but I really like to code. I like to create my business code. I like to solve business problems. I like to see it working. The energy that we spent previously, getting these things into the cloud, even with Kubernetes, even with Docker was significant, required a lot more people than just the business logic team. ZeroOps to me is really important. I like a framework that is going to make this push button, where it just takes that heavy lifting off of my shoulders. Automation of the message routing and delivery, state management, cluster sharding, the co-location of the data and the processing, replication consistency. Automation of upgrades, provisioning, deployment, canary deployments, things like that. Every single place I go, they're different. It's equally hard.

Akka Cluster State Management

Akka cluster state management. I have a really good picture at the end that shows what a reactive application looks like. This is a little bit of a hint at it. We've got all these Akka sidecars that form an Akka cluster. Gossiping and locating individual, singleton entities among them. It's masterless, decentralized, self-healing. Akka, at its core is actor based. Actors are resilient because that's reactive. Every actor is potentially the parent of another actor that it spawns. The system has an ultimate parent actor. Every actor has what we call a supervisor strategy. There is no try catch. There is no, "I wonder if I'm going to write some error handling today." Your failure and success scenarios are first class in different spots.

We have the Akka sidecars that represent the Akka cluster that are sidecar'ed along with the Kubernetes pods, all with their gossip protocols. Self-healing, I think I talked about resilience at the core of Akka, at the actual level. It's also at the cluster level. When something fails or a node becomes unhealthy, things are moved to healthier nodes. New nodes are provisioned as needed. You get that all with Cloudstate. The user functions are deployed alongside in the same Kubernetes pods as the Akka sidecars. The state sharding, your entities are sharded according to their key. The key is usually arbitrary. You want to shard uniformly. You route based upon entity key. If I'd like to interact with a user function, also called an entity, I need to know its key before I can send it a command or do anything with it. Any gRPC call that is associated with any of these entities has to have that key.
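The key-based routing described above can be sketched in a few lines. This is only an illustration of the idea, not how Akka Cluster Sharding is implemented: the `shard_for` function, the `Router` class, the pod names, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def shard_for(entity_key: str, num_shards: int) -> int:
    """Map an entity key to a shard number with a stable hash."""
    digest = hashlib.sha256(entity_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Router:
    """Hypothetical router: every command carries the entity key."""

    def __init__(self, pods, num_shards=16):
        self.pods = list(pods)
        self.num_shards = num_shards

    def pod_for(self, entity_key):
        # The same key always lands on the same shard, hence the same pod.
        shard = shard_for(entity_key, self.num_shards)
        return self.pods[shard % len(self.pods)]

    def route(self, entity_key, command):
        # Returns the destination together with the keyed command.
        return (self.pod_for(entity_key), entity_key, command)


router = Router(["pod-a", "pod-b", "pod-c"])
destination = router.route("cart-42", "AddItem")
```

Because the hash is stable, repeated commands for `cart-42` reach the same pod, which is what lets the entity's state stay in memory next to the code that processes it. A real cluster also has to move shards when pods come and go, which this sketch leaves out.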

You pass in your key and your command. It's possibly forwarded or it's routed to the correct entity to handle it. Of course, we have the co-location of state and processing backed by the event log. Your database isn't really a database as you understood it in the past. It's an event log. This is one of our prescriptions, or constraints. We are using event sourcing. I know almost everyone must have at least heard about event sourcing if you're not using it. In the age of analytics, it became really important before we even used it for what we thought we would, which is just this immutable event log around the domain instead of CRUD. Once you go for events, events are really a friendly way of being able to hydrate state. They have a good granularity. They're ordered. They can be summarized by snapshots as an optimization. If you've got so many events that hydrating would take too long, you can start with a snapshot of your current state and then overlay the later events on top of it. That's one of the major constraints I think that makes this possible, events.
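To make the snapshot optimization concrete, here is a minimal sketch of hydrating a shopping cart from an event log, with and without a snapshot. The event names, the `CartState` type, and the `hydrate` function are invented for the illustration and are not Cloudstate's API.

```python
from dataclasses import dataclass, field


@dataclass
class CartState:
    items: dict = field(default_factory=dict)  # product id -> quantity


def apply_event(state, event):
    """Pure function: previous state plus one event gives the next state."""
    kind, product, qty = event
    items = dict(state.items)
    if kind == "ItemAdded":
        items[product] = items.get(product, 0) + qty
    elif kind == "ItemRemoved":
        items.pop(product, None)
    return CartState(items)


def hydrate(events, snapshot=None, snapshot_seq=0):
    """Rebuild state, starting from a snapshot taken after snapshot_seq events."""
    state = snapshot if snapshot is not None else CartState()
    for event in events[snapshot_seq:]:
        state = apply_event(state, event)
    return state


log = [
    ("ItemAdded", "apple", 2),
    ("ItemAdded", "pear", 1),
    ("ItemAdded", "apple", 3),
    ("ItemRemoved", "pear", 0),
]

full = hydrate(log)                      # replay everything
snapshot = hydrate(log[:2])              # state after the first two events
from_snapshot = hydrate(log, snapshot=snapshot, snapshot_seq=2)
```

Both paths end in the same state. The snapshot only shortens the replay; the event log remains the source of truth.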

Then you get to this automatic failover, rebalancing, rehydration. This is the unhappy path. What happens, the whole node failed? It's unhealthy. There were user functions being hosted on that pod. What do we do? Akka already knows how to do this. It shifts into a healthy Kubernetes pod. It is rehydrated from the event log and/or the snapshot store. Then you can now return data back to the user.

Cloudstate uses better models. We're reactive. You can't create a platform like this unless you're reactive. I'm not saying that reactive is semantically the only way to build systems. I'm saying, tell me that resilience, and elasticity, and responsiveness are not a good way to build a system. We don't know how we'd do it otherwise. These models are battle-tested yet constrained. Event sourcing: we've used it for a long time. We knew it was an answer. CRDTs, which are a way of getting distributed state that's highly accessible, highly available, yet coordinates without a central hub. They all coordinate together. There's a lot of logic that was already dictated. I think it was a 2011 paper by a guy named something Shapiro that dictated, what are the types of CRDTs? We just implemented the types. The devil is in the details.

CQRS, Command Query Responsibility Segregation

Then CQRS, soon. I'm pushing really hard for getting this as soon as possible. If you're building an application and you're only using events and CRDTs, it's a little bit imbalanced. The read side is so important, just as important as your domain is, the projections of that domain. Again, Command Query Responsibility Segregation. It's simply separating your read concerns from your write concerns of your system. Your command side is covered by the event sourcing. A command is sent into the domain, "Do this. Assign an internet device to a room." You've determined the logic. The user function says, "Can I do it? Should I do it?" An event goes out that says, room assigned, and it returns the data back. I like to say state is in the eye of the beholder. The state that's contained inside your user function, which is really your domain-driven design entity, that's not necessarily the state everybody is interested in. Different users or use cases have different needs for state.

A really clean example is order to cash. You've got customers' orders, inventory, receivables. You might have a dashboard that management looks at that looks across all these things. If you're using one microservice that contains orders, you're not going to get the complete picture. If you put the onus on the order service to give you back this data, you're tying these systems together. You're coupling these systems. I've seen it done. It's really painful. If you separate them completely, you could use CQRS. There are semantics around consuming events and asynchronously updating what we call read projections. That's coming soon, with my weight behind it.
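As a rough sketch of what such a read projection might look like, the following consumes events from the order, inventory, and receivables services and keeps a dashboard view up to date. The event shapes and the `DashboardProjection` class are assumptions made for this example; the talk does not show the Cloudstate CQRS API, which was still upcoming.

```python
from collections import defaultdict


class DashboardProjection:
    """Hypothetical read model fed by events from several services."""

    def __init__(self):
        self.open_orders = 0
        self.units_in_stock = defaultdict(int)
        self.receivable_cents = 0

    def handle(self, event):
        kind = event["type"]
        if kind == "StockReceived":        # from the inventory service
            self.units_in_stock[event["sku"]] += event["qty"]
        elif kind == "OrderPlaced":        # from the order service
            self.open_orders += 1
            self.receivable_cents += event["amount_cents"]
        elif kind == "OrderShipped":       # order leaves the warehouse
            self.open_orders -= 1
            self.units_in_stock[event["sku"]] -= event["qty"]
        elif kind == "PaymentReceived":    # from the receivables service
            self.receivable_cents -= event["amount_cents"]


events = [
    {"type": "StockReceived", "sku": "widget", "qty": 10},
    {"type": "OrderPlaced", "order": "o-1", "amount_cents": 3000},
    {"type": "OrderShipped", "order": "o-1", "sku": "widget", "qty": 3},
    {"type": "PaymentReceived", "order": "o-1", "amount_cents": 1000},
]

dashboard = DashboardProjection()
for event in events:
    dashboard.handle(event)
```

The dashboard never queries the order service. It only listens to events, so the write side and the read side can evolve and scale independently.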

Event Sourcing

Event sourced entities, this is the happy path. This is how they usually behave. You have your user that sends a command into the domain. It's bounded by a mailbox. Every Akka actor is bounded by a mailbox. That's how you get no concurrency concerns. Because you're only ever looking at one thing at a time, updating your state fully, and then you're ready for the next thing. The command is now thought about, computed on by the entity. Then an event goes out to the event log, which is conditionally sent to an event bus and shared to the ether. You don't need to know who your consumers are. In most cases, you shouldn't know. You're completely decoupled. This allows systems to interact with each other via events, not some construct that we as developers invented, to try to communicate across services. Events exist whether you're harnessing them or not. They exist whether we're computing or not. Events have always been there. It's fully ok to use them across services. That's your happy path.

Unhappy path, we've got a problem. How do we recover? It's actually pretty simple. You have your event log. You replay your events. You bring up your current state. You're rehydrated. You're ready for business. Command comes in, you do your thing.

The benefits of event sourcing. It's a single source of truth. It's not somebody's idea of a source of truth. It's not many ideas for source of truth. It's the source of truth for a certain domain. It's got durable in-memory state, which allows for a memory image. Your state is built up from these events. The state that's built up in one place from the events could look very different from the state built up in another place. An example I use is flight. Let's just say we have an airliner. You've got ground control. You've got flight control. Flight control cares about a great many things different from ground control, which is just interested in a winged vehicle on the ground: let's get everything out of the way. Flight control needs to know about weight, heading, altitude, all the other stuff in flight. They're still the same flight though.

Avoids object-relational mismatch. This really is in tandem with CQRS. You get your command side and your read side fully separate. You won't send a command to me on a customer and expect me to send you back things I don't even have. You get to subscribe to these state changes. If you're interested in the state, in my entity, in my user function, you can have the state. You can also subscribe to the state changes via events, because it's events in to hydrate, events out. Your state changes are also represented via those events. Single writer principle: you could use databases optimized for writes. You could use networking that is optimized for writes. It has great mechanical sympathy.

Event sourcing deployment, you have your user function entity. You get your event log in asynchronously. You get your command in, which is the user interaction, and the reply out. Then the event goes out to the event store. Usually, when you do a state mutation, the event happens first. As a side effect, you update your state. That's because if you have a failure, and you haven't fully written out your event and you've updated your state, that's a problem. We really want to make sure that state is really the record of something happening. Make sure that happens first.
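The ordering argument above (persist the event first, update the state second) can be shown with a small sketch. The `Journal` class with its injectable failure and the `CartEntity` handler are hypothetical stand-ins for the event store and the user function, written only to illustrate the ordering.

```python
class JournalUnavailable(Exception):
    pass


class Journal:
    """Stand-in event store that can be told to fail on the next write."""

    def __init__(self):
        self.events = []
        self.fail_next = False

    def append(self, event):
        if self.fail_next:
            self.fail_next = False
            raise JournalUnavailable("write failed")
        self.events.append(event)


class CartEntity:
    def __init__(self, journal):
        self.journal = journal
        self.items = {}

    def handle_add_item(self, product, qty):
        if qty <= 0:
            raise ValueError("quantity must be positive")
        event = ("ItemAdded", product, qty)
        self.journal.append(event)   # 1. persist the event; may raise
        self._apply(event)           # 2. only then mutate in-memory state
        return dict(self.items)      # 3. reply to the caller

    def _apply(self, event):
        _, product, qty = event
        self.items[product] = self.items.get(product, 0) + qty


journal = Journal()
cart = CartEntity(journal)
cart.handle_add_item("apple", 2)

journal.fail_next = True
try:
    cart.handle_add_item("pear", 1)
except JournalUnavailable:
    pass  # the command failed, and memory was never touched
```

After the failed write, the in-memory cart still matches the journal exactly. Had the state been updated first, the cart would hold a pear that no replay could ever reproduce.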

Conflict-free Replicated Data Types

CRDTs, they provide strong eventual consistency. There are these methodologies in place that say, with reasonable surety, that this is the answer. However, because we're talking eventual consistency, you could be looking at something over here that hasn't quite reflected what's happened over there yet. This is a trade-off that can be tuned. It must be understood. We don't ever want to go with strong consistency, because that'll break our scalability. If you want to look up something interesting, look at the Universal Scalability Law by Neil Gunther. It actually proves that.

It's deterministic by design. The data types contain their own resolution logic. You could have ones that are additive. You can have ones that are keyed state, key-value store types that agree on what is the current state of a given thing. They have idempotence built in, replicated according to the needs of the cluster and the usage. I really think they are a read optimization. I'd probably go to a CQRS read projection first, but these CRDTs are really optimized for low latency.

These are some of the types that you get from CRDTs: counters, registers, sets, maps, graphs. They're associative. It doesn't matter how they're grouped. They're commutative. Order-insensitive, order doesn't matter. It doesn't matter if something came from here first and then came over there. Your end result is the same. Idempotent, so if you get state change, the underlying detail uses an event ID that has some time component in it. When it updates itself, it'll know that anyone that comes later that has an old ID is not valid.
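A grow-only counter is probably the smallest CRDT that shows all three properties. The sketch below is a textbook state-based G-Counter, not Cloudstate's implementation, and the node names are made up.

```python
class GCounter:
    """State-based grow-only counter: one slot per node, merged by max."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, node, amount=1):
        # Each node only ever increases its own slot.
        updated = {**self.counts, node: self.counts.get(node, 0) + amount}
        return GCounter(updated)

    def merge(self, other):
        nodes = self.counts.keys() | other.counts.keys()
        return GCounter(
            {n: max(self.counts.get(n, 0), other.counts.get(n, 0)) for n in nodes}
        )

    @property
    def value(self):
        return sum(self.counts.values())

    def __eq__(self, other):
        return isinstance(other, GCounter) and self.counts == other.counts


a = GCounter().increment("node-a", 2)
b = GCounter().increment("node-b", 3)
c = GCounter().increment("node-a", 1).increment("node-c", 4)

commutative = a.merge(b) == b.merge(a)
associative = a.merge(b).merge(c) == a.merge(b.merge(c))
idempotent = a.merge(a) == a
total = a.merge(b).merge(c).value
```

Because the merge takes a per-node maximum, replicas can exchange state in any order, in any grouping, and any number of times, and they still converge on the same total.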

CRDTs, you have your user function entity. These user function entities are backed by either events or CRDTs. You can have an event sourced entity or a CRDT entity. Right now, a CRDT entity is fully non-durable, in-memory only. That won't be for much longer. Event sourced entities of course are backed by the event log in the database. You get your state and deltas in. This happens among the CRDTs. You get your messages in coming in from users interacting with the CRDT. Both of these things mutate the current state inside that entity. You get your message out, your response to the users. You might have updated your state as a result. The deltas and the states go back out. The CRDTs can now coordinate amongst each other.

Using key-value for CRUD. You get your snapshot in by entity key. You get your message in to retrieve that snapshot or change that snapshot. Now you get your response back. You've got your new snapshot, which is now being distributed and agreed upon with all the CRDTs.

3-Tier Architecture

I want to show you what a 3-tier architecture looks like. We're all familiar. It's really noisy, and it highly exercises the entire framework, the entire cloud infrastructure you have. The application tier really isn't of that much value because you got computing logic, but you've got all of your state over to the right side of the database. You're constantly hitting that. If there's a failure anywhere, you're going to have problems. Also, it's just a lot of complexity going on there. In addition, there's probably a lot of chatter between those nodes in the middle. Services calling other services, things like that. That to me is functionally blocking behavior.

When you look at the reactive architecture, you're seeing that the database is really a side effect of doing things. It's not needed in real-time. Me as a user doing something with the system, I'm not causing you to touch the database. Everything's in-memory, things are all happening. It's super responsive and resilient.

A quick CRDT entity example. This is one that was developed by Viktor Klang, who's the Head Architect on the Cloudstate team. He's also called the Legend of Klang, works for Lightbend. This is a simple chat app. This is a presence CRDT that says simply, is the user online or not? That is a value agreed upon among all the nodes: is the user online? You create this entity. This is Java. You annotate it as a CRDT entity. It's a vote-type entity. You give it a vote parameter. Then all of these CRDTs will have votes that interact with each other using that vote variable. They're all going to determine whether or not the user is online. You instantiate your CRDT saying, new Cloudstate. You register it, and you start it.

Here's our command handler. The command handler for connect. There's a gRPC interface. This is in JavaScript, if you look at the samples on Cloudstate. This will be in JavaScript. There's a gRPC Connect. It's a streamed gRPC method. What that does is it's just you connecting from a webpage. Then the vote is set to true here, the user is online as per this instance of the CRDT. Then what happens is you now subscribe to canceled. When the stream is broken because the UI closes down or disconnects, or whatever, you now can vote false on your own behalf. As long as the user is not online somewhere else and hasn't reconnected in the interim, the user is no longer online. You could do a read from this entity. You can see that nobody's online. That's a CRDT. Then you could also subscribe to online status. This is a subscription example. You want to monitor it from a JavaScript application. You can register a callback here, and on a change in whether or not you're online, you could do something on the front-end.
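The connect, cancel, and subscribe flow described above can be sketched as follows. This is plain Java rather than the JavaScript sample from the talk, every name is hypothetical, and it only tracks this one instance's own vote, so votes from other nodes are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical names throughout; this mimics the flow, not the Cloudstate API.
final class PresenceEntity {
    private boolean selfVote = false;
    private final List<Consumer<Boolean>> subscribers = new ArrayList<>();

    // Streamed connect: vote true now, and hand back the hook
    // that should run when the stream is cancelled.
    Runnable connect() {
        setVote(true);
        return () -> setVote(false);
    }

    // Register a callback that fires whenever the online status changes.
    void subscribe(Consumer<Boolean> onChange) {
        subscribers.add(onChange);
    }

    boolean isOnline() {
        return selfVote;
    }

    private void setVote(boolean value) {
        if (selfVote == value) {
            return; // no change, so nothing to notify
        }
        selfVote = value;
        for (Consumer<Boolean> subscriber : subscribers) {
            subscriber.accept(value);
        }
    }
}

class PresenceDemo {
    public static void main(String[] args) {
        PresenceEntity presence = new PresenceEntity();
        presence.subscribe(online -> System.out.println("online changed to " + online));
        Runnable onCancel = presence.connect(); // the webpage opens the stream
        onCancel.run();                         // the stream is broken
    }
}
```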

This is one I wrote, which is event sourced. I try to keep it real simple. I got the idea because I have eeros in my home. I love them. One of the things you can do with eeros is you can assign them to rooms. I started thinking about how would I design a thing like that with internet devices that I can buy and assign to rooms? That's my simple example.

This is a Spring application. It's Spring first. It's mostly Spring. It's Spring Java. It's supposed to be comfortable. Supposed to be familiar for people that are doing Spring. If it was Golang or JavaScript, it'd be equally familiar for them in their languages. It's a typical POM file for Maven. Everything's very Spring-like. There's just an additional dependency and a couple of plugins for Cloudstate, but it's very Spring. If I look into the source code, I'd like to show you the entity first. This is my device entity I've modeled. This is an internet router, as part of a service router mesh that you could purchase and assign to a room, very simple stuff. It's got a device ID, which is unique in the system. You can't have two of these IoT devices that are the same ID. That also happens to be the entity ID. I like purity. I don't really want to use entity ID as my device ID. They'll end up being the same but I want them to be separate. The customer could activate it. The customer can assign a room. I would make this an option, but it's not really a user functional option. This is just because this is really uninitialized in the beginning. The user will never see it as a blank string.

Activate device is the first thing you can do. I buy the device. I go to Best Buy, or whoever. I buy one of these things, and I get on my phone, or whatever, and I activate the device. If I'm already activated, because when I've been instantiated or hydrated as this entity, it was done by a key. If this key already existed and was activated, I would fail it right away. You can't activate the same device twice. Then what I would do is I would emit the event. What I'm doing is, in my domain, I define the event in gRPC. It's easy to share. Then I'm just using a builder pattern in Java to actually instantiate this event, and emit it. When I say emit, the framework in Cloudstate is actually going to save it to the event log. It's going to possibly put it into a bus to share it with others. Then it's going to do another callback for me. This is just a command handler, how to activate. I also need to implement event handlers to now update my state.

I have to side effect my state, because if I update my state at the same time, and the event fails, that's a problem. I've got a problem in-memory. I think I'm here, but I'm really not there because I'm not backed by the events. When I go out of memory, or the system is restarted, I'll lose it. Here's when I'll set my state. These are local variables in the entity, private. I'll say, now I'm activated. I've got a device ID. I've got a customer ID, because they were parameters when I got the activate device command. All this is on GitHub.

Then assign room is very similar. You have a command handler with Assign room. This maps to the gRPC function which I'll show you. The command Assign room has the parameters that are necessary, the device ID and the room that you'd like to assign it to. You now emit that event, and you get your callback in the event handler, room assigned. You're setting the room equal to the room assigned in the event. That's as simple as that for an entity.
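The command handler and event handler split described above can be sketched in plain Java as follows. This is not the code from the talk and does not use the Cloudstate annotations or generated Protobuf builders: the class names, the in-memory list standing in for the event log, and the `emit` helper are all assumptions. What it tries to show is that command handlers only validate and emit, while state changes happen only in the event handler, so state can always be rebuilt from the log.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative, not the Cloudstate Java API.
final class DeviceEntity {
    interface Event {}

    static final class DeviceActivated implements Event {
        final String deviceId;
        final String customerId;

        DeviceActivated(String deviceId, String customerId) {
            this.deviceId = deviceId;
            this.customerId = customerId;
        }
    }

    static final class RoomAssigned implements Event {
        final String deviceId;
        final String room;

        RoomAssigned(String deviceId, String room) {
            this.deviceId = deviceId;
            this.room = room;
        }
    }

    private final List<Event> eventLog; // stands in for the persisted event log
    private boolean activated = false;
    private String deviceId = "";
    private String customerId = "";
    private String room = "";

    // Hydrate the entity by replaying whatever is already in the log.
    DeviceEntity(List<Event> eventLog) {
        this.eventLog = eventLog;
        for (Event event : eventLog) {
            apply(event);
        }
    }

    // Command handler: validate, then emit. It never touches state directly.
    void activateDevice(String deviceId, String customerId) {
        if (activated) {
            throw new IllegalStateException("Device already activated");
        }
        emit(new DeviceActivated(deviceId, customerId));
    }

    // Command handler for assigning a room.
    void assignRoom(String deviceId, String room) {
        emit(new RoomAssigned(deviceId, room));
    }

    // Save the event first, then call back into the event handler.
    private void emit(Event event) {
        eventLog.add(event);
        apply(event);
    }

    // Event handler: the only place where state is updated.
    private void apply(Event event) {
        if (event instanceof DeviceActivated) {
            DeviceActivated e = (DeviceActivated) event;
            activated = true;
            deviceId = e.deviceId;
            customerId = e.customerId;
        } else if (event instanceof RoomAssigned) {
            room = ((RoomAssigned) event).room;
        }
    }

    boolean isActivated() { return activated; }
    String getDeviceId() { return deviceId; }
    String getCustomerId() { return customerId; }
    String getRoom() { return room; }
}

class DeviceDemo {
    public static void main(String[] args) {
        List<DeviceEntity.Event> log = new ArrayList<>();
        DeviceEntity device = new DeviceEntity(log);
        device.activateDevice("device-1", "customer-1");
        device.assignRoom("device-1", "kitchen");

        // A fresh instance rebuilt from the same log ends up in the same state.
        DeviceEntity rehydrated = new DeviceEntity(log);
        System.out.println(rehydrated.getRoom()); // prints kitchen
    }
}
```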

I'll show you the gRPC. This is the service. In gRPC, this is how I define the service. I've got the Activate device command. I've got the Assign room command. In my service, I'm creating a stub here which says activate. In the activation function, I expect to be provided an activate device command, which will have the device ID and the customer ID. If you haven't seen gRPC, it looks a little funky at first, where you see device ID equals 1, and customer ID equals 2. That's positional. What you're doing is you're saying the data type first. Then you're saying the ordering over here. That was weird to me when I first saw it. The device ID, you're actually dictating here that that is the entity key. If it was customer ID, I would have put it there. The customer ID is not the key. Device service can now assign the room or activate. That's it for the gRPC part.

I also said that I was using the events. I modeled them also in gRPC Protobufs. I've got my device activated. It's got a device ID, and a customer ID, and room assigned. The events in event sourcing, you want to have all the information that's applicable to that event. You don't want any other stuff in there.

Then very quickly and finally, I'll show you that it is truly a Spring application. Here's my Spring application itself. I've just annotated a Spring Boot application. In start up, I do a new Cloudstate. I register my event source entity and I start it. Then I have a typical Spring MVC controller. Here, it's just Spring stuff. I'm saying, register device and assign room. It does a gRPC callout using the stub. These stubs are generated at compile time for you based upon the gRPC interface that you've written. These things will generate, you just call them, and you call services with them. That's the code.


This is it. Keep an eye on it because I'm going to be iterating and doing other things too, other languages, other ideas, adding read capabilities, and things like that.

Participant: I'm currently learning Akka. Even as a beginner, I see lots of overlap between Akka and Akka Persistence and the pattern you presented. Could you clarify a bit the future roadmap of Akka? Is it phased out or are they planned in different sectors?

Future Roadmap of Akka

Walsh: Cloudstate is meant for the masses. Many people are just going to continue to use Akka because you're going to get ultimate flexibility in Akka. We're going to create abstractions in Cloudstate for the things that are most common, which is right now, we have another framework. Are you familiar with Lagom, a Swedish word for just right, not too big, not too small? Lagom is the same thing. It's an opinionated framework using Akka underpinnings to do a certain thing. Cloudstate's the same thing. Akka is never going anywhere. If it wasn't in Akka it won't be in Cloudstate. That's our philosophy. If I go and I'm building a system, especially right now, depending on the developers I have on hand, I may use Akka or I may use Cloudstate.



Recorded at:

May 19, 2020