Q&A with Akara Sucharitakul on Squbs: Akka Streams & Akka HTTP for Large-Scale Production Deployment

Application requirements have changed dramatically in the past few years. Applications handling gigabytes of data with response time in seconds are a thing of the past. Today, users expect a sub-second response time and the amount of data is measured in petabytes. Hence, the old and inefficient approaches to building software are being replaced by the reactive way of programming.

Squbs (rhymes with "cubes"), an open-source project enabling standardization and operationalisation of Akka applications on a large scale, adheres to the reactive principles, allowing applications to process billions of transactions per day with minimal resource consumption while being more resilient to failure.

Squbs follows an asynchronous programming model that uses streams as the core of the application, with all input and output treated as streams of events. It is a good fit for use cases that involve collecting and processing large amounts of data in near real time, as well as for a number of heavy-duty data processing and orchestration applications.

InfoQ recently interviewed Akara Sucharitakul, principal member of Technical Staff at PayPal and the Squbs project's founder, about the problems Squbs solves and the reception so far.

InfoQ: What are some ideal use cases for which Squbs fits the bill?

Akara Sucharitakul: Squbs can be used for just about any back-end and even some front-end use cases, so it is better to start with what is not an ideal use case. I'd say, if you want a CRUD service that just fronts a database and reads from and writes to it, especially a synchronous database, Squbs won't buy you much. The cost of learning a new programming paradigm, plus the cost of dealing with synchronous interfaces from an otherwise asynchronous system, outweighs any performance gains you'll get from a CRUD service, unless of course you already have good expertise with this programming model.

The end-to-end streaming model added to Squbs 0.9 really promotes its fast data-streaming use cases. In this sense, Squbs has been used very successfully both as a data collector gathering vast amounts of data from the internet and as a processor of very large data streams, event by event, in pseudo-real time. But that is just one side where Squbs fits like a glove. There are many other heavy-duty data processing and orchestration applications that benefit from the asynchronous nature of Squbs, especially with the integration of Akka HTTP, which allows service requests to be made in a purely asynchronous fashion with minimal resource overhead while adhering strictly to the scheduling model laid out by Akka.

Moreover, we see an increasing number of applications using streaming coupled with event processing to update the data stores, leading us to modern application designs with event sourcing and CQRS in a distributed fashion.

InfoQ: How has the feedback been from developers who have used Squbs to build their microservices?

Sucharitakul: Let me answer this question from two angles: first, the feedback in terms of developer experience using Squbs, and second, the feedback in terms of the resulting work. To answer the first part, we have to admit that asynchronous programming is an alien beast for traditional developers used to imperative programming. You have to re-learn programming concepts, unlearn what you learned 20 years ago to be the right way of development, and learn new ways that were inconceivable 20 years ago. So training, reinforcement, and support become very important factors. It is truly a people problem. And there may be developers who never manage to make this jump. Once developers get used to this new paradigm, we have found them generally more productive, with fewer bugs and far less code to maintain for the same functionality.

For the second part of the answer, the resulting work has been nothing short of phenomenal. All applications using Squbs provided feedback that they achieve a much higher level of resiliency in production, to the point that it gets boring. Unless the system is otherwise faulty, the applications just keep running. On top of this, we generally see about 50% to 80% savings in one way or another. One use case reported an 80% cut in processing times, but more often applications report needing about 80% less compute resources to process the same amount of traffic. That generally means cutting 80% of the infrastructure cost.

So in general, despite developers reporting initial difficulties, we have very happy developers owning the end result of their hard work.

InfoQ: Since Squbs is built on top of the Akka toolkit, do you recommend any best practices to follow while integrating with Akka/Squbs?

Sucharitakul: Yes, but this could be a very long answer. There is a whole list of practices, even rules, that we give developers during training. Common ones are buzz-phrases like "immutable first" or "blocking is a crime." But there are many more. We'll publish the training modules for Squbs on GitHub soon. The first chapter will talk about many of these dos and don'ts.

Beyond the basics, I can think of some guidelines around streams and actors. To get to the point, unless otherwise justified, the backbone of your application should sit on streams, using actors as the periphery. While one can argue that streams are executed on actors, and in Squbs terms they are always wrapped in actors, we do not think about applications the same way. A system takes input, processes data, and creates output. The core of the application interfacing with the integration points should, as far as applicable, always be streams. Actors are still very useful for holding state and as general-purpose components that deal with the non-streaming aspects of the application, including being one of the modes for communicating across multiple streams.
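
The two rules quoted in this answer, "immutable first" and "blocking is a crime," can be illustrated with a minimal sketch. It deliberately uses Java's standard `CompletableFuture` rather than the Akka or Squbs APIs, so that the example stays self-contained; `NonBlockingSketch`, `Quote`, `fetchQuote`, and `totalPrice` are hypothetical names, and the "service call" is faked. It is a rough illustration of the idea, not how Squbs itself is implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical example class; none of these names come from Squbs or Akka.
final class NonBlockingSketch {

    // "Immutable first": all fields are final and there are no setters.
    static final class Quote {
        final String symbol;
        final double price;

        Quote(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    // Stand-in for an asynchronous service call. The price is faked
    // (10.0 per character of the symbol) so the sketch is deterministic.
    static CompletableFuture<Quote> fetchQuote(String symbol) {
        return CompletableFuture.supplyAsync(
                () -> new Quote(symbol, symbol.length() * 10.0));
    }

    // Composes the calls without blocking a thread while waiting:
    // the sum runs as a callback once every future has completed.
    static CompletableFuture<Double> totalPrice(List<String> symbols) {
        List<CompletableFuture<Quote>> pending = symbols.stream()
                .map(NonBlockingSketch::fetchQuote)
                .collect(Collectors.toList());

        return CompletableFuture
                .allOf(pending.toArray(new CompletableFuture<?>[0]))
                // join() returns immediately here: allOf has already completed.
                .thenApply(ignored -> pending.stream()
                        .mapToDouble(f -> f.join().price)
                        .sum());
    }

    public static void main(String[] args) {
        // Blocking is tolerated only at the outer edge of this demo program.
        double total = totalPrice(Arrays.asList("AB", "CDE")).join();
        System.out.println(total);
    }
}
```

The only place the sketch waits on a result is `main`, at the very edge of the program. Everything inside `totalPrice` is expressed as a transformation of futures, which is the same habit the rules above try to instill before developers move on to streams and actors.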

More information can be found on the project's GitHub repo.
