Bio: Viktor Klang, also known as √, is a passionate programmer with a taste for concurrency paradigms and performance optimization. He is the Tech Lead for the Akka project at Typesafe.
Scala Days is the premier event for Scala enthusiasts, researchers, and practitioners. Scala is a general-purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way. It smoothly integrates features of object-oriented and functional programming.
That is an interesting question. What would I say? I am the Tech Lead for the Akka project, and I work at Typesafe. I spend a lot of my days, or a lot of hours during the days, doing concurrent programming and solving people's race conditions for them. I would describe myself as a sort of industry programmer; I'm not on the academic side of programming, I'm a business developer or something like that.
Akka is a platform that allows you to create concurrent, scalable and fault-tolerant systems on the JVM. We sort of started because it's so hard for the normal programmer to do correct concurrent and parallel programming using locks, synchronized blocks and the java.util.concurrent stuff: it's so easy to shoot yourself in the foot, and you are also mixing concerns in your code. You are sprinkling coordination code and actual business code in the same place, and it proliferates in your codebase, so when you find that you have a problem and need to redesign for better scalability, you risk having to rewrite a lot of your application, because the coordination is sprinkled into everything. What we try to do is give you the tools you need to solve different kinds of concurrency and parallelism problems. There is no golden hammer; we have quite a few tools, like Actors, Futures, STM, Transactors and things like that. So that's what Akka does for you: it makes it easy to do concurrent and parallel programming, with fault-tolerance mechanics as well.
Absolutely. The thing is, to make something fault tolerant, you can't really just tack it on afterwards; you can't have it as some orthogonal thing that you add when you need it, because fault tolerance is core to your application: either it is fault tolerant or it's not. The actor model is quite an old model, and I think Erlang is one of the more famous languages supporting that style of programming. Erlang does a lot of things very right, and we wanted to take the best from Erlang, put it onto the JVM and see what we could do with it, what people would actually use on the JVM. So a lot of things are based on the ideas of Erlang, but with a twist, because we are making this now and Erlang has a much longer history. We have the opportunity to make changes now that Erlang might not have, because of its legacy and where it is in its maturity phase.
For example, we have Actors, which are our version of Erlang processes. Of course, we are on the JVM, so we don't have to have separate heaps for our Actors if we don't want to, and we don't have the same garbage collector capabilities that Erlang has, but we can optimize for the JVM. Then we can do distributed computing using serialization, for example. Our thinking is that if you start out by designing things to be distributed, it's way easier to optimize when you are local than to design for being local and then try to make it work in a distributed setting, because you have weaker guarantees in a distributed setting. So a lot of the good stuff is inspired by Erlang and its fault-tolerance capabilities, like supervision of the Actors, but we have a lot of our own takes on things. For example, Actors in Akka are hierarchical: you have an actor system that lives at the top, and there is what we call the guardian, which is the root of all the Actors, so when you create Actors inside Actors they become children of those Actors.
So when you have a failure in a child, that failure is propagated to the parent, which acts as a supervisor and says: "OK, can I handle this problem? Can I restart this child, or do I need to terminate it? Or can't I deal with it at all, so I need to escalate to my parent?" If it escalates all the way to the top, if it bubbles up all the way, then your application quits, because it was something so bad that nothing in the layers above could do anything about it. So it gives you built-in supervision and fault tolerance, which Erlang does not enforce in that sense; there you need to do it manually. That is something we have done that is very different from Erlang but solves the same kind of problems.
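The escalation rule described here can be sketched as plain decision logic. To be clear, this is NOT Akka's API, just a toy model of the idea: each supervisor in the chain either handles a child's failure (restarting or stopping the child) or escalates to its own parent, and a failure that bubbles past the root terminates the application. All names below are invented for illustration.

```scala
// Toy model of supervision escalation -- not Akka code.
sealed trait Directive
case object Restart  extends Directive
case object Stop     extends Directive
case object Escalate extends Directive

// `strategies` is the chain of supervisors, from the failing child's parent
// up to the root guardian. Each strategy maps a failure to a directive.
def handleFailure(strategies: List[Throwable => Directive], failure: Throwable): String =
  strategies match {
    case Nil => "application terminates" // the failure bubbled past the top
    case strategy :: parents =>
      strategy(failure) match {
        case Restart  => "child restarted"
        case Stop     => "child stopped"
        case Escalate => handleFailure(parents, failure) // bubble up one level
      }
  }
```

For example, a parent that restarts on `ArithmeticException` but escalates everything else, under a guardian that always escalates, restarts the child on an arithmetic failure and terminates the application on anything else.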
Viktor Klang: Yes, you link between your Actors and you need to react to the linkage.
Werner Schuster: But you don’t have to?
Viktor Klang: You don't have to. In Akka it is there by design, and this is new in 2.0. What we found is that in 1.0 we were doing the same thing as Erlang does: you would link between your Actors and you would try to handle failures yourself. But people normally don't do things unless they really have to, so people would create a lot of Actors and then they would have problems because they didn't have any fault tolerance, because they hadn't added any supervisors. Then you have to add it in afterwards, which is always really hard, because it can affect the design of your application. So right now we just give it to you for free, because it is already built into the model; it's just going to work, and you don't put people in a bad place without them knowing that they should have done something else. We're trying to make it easy to do it right.
Werner Schuster: Fault tolerant by default basically?
Viktor Klang: Yes.
Futures are sort of, well, we have a saying in Sweden: "A beloved child has many names." Yesterday I talked about how there is a plethora of implementations of Futures: you have the primordial java.util.concurrent Future with its accompanying FutureTask, there are probably eight different Scala libraries that have their own Futures, and there is even a library that acts as a bridge layer on top of different Future implementations. So we felt it was time to pick the best ideas, ask "What are the use cases, what do people need?", and put that into the standard library so you get better interop between different libraries.
Of Scala, yes. In Akka we have had Futures for quite some time, and the Scala standard library has had them as well, but there they have been quite tied to Actors. You can implement Futures using Actors, but you don't need to, and the Akka Futures and the Futures going into Scala Improvement Proposal 14 are not based on Actors at all. For us it was important to define what a Future is, so it's easy for people to reason about what it does. Our definition is that a Future is a read handle to a value that may become available within a specific time frame. You can read it many times, but it can only ever hold one value: whenever it gets a value, it is always that value, so it's write once, read many. And you can't write to the Future, you can only read from it. On the other end, the producer end, you have the Promise, which is a write handle that you can write to only once, and that you should write to within a specific time frame. When you have a Promise, you've made a promise, so now you are obliged to keep it, but it's a best-effort thing. They are tied together, but they are not the same thing; they're two sides of the same coin.
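The Promise/Future split described here is exactly how the SIP-14 design landed in `scala.concurrent`, so it can be shown with just the standard library: the producer keeps the write handle, hands out the read handle, and a second write is rejected rather than overwriting.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

// The producer side keeps the Promise (the write handle)...
val promise = Promise[Int]()
// ...and hands out only the Future (the read handle).
val future: Future[Int] = promise.future

promise.success(42)                      // write once
val secondWrite = promise.trySuccess(99) // rejected: returns false, value stays 42

val result = Await.result(future, 1.second) // read it, as many times as you like
```

Blocking with `Await` here is only to inspect the value; in real code you would compose on the Future instead, as discussed below.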
You could do that; it's like a write-once channel. The thing is that with Futures you can do Dataflow-style programming. I know there can be a value of this type in this Future, so I can do asynchronous composition on top of it: if I get this result, apply this transformation to it, yielding me a new Future. I never block for anything; I can compose my code without even having a value, and at runtime there might not even be a value produced, or there could be an error signaled to the Future. So a Future really has three states: one, it's pending, it doesn't have a value yet; two, it's successful, it has a value; or three, it's a failure, with some sort of Throwable in it. So you can signal both the successful case and the failure case, and you also have the case where it doesn't have a value yet.
This allows you to program in a very declarative style without doing any blocking, because with the java.util.concurrent Future, the only things you can really do are block to get the value, or poll for it, so you need some sort of loop that polls for the value, and then you can't compose anything. This way you can really spawn things off: I call this web service over here, I call that web service over there, I get Futures back, and when these complete I take both values and pipe them into some method, yielding a new Future. So you can really have Dataflow-style programming, and the good thing here is that you don't need any locks, you don't need any weirdness, it's just simple. In Scala Improvement Proposal 14 it's a Monad, so you have map, flatMap and everything; you simply treat it as a data structure.
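The "two web services, pipe both values into a method" flow sketches out like this with the standard-library Futures (the two services are stubbed as local computations, purely so the example is self-contained):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Two stand-in "web service" calls (stubbed here; in reality, async I/O).
def serviceA(): Future[Int] = Future { 20 }
def serviceB(): Future[Int] = Future { 22 }

// No blocking, no polling: when both complete, pipe both values into a
// function, yielding a new Future.
val combined: Future[Int] =
  serviceA().zip(serviceB()).map { case (a, b) => a + b }

// Failure is just the Future's third state, so error handling composes too.
val safe: Future[Int] = combined.recover { case _: Exception => -1 }
```

Nothing in this pipeline blocks; each stage is a transformation registered on the previous Future.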
So it's very good where you have a known flow that you need to direct, but it's not like a pump: it doesn't generate multiple Futures, it's not an execution thing per se, it's more a flow of data from one point to multiple other points, if you want that. So it's a different kind of thing from Actors. Actors are more like a computation engine. People sometimes say "Actors are very complex", which is sort of nonsense, because they are very, very simple: an Actor is simply a closure tied on top of a queue, or mailbox, and that binding can be swapped. The closure can rebind itself with a new closure at the same point to get different behaviors. So it's very, very simple, but it's a completely different thing from Futures. They can interoperate, though: we support something called an ask operation for an Actor.
Normally you just send an Actor a message; that is the thing you do. But sometimes you want to ask it something, you want a reply back. To make that easier to program, without having to create a new Actor to pass in that represents the continuation of the question you asked, you can get a Future back with the result: you send something in, and whatever the Actor responds with goes into the Future. That is the bridge from Actors to Futures. You can also arrange it so that when a Future gets its result, that result is sent to an Actor, and then you have the other side of the bridge.
So you can really pick and choose and interoperate however makes sense for you in a given case. You need to call into something that yields a Future? Fine, just call it, and when it's done you pipe the result wherever you want: pipe it to yourself or pipe it to some other Actor. They have a good interop story, but they are completely separate concepts; it's interesting that they can actually be used together in such a non-intrusive way.
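The shape of that ask bridge can be illustrated with only the standard library. This is NOT Akka's actual ask operation; the "actor" is reduced to a plain handler function, and the point is just the wiring: the reply completes a Promise, and the caller gets the Future side.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.util.Try

// Toy illustration of the ask bridge -- not Akka's API. `handler` stands in
// for the actor's message-handling behavior.
def ask[A, B](handler: A => B)(message: A): Future[B] = {
  val promise = Promise[B]()
  // In Akka the reply would arrive asynchronously via the actor's mailbox;
  // here we complete the promise inline to keep the sketch self-contained.
  // A thrown exception becomes the Future's failure state.
  promise.complete(Try(handler(message)))
  promise.future
}

val reply: Future[Int] = ask((s: String) => s.length)("hello, actor")
```

Either way the asker never blocks: it gets a Future it can compose on, or pipe onward to another handler.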
Yes, everybody loves the Monad. Monads were the hottest thing in like 2008-2009, and then every year there is a new CS concept that is the thing that year. I think last year it was probably Iteratees, the year before Applicatives, and the year before that Monads, so I see a trend here.
I think it’s starting to become Lenses actually, I think Lenses are highly ranked to be CS word of the year.
It sort of allows you to change state inside an immutable hierarchy, if I've got this right; I'm not the CS guy. For example, if you need to clone an immutable object that consists of immutable objects, and you need to change something deep in the hierarchy, it's really messy to do that. So if I get this correctly, and Tony (Morris) will probably flame me for this, a Lens is a representation that says: "Give me something that represents that point, so I don't have to look at all the surrounding stuff." It's sort of a lens into a structure.
Yes. You always have the choice to be explicit and use the monadic operations yourself, or to use the for comprehension to get the sugar on top of that. What you need to be aware of is that Monads are sequencers: if you write a for comprehension with x <- someFuture; y <- someOtherFuture, and you create that second Future inside the comprehension, it will only be created when the first one completes, because this is asynchronous and it's using flatMap. Not until flatMap is executed on the first Future will the second one actually be produced.
So if you want to be parallel about things, you need to start your flows, start your computations, before the comprehension or before your flatMap; then you'll actually get the parallelization. It can be a source of confusion when things are not behaving as you would expect, but it falls out of Monads being sequential and Futures inherently being able to exploit parallelism.
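The two arrangements side by side, using the standard-library Futures (task durations are invented for illustration):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

def slowTask(n: Int): Future[Int] = Future { Thread.sleep(50); n }

// Sequential: this desugars to
//   slowTask(1).flatMap(x => slowTask(2).map(y => x + y))
// so the second Future is not even created until the first completes.
val sequential: Future[Int] = for {
  x <- slowTask(1)
  y <- slowTask(2)
} yield x + y

// Parallel: start both computations BEFORE the comprehension, then the
// comprehension merely combines two already-running Futures.
val first  = slowTask(1)
val second = slowTask(2)
val parallel: Future[Int] = for {
  x <- first
  y <- second
} yield x + y
```

Both produce the same value; the difference is that the sequential version takes roughly the sum of the task times, the parallel version roughly the maximum.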
The last question about Futures: you mentioned yesterday that we have dozens of implementations of Futures. Why do they exist? Do they solve different problems, or can you unify them? What is the solution there, or what problem do they solve?
There are different reasons. First of all, the definitions can be slightly different, and I've also seen that some are inclined to add more behaviors to the Future itself, like cancellation, or other domain-specific methods that make sense in one domain but might not make sense in others; that could be one source. Also, if you don't have a really good standard library implementation, people are more likely to implement their own. Even for something as simple as performance, people are very likely to create their own just to get better performance, because getting changes into a standard library takes way longer than getting your own Future implementation up and running. And then those changes have to be disseminated: if people are running on a version that does not have the improved stuff, what are they supposed to do?
So I think it's crucial to have a really good story in the standard library, so people don't feel obliged to create their own. And when you do something in the standard library, it has to be abstract enough to support different use cases, somewhat more abstract than something you would do yourself. That is really hard, because you need to see the similarities between the implementations to find the optimal thing to put into the standard library, so it really helped us to be able to watch other implementations and see what works and what does not. We've been debating things for a very long time, like: how do you solve cancellation, should you solve cancellation, where should cancellation go?
For example, if you put a cancel method on a Future, like you have in java.util.concurrent: if you pass me a Future and I call cancel, and you have passed that Future to someone else, I'm effectively ruining something for someone else. That is the sort of scenario where you get into defensive copying and things like that, trying to protect yourself from the evils of others. So I've been really reluctant to put cancellation into the actual Future, because it does not make sense when you are using the Future in a parallel setting; it only makes sense when you are using it from one thread only, where you spawn off some computation and know you have exclusive access to it. We try to make sure that it's abstract enough to avoid people doing defensive programming, getting things wrong, and so on. But it's immensely interesting to discuss these things, because they're not black and white; there is a really huge spectrum of gray.
Yes, absolutely. Especially if you have a cancel on a Future and that Future represents a computation being run somewhere else, perhaps on some other machine in some other part of the world. You call cancel on that Future, and now you have to have a bidirectional link: you need to be able to communicate back, and you might miss, so the result comes back on the wire while you are sending the cancel on the wire, and now you need to handle these cancellation races on the other end as well. And what happens if you have cascading effects, if that Future over there depends on other things on other nodes? Where does the rabbit hole end? So it's a tricky issue; you can get it right, but it will cost you quite a bit. We opted not to add it for the initial implementation. Who knows if it's going to go in, but it's a very interesting topic, for me at least.
So we've had quite a long story now; I think we've been out for three years. Up until Akka 1.0 we were adding stuff, experimenting, saying: "That cool stuff over here, what can we do with that, does it fit in?" Then you need feedback from people: how are you using it, is it really solving that problem, does it give you any problems that we didn't think of when we added it? So we packed in features, got feedback on that, and then decided: OK, now we need to collapse. We need to remove things that do not work, improve things that can be improved, and collapse similar functions and similar code paths, really straighten things up. So 2.0 is everything we learned from the 1.x series: removing things that didn't work, harmonizing things you could do in different ways. I think in 1.x you could reply in six or seven different ways, and the cognitive overhead for somebody learning when to use what is just unjustifiable.
So in 2.0 there is essentially one way, which is sender ! message: the sender, our send operation, and then the message. That just works, so there is way lower overhead in reasoning about the code, because it's so much simpler. That was the goal of 2.0: really make it a good, stable platform that we can innovate on top of, one that does the groundwork for everything you want to do. You asked where we want to take it: for the 2.x series we are going to add clustering, a sort of peer-to-peer clustering based on Amazon's Dynamo and some of the things the Riak guys have done, which is really awesome. Essentially we have the problem of clustering live objects. We are clustering actors, which are active computations; it's different from clustering data, which does not have any active behavior, so we opted to create our own implementation, and that is what we are doing right now. We also want to increase integration opportunities. For example, we have a very successful Camel module in the 1.x series, and a couple of awesome guys are currently working on porting it. If you support Camel integration you essentially get a hundred different protocols that you can interoperate with, so you get this huge integration opportunity.
So that is something we want to do. We also want to focus on persistence; we tackled that pre-1.0, with a sort of mash-up of data structures, STM and NoSQL databases, but the problem was that you could not really rely on any semantics from the NoSQL databases. The guarantees we could provide were so weak that it didn't really make sense, so we ended up cutting it out. Right now we don't really do persistence, but in some cases you might want a very easy way to do persistence with Akka, instead of being told "use whatever persistence you want", because if you are new to things you might want a sort of Akka way of doing it, because it harmonizes with the way you learn or how you are thinking.
That is also something we want to address: scaling out in the cluster. We already have remoting in 2.0, but we really want to do clustering, and there is a lot of cool stuff we are thinking of doing. I talked earlier about the guardian that sits at the top of the hierarchy; when you have a cluster, if you elect a leader among your nodes, that becomes a sort of super guardian, and if one of the guardians fails you can always propagate that failure to the super guardian, which will be the arbiter of what's to be done with that entire node. So we are trying to see what we can do in a clustered setting to increase fault tolerance.
A lot of interesting possibilities come from having a cluster, so right now we are trying to make sure the cluster is in place so we can build all this cool stuff on top of it. The first layer of Akka is the stuff we did for 2.0, and the cluster will be the second layer that we can build even more cool stuff on top of. We've come very far in refining what we do in a single VM, and now we really want to refine what we can do in a distributed setting; that is the path forward now. Akka 3.0 is, for me at least, completely foggy. It would have to be something really new, something completely different, because the foundation we have right now, and the feedback we get from people, is just so amazing. We have something that works really well right now, so we take the feedback in, what people feel is good, what people feel is bad, and we fix that, but it really feels like this is something that works. We can build on top of this; we don't really need to make a third version, we can just add stuff.
We also focused a lot in 2.0 on making it extensible. For example, we have something called Typed Actors, which is essentially a JDK dynamic proxy that forwards a method call as a message to an actor and then replies; essentially you take an interface and make it asynchronous for some operations. If you have a strict return type you obviously need to block to get the value out, but if you just have a void method or a Future-returning method it can return instantly. Typed Actors were a completely separate code path in the 1.x series, I think five thousand lines of code just for Typed Actors, and in 2.0 they're just an extension of Akka. We have something called Akka Extensions, which makes it very easy to build on top of Akka and provide new abstractions. So we're dogfooding: Typed Actors are an extension, entire serialization strategies are an extension, the ZeroMQ stuff is also an extension.
So we are trying to make sure we have a good story that doesn't require a rewrite: it's way easier if you can just publish an extension, people can try it out and give feedback, and if it's really good we can pull it into the library. That is what we are trying to do for 2.0, and I think the 2.x series will last a bit longer than the 1.x series did. But it would be very cool if we came up with something that would warrant an Akka 3.
We spent almost a year working essentially around the clock, and I'm very, very pleased with what we've managed to achieve. I have an amazing team of people, so it's sort of bliss. We released 2.0 after a year-long crunch, and it took almost forty days, I think it was thirty-eight, before we wanted to ship a maintenance release to fix a couple of bugs and improve performance and some other things. It feels like we did a really good job with 2.0; people did a really good job testing the RCs, and I have a really good feeling about 2.0.
I think so, I hope so. What Akka does, and a lot of Scala APIs don't, is try to provide as rich a Java API as a Scala API, because we want to be open: whenever you support Java, you essentially support all the other languages on the JVM, because Java is sort of the lowest common denominator, right?
Akka is not one hundred percent Scala, because some constructs you can't really express in Scala for performance reasons, static fields for example; when you need to do some tricky stuff with concurrency you need static fields. So I think it's about ninety percent Scala, and then we have some Java just because we have to. Coming back to why we use Scala: it's just an amazing language. Last week I think I did a full implementation of the actor model in less than thirty lines of Scala, and whenever I go back to Java, because I did Java for something like ten years, it just amazes me how much code I need to write before I can focus on the thing I wanted to do.
For me, at least, it sometimes feels like, before I even start thinking about the problem, I know I'm going to need a class, and I know I'm going to need these fields, all these things that come for free with Scala. The case classes are amazing. It's just a language that grows on you. I think it took me about three months, going from Java to Scala, before I had this sort of epiphany moment where it all falls into place, so it was frustrating in the beginning. But this was back in Scala 2.6 or something, a somewhat different language back then; not different, exactly, but the tooling was way worse and there weren't any books, it was just harder to learn back then.
Now we have an amazing array of Scala books, and we have multiple IDEs that are really nice now, and Scala just grows on me. I can express what I want to do without all the ceremony. My old boss used to say: "Why do I have to write Person person = new Person()?" There is quite a bit of redundancy in that sentence. So Scala makes me productive, and it also makes me appreciate my code more, because I feel it's easier to debug: I have less code, which means the place where a bug can hide is smaller. And it gives me traits. I use them immensely, I love traits; they let me separate concerns without doing nasty stuff with interfaces.
Yes, it's a sane version: it solves the diamond inheritance problem by linearizing the order in which the traits are mixed in, so the order of things is deterministic. I love traits, they work really well. And also the functional aspects of Scala: function literals, closures, and a really rich type system that lets me express constraints where I declare the data type. For example, Futures are covariant; if I were to express that in Java, it would be a completely different story, right? Scala makes things very easy for the user of an API. Of course it depends on the person who wrote it, you can probably find some sadistic implementer, but Scala tries to put the burden of implementing things on the implementer instead of the user, and I think that pays off, because using APIs is way more frequent than writing APIs. With the rich type system you get help from the compiler when you are thinking wrong. I love Scala for many reasons, but mainly because it makes me happy and it just works. I could not imagine programming in another language right now.
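The linearization mentioned here is easy to see in a small example (trait names invented for illustration): two traits override the same method, and the mix-in order deterministically decides which override runs first, with the rightmost trait consulted first and `super` following the linearized chain.

```scala
// Both Loud and Quiet override greet; linearization resolves the "diamond".
trait Greeting               { def greet: String = "hello" }
trait Loud  extends Greeting { override def greet: String = "loud "  + super.greet }
trait Quiet extends Greeting { override def greet: String = "quiet " + super.greet }

// Rightmost trait wins the first call; super walks left along the mix-in order.
val a = (new Greeting with Loud with Quiet).greet // "quiet loud hello"
val b = (new Greeting with Quiet with Loud).greet // "loud quiet hello"
```

Swapping the mix-in order swaps the chain, but each order gives one deterministic result; there is never an ambiguous diamond.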
I think the first frustration was unlearning Java. I had a lot of preconceived notions, just legacy from doing ten years of Java, and before I could let that go, I tried to push Scala where it didn't really shine. Also, inverting the type declaration is a very simple thing, but it can be a bit awkward at first; you just have to learn that you do it in a different way. The monadic stuff was also part of the epiphany moment, because Monads are so frequent in Scala: you use Option all the time, you use lists all the time, they're so prevalent, but they're also not obvious. You use the map methods, but you're not saying "I want to use a Monad".
Before you understand that there is a commonality shared between these different things, and how you can take advantage of it, you struggle to understand what you are supposed to do. I think the epiphany moment was when I realized that higher-order functions, passing functions in, was actually something I used to do a lot in C: I would pass function pointers around. When I stopped looking at Java and thought about how I did it in C way back, it connected with me again; I found my commonality. When you do an anonymous inner class in Java, it does not feel like a function-passing style of thing, because of all the ritual around just one line of actual business code. For me it was connecting the dots between what I used to do in C and what I was doing in Scala, but I guess it's different for everybody, right?
So if I were to describe myself as a Monad, I don't know if there is an ADD Monad, but I would probably be the ADD Monad that does a bit of everything all the time. It would be some sort of weird Monad that does different stuff all the time. I could be a Future as well; I guess I'm a Future, I'm a Future Monad.