Sylvan Clebsch on the Actor-Model Language Pony, Garbage Collection, Capabilities, Concurrency
Bio Sylvan Clebsch was kicked out of a lot of schools before becoming a serial entrepreneur in the 90s, working on embedded OSes, secure systems, VOIP, physical simulation, and graphics engines, before accidentally becoming an Executive Director in IT at a major investment bank - which he has now left. He now works at Causality, developing Pony, and is a part-time PhD student at Imperial College.
Code Mesh, the Alternative Programming Conference, focuses on promoting useful non-mainstream technologies to the software industry. The underlying theme is "the right tool for the job", as opposed to automatically choosing the tool at hand.
Now I’m sort of an academic. I’ve spent about 23 years programming in industry on things like video games, embedded systems, embedded operating systems, cryptography, military systems, and most recently, financial systems. Now I’m, late in life, doing a PhD at Imperial College with Sophia Drossopoulou, and designing a new programming language called Pony.
Well, we always have new programming languages. This isn’t anything new. Everyone always says, “Well, you can’t get a new programming language adopted,” but of course we always do. Not long ago, C# and Java exploded onto the scene. But there’s also Ruby and Python, and Go is brilliant, Rust is excellent. And going further back, we had a huge transition into functional languages: Caml, Haskell, OCaml.
These things always crop up. New languages crop up because we have new problems. And the big problem we have right now is that cores aren’t getting faster, but we’re getting more cores. So many-core concurrent programming is dominating a lot of the problems that we have in the industry when we’re writing scalable high performance systems.
Pony is an actor-model language. So if you’re familiar with Erlang, that’s also an actor-model language. There are lots of other interesting actor-model languages. Akka on top of Scala is an actor-model system, written more as a library than as a language. There’s the C++ actor framework, which adds actors to C++. There are several actor-model languages. The actor-model, basically, is a notion of the nature of computation, which sounds all highfalutin and complicated, but really, it’s very simple.
It’s this idea that an actor has state and behaviors, and that’s it. An actor can send messages to itself and other actors. It can modify its state and it can create new actors. That’s it. And what this gives you is an incredibly powerful model of concurrent computation, whether that’s on a single node or in a distributed system. This was all originally done in 1973 by Carl Hewitt, although it was Gul Agha’s 1985 thesis that provided a formal model and really allowed us to start writing serious actor-model languages.
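As a concrete sketch, the whole model fits in a few lines of Pony; the actor and behaviour names here are purely illustrative:

```pony
// An actor has private state and asynchronous behaviours (`be`).
actor Counter
  var _count: U64 = 0            // state: only this actor can touch it

  be increment() =>              // behaviours run one message at a time
    _count = _count + 1

  be report(out: OutStream) =>   // sending messages to other actors
    out.print(_count.string())

actor Main
  new create(env: Env) =>
    let counter = Counter        // creating a new actor
    counter.increment()          // message sends are always asynchronous
    counter.increment()
    counter.report(env.out)      // messages from one sender arrive in order
```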
Well, all languages are Turing complete, right? You can do it all. There’s no language that can’t... well, I suppose there are some languages that can’t do anything. Certainly, Agda and Idris can prove termination, and we can’t do that in other languages. But what Pony really brings, compared to other actor-model languages, is the idea that you need to take advantage of every subsystem on a computer for performance.
The main thing with that is memory. Memory drives performance right now. It’s not really core speed that’s holding us back. Obviously, faster cores are great, but with a memory bottleneck, they might not do you any good. So probably, the most interesting stuff in Pony here is its notion of what are called reference capabilities, which is some new academic work that we’ve done. This is the idea that you can annotate a type to indicate whether it’s safe to pass or share with other actors, and we can prove these safety properties at compile time. This lets you pass things by pointer, by reference, to other actors instead of copying, which is the usual approach in actor-model languages.
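A minimal sketch of what such annotations look like; the actor names and data are illustrative. `val` marks data as immutable and freely shareable, while `iso` marks it as isolated, so even mutable data can be transferred by pointer once the sender gives up its alias:

```pony
actor Parser
  // `val` is deeply immutable: any number of actors may read it,
  // so the runtime can pass a pointer instead of copying.
  be parse(config: String val) =>
    None

actor Writer
  // `iso` is the only readable or writable alias anywhere, so even
  // mutable data can cross actor boundaries by pointer.
  be write(buf: Array[U8] iso) =>
    buf.push(0)

actor Main
  new create(env: Env) =>
    let p = Parser
    p.parse("shared, immutable")

    let buf: Array[U8] iso = recover Array[U8] end
    let w = Writer
    // `consume` destroys our alias, proving the transfer is safe.
    w.write(consume buf)
```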
This means you have direct memory access to things, which drastically lowers the overhead. But it also means it’s an entirely lock-free language. There are no locks in the runtime. There are no locks in the language. This eliminates a whole class of performance problems. And that’s fun. The only thing better than fast code is no code at all. So by proving things at compile time, we can simply elide runtime checks that happen in other languages. And one of the most fun things this lets us do is change the nature of garbage collection in Pony. So we have a garbage collector that never does thread coordination; no stop-the-world.
Interviewer: [0:04:37] How does it work?
Sylvan: Each actor collects its own heap, the things that it references. And that’s great. It can allocate on its own heap. It can collect its own heap. You don’t need a separate thread to collect it, because that introduces cache issues. And that’s good, but of course we can share these pointers across actors. So we need a protocol that allows us to safely know when something is referenced by another actor, or maybe just sitting in a message somewhere that hasn’t even been delivered to an actor. But we need to do this without locks, without synchronization, and, in most cases, without messaging roundtrips, because all of these things introduce the equivalent of a blocking mechanism.
So this is a protocol called Orca. It’s inside the Pony runtime. It’s a deferred, distributed, weighted reference counting mechanism for objects that are shared amongst actors. But that’s a little confusing, because it’s not a reference counting mechanism in the way people usually think about it. It doesn’t have anything to do with the shape of the heap. If I assign something to some variable in my reachable set, it doesn’t change the reference count. It’s only about how many times I have received an object in a message. That is what adjusts the local and foreign reference counts, and it lets coordination happen purely by aggregated, deferred message passing instead of synchronization. There is a good paper on it.
Werner: What’s the title?
I believe the title [of the paper] is Orca, Object Collection in Pony, but it might not be quite that. There is also a paper on collecting the actors themselves. So, unlike most actor-model languages, where you have to manually terminate an actor with a poison pill or perhaps a system call, actors in Pony are themselves garbage collected, with an extension of this protocol that can determine not only that an actor’s message queue is empty but that it will remain empty for the lifetime of the program, and therefore it’s safe to collect those actors.
Yes. It’s a special class of object. Just like you’d say class such-and-such in Pony, you can also say actor such-and-such. And all that does is add a 240-byte overhead to the object: the ability to have a queue, the ability to have your own heap, the ability to do garbage collection. That gets added onto the object, and it allows each of these actors to be independently scheduled.
The underlying scheduler in Pony uses single-producer, multi-consumer queues of actors that have pending work. Let’s say an actor comes into being on some scheduler thread, and it creates a new actor and sends it a message. When you create a new actor, it’s not scheduled anywhere, because it doesn’t have any pending work. But then you send it a message. Since it’s not scheduled anywhere, you schedule it on the same scheduler thread as the actor that sent the message, because you want to take advantage of cache locality for the contents of the message.
But if it were already scheduled, you would leave it scheduled where it is, because there you want to take advantage of the cache locality of the working set, and the message is probably smaller than the working set of the actor. But then the threads themselves, when a thread has no work to do, it needs to be able to steal a pending actor, an actor with work to do from some other scheduling queue. This is a straightforward implementation of work-stealing schedulers. Straightforward maybe isn’t quite the right word, but other languages do this and other queuing systems do this.
The fun part is that these actors have no CPU or other runtime overhead when they’re not actively performing work. There’s a little bit of memory overhead just for them being in a scheduler queue if they have work pending, but if they have no pending work at all, there’s nothing. There’s no overhead at all. It’s O(0) on that count, which is fun because it means, along with the very small 240-byte memory overhead, you can have millions of actors in a production system. In fact, it’s quite normal to have tens of millions of actors in a production system.
And of course if you have lots of actors, what you need is really fast messaging. Optimizing that has been a big concern in the Pony runtime. So we have mailboxes where posting a message to an actor is a single atomic operation without a CAS loop, so it’s wait-free and lock-free. And that same atomic operation includes what’s called previously-empty detection: knowing that the actor didn’t have work, so now we need to schedule it. And it’s zero atomic operations, with no loop, to pull a message off of the mailbox. So that’s nice.
The other side of that are the scheduler queues where it’s zero atomic operations to schedule something on a scheduler queue with no looping. Of course, taking something off a scheduler queue because we have work stealing is a CAS loop. That’s a double word CAS loop. Still, it’s a lock-free algorithm but it’s not wait-free. You can have contention on that. In a real world system, because of the nature of how work stealing works, basically, you’re going to max out at two threads contending on something. That’s only when one of those threads has no work to perform, so that’s pretty good.
Werner: You managed to really cut down on any sort of coordination between, or most coordination, between cores.
The only coordination that happens between cores is on message passing itself. Really, it’s only the atomic operations that I’ve just described. That’s the entire set of things that get coordinated on when you write a Pony program, which is pretty nice. It means that a lot of the problems that we tend to deal with at the application level are actually now handled at the runtime level. But it does mean you have a very different style of writing programs.
You’re not going to have shared mutable data structures. You’re not going to have a concurrent hash map where you have lots of different threads modifying the same hash map. That’s just not allowed in Pony. So you tend to write very different programs where data flows through a set of actors, or sometimes an actor holds a set of data and code flows through that actor. For example, you can send a lambda through an actor to operate on the data that the actor holds, or you could send data across actors, even mutable data across actors to allow a sequence of mutation before, let’s say, an order goes to the network in a financial system.
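A sketch of the second style, sending code to the data; the names are illustrative, and the lambda is marked `val` so it is immutable and therefore safe to send:

```pony
actor DataHolder
  let _orders: Array[String] = Array[String]

  be add(order: String) =>
    _orders.push(order)

  // The lambda itself is immutable (`val`), so it is safe to send,
  // but it receives writable access to the state it runs against.
  be visit(f: {(Array[String] ref)} val) =>
    f(_orders)

actor Main
  new create(env: Env) =>
    let holder = DataHolder
    holder.add("buy 100 @ 42.0")
    // Code flows to the data: the mutation runs inside the holder.
    holder.visit({(orders: Array[String] ref) => orders.clear() } val)
```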
Yes. So far so good is what I can say. “Massive amount of cores” is a term I’m uncomfortable with.
Werner: Let’s say a thousand.
We haven’t run on anything near a thousand cores. We haven’t had access to the hardware yet. We’re hoping to have access to a 4,000 core box with two terabytes of memory to do some tests on. Right now, the main problem with boxes like that is getting uncontended time on a box like that. But we certainly run on 64 core boxes with a terabyte of memory and had no issues. In fact, we basically scale linearly across the core count. Obviously, the runtime is NUMA-aware so we’re very careful about how we deal with NUMA nodes on a box that size.
The results right now are great. I mean our benchmark numbers are great. All benchmarks are snake oil. So never believe benchmark numbers, especially mine. But so far, they’re great. We’re beating not only other actor-model languages but we’re beating Java, and even in some cases, C and C++ for execution speed on things.
Yes, that’s a huge part of it, absolutely. Another part of it is the nature of the algorithms you tend to write in an actor-model language. That’s true for all actor-model languages: they encourage writing algorithms that naturally take advantage of how modern parallel machines work, where a lot of times, if you present it with an O(n^2) problem and you write it in the actor-model, it’s not an O(n^2) problem anymore. You write your dataflow differently.
Of course all of this can be expressed in any language. Of course you can write all of this in C. You could write all of this in Java. But it matters to have a language that provides these things as first-class concepts and provides enough of a type system -- actually, Pony has a very powerful functional type system with algebraic data types: generics, tuples, intersection types, discriminated union types, polymorphic functions, higher-order functions, all this stuff. When you have that mechanism around to support you, now you can write this fast code, but you can do it in a way where the compiler gives you a lot more guarantees, and the developer isn’t struggling at every stage with correctness.
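A few of those features in Pony syntax, with illustrative names; a sketch, not a tour of the standard library:

```pony
use "collections" // for Hashable

// A generic class with a type-parameter constraint.
class Cache[A: Any val]
  let _items: Array[A] = Array[A]

// A discriminated union: a value is either a U64 price or an error string.
type Result is (U64 | String)

// An intersection: anything that is both Hashable and Stringable.
type Key is (Hashable & Stringable)

primitive Demo
  // Tuples and higher-order functions compose with all of the above.
  fun split(pair: (U64, String)): U64 =>
    pair._1

  fun map(x: U64, f: {(U64): U64} val): U64 =>
    f(x)
```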
8. If someone was going to look at Pony, what other language experience would help them? If they’re coming from Erlang, do they have an easier time with Pony than if they come from other functional languages?
Sure. I think that’s an interesting question. I think Erlang experience certainly gives you a strong grasp of the actor-model and how that works. Similarly, Akka experience gives you a strong grasp of the actor-model. I’m a big fan of Erlang and a big fan of Akka. I’m a big fan of a lot of languages. I’m not a real believer in competition amongst languages. I’m much more a believer in picking the right tool for the job.
But yes, in terms of other experience, certainly, functional programming, Haskell or OCaml, will make the type system more understandable. On the other hand, it’s also inherently an object-oriented language, with a syntax that’s maybe not dissimilar from Python or Scala, although it’s not whitespace-sensitive. Well, it depends. Some people like that.
Actually, even Python programmers -- in fact, we have this experience -- tend to take pretty well to Pony, because they treat each actor, essentially, as a composable program, which is very natural in the actor-model. Then they just start composing actors to get what they want, which turns out to be a great way to write code. So I think the people that suffer the most are the people that expect it to work like C# or Java, perhaps C++, although C++ programmers tend to be flexible.
Werner: People that worry about the cost of a thread, basically, i.e. who think, “I can only have a thousand threads and then things start to fall over”.
Yes, or who think about concurrency from a threaded point of view, who think about concurrency from a shared mutable data point of view, which are things that -- they’re not merely discouraged by Pony. They’re disallowed.
Werner: You have to really embrace the message passing thing?
9. You already mentioned all the various ways you’d describe the type system. So, can you give an idea of – obviously our audience can look at the language, but is it closer to Haskell or OCaml or the Java type system?
Yes. That’s an excellent question. I think it depends on the part of the type system you’re looking at. When you look at the thing that says, “Class Foo,” which is generic on some set of type parameters, that looks pretty Java-ish. On the other hand, when you look at the part where we have intersection types and union types and tuples and those are all completely composable, it looks pretty Haskell and OCaml-ish. On the other hand, when you look at the fact that Pony supports structural subtyping, that looks a lot like Go. But Pony also supports nominal subtyping, and maybe that looks more like Java.
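A small sketch of the two flavours side by side; the types are made up for illustration:

```pony
// Structural subtyping: any type with a matching `speak` method
// is a Speaker, with no declaration needed (Go-style).
interface Speaker
  fun speak(): String

// Nominal subtyping: only types that explicitly declare
// `is Named` are Named (Java-style).
trait Named
  fun name(): String

class Dog is Named
  fun name(): String => "Rex"
  fun speak(): String => "woof"  // makes Dog a Speaker, structurally
```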
I think the type system is, actually, amongst those features, pretty easy for most people. We tend to use structural types a lot when you don’t care about accidental subtyping, which is pretty much always, and that makes writing Pony code feel lightweight when you’re not developing, say, a core library like collections. When you’re writing actual application code, often, it feels more like a dynamic language; much like Go does, in the sense that “It’s close enough. That’s pretty much that type. The compiler says, 'It’s good enough'. Good. Let’s just go.” And of course, it has local type inference. You only actually specify types at the signature boundary.
The trickiest part for programmers is definitely the reference capabilities, this ability to prove isolation, prove immutability. That takes a little while to get used to. It tends to be -- this is entirely anecdotal, no evidence for this whatsoever -- about a two-week process for people, where they jump into Pony and say, “Oh great, I’m going to write some code”, but the compiler says no.
It can be a little frustrating at first when you start to say, “Why do I need all these annotations? I know what I’m doing.” But the point of the compiler is to say, “We understand that you know what you’re doing.” This is already how programmers think about concurrent code. They already understand which thread owns data and can freely mutate it, and which data is immutable so we know we can all read from it. These concepts of ownership and sharing are inherent in how professional programmers write code, no matter what language you’re using.
But Pony gives a type system that allows you to prove it to the compiler, which means that instead of saying, “The programmer asserts this is true; let’s hope we don’t crash” or “The programmer asserts this is true, but now we need a heavy runtime mechanism to check it,” we can prove it at the compiler level. Then we don’t have to do anything at runtime, but we know the code is safe.
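For example, here is a sketch of the kind of program the compiler refuses; this deliberately does not compile:

```pony
actor Logger
  // Compile error: a behaviour parameter must be sendable,
  // and `ref` is not -- otherwise two actors could end up
  // holding writable aliases to the same object.
  be log(line: String ref) =>
    None
```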
Werner: Basically, the programmer always kind of knows ownership, but it’s not that explicit. Often, it’s easy to cheat and sort of not think about it and say, “Oh yes, that’s fine.” And if I get a segfault, “Oh yes, I should have put a lock there or something.”
Exactly. But when you start expressing it in a type system -- again, ideas we already all have -- it guides your design. In a sense, you say, “You know what? I don’t really care about ever writing to that again. I’ll just make it immutable. Fantastic.” My personal bent, with a background in functional programming, is “Always use immutability unless you have a reason not to”, and basically, that reason is performance. Maybe there’s something else. Could be.
It is. It’s derived from a concept called object capabilities, which is a brilliant chunk of research that began in the ‘60s at the operating system level, about saying, “Access control lists are just wrong. They have all these problems, and what we really want is an unforgeable token that gives us the right to perform certain operations on certain data, so that the operation and the data are tied together.”
That work on capabilities was extended, eventually, into this idea of object capabilities. Funnily enough, that thing that combines functions and data, we know what that is: that’s an object. So a reference to an object is the ability to perform operations on that object. That’s a lot of work by people like Mark Miller, Tom Van Cutsem, and Jonathan Shapiro. In particular, there are some brilliant object-capability languages out there, like E and AmbientTalk. Those are actor-model, object-capability, secure languages. Fantastic, good stuff.
The idea of reference capabilities is to say there’s something else that we want to express about this reference -- not the object, but the reference to the object, which is inherently a capabilities notion -- which is: are we able to read from it? Are we able to write to it? Can we pass it to another actor without keeping a reference? Can we share it with many actors? These are the kinds of things that we need to be able to express at the reference level in order to make concurrency capability-secure.
11. These individual capabilities like sharing or writing something, are they specified individually, i.e. “Okay, I have an object of type string and that’s mutable and sharable”? Or do you have classes of those things? What does it look like?
Yes. It turns out we think we’ve discovered an underlying notion that we believe is more fundamental than the notions of isolation and immutability, and it’s based on something called deny reasoning. We just had a paper at AGERE (co-located with OOPSLA) a few weeks ago on this deny-properties system for reference capabilities. The basic idea is that if something is immutable, what we’re really doing is denying the existence of local or global write aliases. It’s not about this reference to this thing, this alias; it’s about what other aliases can exist. If it’s immutable, we know there are no aliases that allow writing. That’s pretty powerful.
And in this context, local is something held by the same actor that holds this current reference, and global is something held by some other actor. If something is isolated, we have a different set of deny properties. We know that locally and globally, no one else can read from or write to that object through some other alias. Now it’s isolated, but that’s a deny property. That’s pretty cool.
And then if we have an opaque alias, what we call a tag alias -- something that doesn’t allow us to read from or write to the object, but allows us to understand its identity or send it an asynchronous message because maybe it’s an actor -- that doesn’t deny anything, either locally or globally. So that means we have this interesting property: the things that can be sent in messages are the things that make the same guarantee globally as they make locally. It also means we have this matrix of local and global deny properties, and it turns out there are some other cells in the matrix that are really useful, like what we call a ref reference type. We call an immutable type a value type. These are straightforward type system terms.
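That matrix can be sketched as a comment table; this is a paraphrase of the deny-properties idea, not official documentation:

```pony
// The six reference capabilities, read as deny properties --
// what OTHER aliases, local and global, are denied to exist:
//
//   iso  - denies local and global read/write aliases   (isolated)
//   trn  - denies local write aliases, global read/write (transition)
//   ref  - denies global read/write aliases              (mutable)
//   val  - denies local and global write aliases         (immutable)
//   box  - denies global write aliases                   (read-only view)
//   tag  - denies nothing                                (identity only)
//
// The sendable capabilities (iso, val, tag) are exactly the ones
// whose global guarantee matches their local guarantee.
```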
So a reference type is basically like a Java object reference. I can mutate it. I can have aliases to it. I can do all kinds of things, but with an interesting guarantee, which is that we know there are no global readable or writable aliases to it. So we know it is concurrency safe. No one else is going to be reading from or writing to it, so it’s fine for us to read from and write to it. And there’s another type, which is a box type -- basically a black box. We don’t know if it’s immutable or mutable, and we don’t care, because we’re only going to be reading from it. That’s really useful in writing polymorphic code: I only want to write this code once that reads from this object, and I want to write it for every possible reference capability.
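A sketch of `box` used for exactly that kind of polymorphic read-only code; the names are illustrative:

```pony
primitive Report
  // One function, written against `box`, reads from both a
  // mutable `ref` array and an immutable `val` array.
  fun total(entries: Array[U64] box): U64 =>
    var sum: U64 = 0
    for e in entries.values() do
      sum = sum + e
    end
    sum

actor Main
  new create(env: Env) =>
    let mutable: Array[U64] ref = [as U64: 1; 2; 3]
    let frozen: Array[U64] val = recover val [as U64: 4; 5] end
    env.out.print(Report.total(mutable).string())  // a ref flows to box
    env.out.print(Report.total(frozen).string())   // so does a val
```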
Or even algorithms where, in one place, you need to be able to read from something but you want to make sure that place doesn’t modify it, even though you know it may be modified in some other part of the algorithm. Being able to have those guarantees is powerful even within a single actor. And then there’s another type which is a write-uniqueness type, a type that is not unique for reading but is unique for writing. That’s an interesting type. We have some use cases for it, but I’m not sure it’s going to be all that useful to programmers in general.
Werner: What’s one example?
Well, one example is, let’s say you have a booking system where you’re booking a bunch of stuff but you’re free to mutate that until some finalization point where you’re saying, “Actually, we’re close enough to the bookings that now these things are immutable.” So you want to keep the ability to make them immutable later. So you want to make sure you have the only write alias to this thing.
But you also want to share out in the same actor, readable aliases to it because you have a single data structure that represents all your bookings, whether they’re in progress or finalized. Well, that’s okay because you can have the read alias to these things that are currently mutable but will in the future become immutable, which is a powerful concept. I’m not sure how many algorithms are going to need it, but it’s a fun concept.
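A sketch of that booking pattern using Pony's write-unique capability, which is spelled `trn` (for transition); all the names here are illustrative:

```pony
class Booking
  var price: U64 = 0
  new iso create() => None

actor Archive
  be store(b: Booking val) =>
    None

actor Bookings
  // `trn`: this actor holds the only WRITE alias anywhere, but it
  // may hand out read-only (`box`) aliases within the actor.
  var _current: Booking trn = Booking

  be reprice(price': U64) =>
    _current.price = price'     // still mutable: write uniqueness is ours

  fun ref peek(): Booking box =>
    _current                    // a trn aliases as box, not as trn

  be finalize(sink: Archive) =>
    // Destructive read swaps in a fresh booking and yields the old
    // trn ephemerally; giving it up turns it into an immutable val,
    // which is now safe to share with any actor.
    let done: Booking val = (_current = Booking)
    sink.store(done)
```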
12. Definitely. I think we already mentioned what kind of experience users have learning the language, the big stumbling blocks. Have you found any others, or have you found anything in the language that you think should be different, or where you’d say, “I should take a shortcut here”?
One of the big ones, where there’s a lot of pending work, is reflection. Pony doesn’t currently have a reflection API. Obviously, we could write one, but we want to make sure that we design a mirror-based, capability-secure reflection API. So that’s going to be interesting. Right now, you have to write more code than you might want to for some tasks because there’s no reflection API.
Or for example, right now, Pony does not allow types that are dependent on values. It will. That work is upcoming. We’ve already designed it. But it has some interesting stuff about what is a compile-time expression that can generate a value a type can depend on, and what isn’t. Obviously, we’re not talking about full Agda- and Idris-level dependent types. We’re not that cool, but just getting values as type parameters is already a huge step in terms of what we can do.
And there’s other work too, like metaprogramming. Pony doesn’t have a macro system right now, and a lot of that is because we’re not sure what the right entry point for a macro system is. It’s not that I’m against metaprogramming. It’s great. But are hygienic macros enough? Maybe not. Generated functions in Julia, for example, I think are a very exciting way to approach the idea of a metaprogramming system. So we’ll see what happens with that.
Werner: Okay. Lots to look forward to.
Yes. I think another thing I should probably talk about in terms of looking forward is that actor-model languages are usually distributed, as well as concurrent. So you can run them on many nodes, not just one. Pony’s distributed runtime is vaporware right now. We have a prototype, but it’s not publicly released and it won’t be for a while. We’re still working on it; Sebastian Blessing wrote his master’s thesis on how to do this. So we know how to do it, but it’s not there yet. So if you’re expecting to be able to dive in and write distributed Pony straightaway: not yet, sorry.
I hope we just call it Pony because hopefully, doing work in a distributed context won’t be unusual.
Werner: It could be transparent in a way?
Yes. In fact, distributed Pony is a little bit different from other distributed actor-model languages, in that it allows the runtime to migrate actors itself. Because we can garbage collect actors, we can also safely and usefully migrate actors, which is kind of interesting. It means that the same program, with no changes, runs on a single node, or many nodes, or many nodes with many cores, and the source code doesn’t have to be aware of the topology, because the runtime is. So that’s kind of fun.
Ponylang.org, please. And please do more than check it out. We’re looking for people to write any kind of code at all. Do you want to hack on the compiler? Great! Do you want to work on type system theory? Fantastic! Do you want to write tooling? Do you just want to write little sample programs, or documentation? Anything. It’s obviously open source and will be open source forever, and we’re eager for involvement. We think there are so many good ideas out there in the community that we’re extremely pleased every time someone comes along, and we’ve had so many good contributions already from our small but growing community.
I have to credit my friend, Nathan Mehl. Years ago, I complained and complained about whatever programming language I had to be working in, giving this laundry list of things that I wanted out of a proper programming language. He would just say to me, “Yes, sure, and I want a Pony.”
Werner: That’s a great way to answer it, and thank you Sylvan.
Thank you very much.