José Valim on the Elixir Language, Concurrency, Iteration

1. We’re here at GOTOCon Aarhus 2014, we’re sitting here with José Valim; so, José, who are you?

I am the creator of the Elixir programming language and I am also the cofounder of Plataformatec, which is the company that is investing in Elixir and allows me to work on it full time and bring the language forth.

2. What is Elixir?

If you go to our website, our description is that Elixir is a dynamic functional programming language that was designed for building scalable and maintainable application systems.

Werner: You built Elixir on the Erlang VM.

Yes.

3. What is the reason for doing that? Many languages are built on the JVM to get the reach, but the Erlang VM, what is the reason for doing the language there?

I’ll go back a little into the story because it is very important to the why. I am on the Rails core team and I was working on making Rails thread safe; the Rails team knew that concurrency was becoming more and more important, so you want to be able to deploy a Rails application on a machine with four to eight cores and have it just work. When I stop and look back now, the whole idea of making Rails thread safe is interesting, because saying it’s thread safe is saying it’s not going to crash when you use threads, that is, when you use a tool to leverage concurrency and get all the cores running your Rails application.

So I was working on this particular scenario, and whoever has worked with threads knows it’s hard to guarantee something is thread safe; you cannot guarantee that it’s not going to crash under some particular conditions, and the bugs are usually very hard to reproduce and hard to fix, too. In general it was very frustrating work, and I ended up stopping and saying: if concurrency is becoming more and more important, I need to find good solutions to these problems; I don’t want to work with the tools and abstractions I have here for the rest of my life. I knew there was a category of programming languages that supposedly handled concurrency really well, the functional programming languages, so I ended up studying different programming languages, such as Haskell and Clojure.

Eventually I found Erlang and I really, really liked it; I say I fell in love with the Erlang virtual machine. The realization for me was that the Erlang virtual machine is at this point almost three decades old, and when they designed it nobody was working with concurrency; what they wanted from the Erlang language, the virtual machine and the runtime was to write distributed software. So you have your code running on this machine, some other code running on that machine, and they solved that problem. Then, when concurrency became a thing, they realized concurrency is just a special case of distribution, actually an easier case because everything is running on the same machine, and eventually it became very good for concurrent tasks too. To me that was quite amazing because they solved the long-term, big problem; they took the elephant out of the room.

So that was very interesting and then I started to use it more, develop a few things with it, and eventually the way I describe it is I liked everything I saw in Erlang, but I hated the things that I didn’t see, so eventually I came up with the idea of doing my own programming language that runs on the Erlang virtual machine that explores different things than Erlang does.

4. You mentioned there were things you didn’t see in Erlang. Is Erlang a language that hides certain things or that has certain magic in it to make things work?

It’s not about magic; there are a couple of things. For example, I came mostly from Ruby, and in this research, when I looked at different programming languages, I really liked Clojure, too. Both of those languages have metaprogramming as a feature, the whole idea that you take the language and extend it to other domains, and that was one of the things that was missing for me at that point. Another thing was polymorphism. In Erlang most of the polymorphism we have is structural polymorphism, when you are doing pattern matching, but with it you need to pack everything inside a module, compile that module and ship it; you cannot actually extend it.

So if you provide a library and you are doing some pattern matching, I cannot get a library and say I want to handle those new types, for example; after you ship it, you shipped it. If I want to try a new type, I would have to fork the code, and that doesn’t scale in software development because if ten people are using your library and they have their own types or want to extend it, it doesn’t work. So we need a way to have a parametric polymorphism, where I say I define this protocol for you, we call them protocols but they are similar to type classes in Haskell, for example, conceptually, so if I say this is a protocol and my code can say “I am willing to work with any data structure that implements this protocol” and then any data structure can implement that protocol whenever it wants and we are going to do the proper dispatch at runtime. That is something else, I say polymorphism because I need to have good tools for polymorphism.
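A minimal sketch of the open, per-type dispatch described above. It is written in Python rather than Elixir, using the standard library's `functools.singledispatch` as a stand-in for a protocol (Elixir's own constructs are `defprotocol` and `defimpl`). The `size` function and the `Matrix` type are hypothetical names chosen for illustration.

```python
from functools import singledispatch


# The "library" side: a generic function that dispatches on the type of its
# first argument. Think of it as the protocol definition.
@singledispatch
def size(data):
    raise TypeError(f"size is not implemented for {type(data).__name__}")


# An implementation shipped with the library.
@size.register
def _(data: list):
    return len(data)


# The "user" side: a new type defined after the library has shipped.
class Matrix:
    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols


# The user adds an implementation for their own type without forking or
# recompiling the library; dispatch happens at runtime.
@size.register
def _(data: Matrix):
    return data.rows * data.cols
```

Calling `size([1, 2, 3])` uses the library's list implementation, while `size(Matrix(2, 3))` uses the one registered later by the user.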

The interesting thing is that at this point I’ve been doing software development for ten years and for half of this time I was writing tools for developers, libraries and so on. In 2009, when I started, I had projects in Ruby and in Rails at the time when these were getting some traction, so I was quite used to writing tools for other developers to use, and that’s exactly when it hurt, because I thought it was very important for me to continue doing the work I am used to today.

5. You already mentioned a few interesting features of Elixir, maybe we can step back a bit and just have a look at what’s the nature of Elixir. Is it a functional language, with just functions, do you have some sort of class system, you mentioned types, type classes?

It’s a functional language, if you are going very basic in terms of code, it’s a functional language, you package your code inside modules, but I don’t really like a lot the functional language description; in my talk that I am giving here at GOTO I say Elixir is a functional programming language, but more than a functional programming language, it’s a concurrent programming language, more than being concurrent is being distributed, and that’s one of the parts I like to focus on because when you come to Elixir you need to start to think how you design in terms of those processes, so I think that's the big difference, even with other functional programming languages. So I said process and it is very important to make it clear that what we mean by process is a lightweight thread of execution, not an operating system process.

Werner: An Erlang process.

Yes, they are the same as Erlang processes; for the virtual machine it’s exactly the same thing. They are very cheap and they are isolated from each other; they don’t share anything, which means I can have all of them running concurrently because they are isolated, and every time they need to do any kind of coordination or exchange information, they send messages. That’s one of the interesting aspects of the language when you come to it, and that goes more to the whole concurrency, distributed side. But if we step back and talk just about the language being functional, it’s a functional language with pattern matching, organized in modules. It’s a dynamic language, so we don’t have types per se, but we have tagged types, which we call structs; you can get information from them at runtime and implement protocols for them with specific behavior, so those are some of the important features of the language. I also talked about metaprogramming, so we have a macro system similar to Lisp macros, and one of the other features of the language which is really important is that we also have a very strong focus on tooling.
And I think Go was a language that showed how tooling is important, the Go developers focused on tooling, very strong tooling since the beginning and that paid off really well because the developers can just come and have a very good setup to get started and start using it and I think sometimes even in functional programming languages they forget about this aspect, they focus on the language and on the things instead of “How do I get started?”, well “Just get this thing from over here, compile, maybe fiddle with this makefile and then you need to go elsewhere and configure an editor and if you want to have dependencies go download or search for this other thing”, so the experience of this “I install this thing and I want to start and be productive with it straight away” is very important too.
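A rough sketch of the message-passing style mentioned earlier in this answer, written in Python as an analogy only. Python threads are not isolated the way Erlang VM processes are, so the isolation here is a convention: the worker keeps its state in a local variable and is reached only through its mailbox. The names `counter` and `run_demo` and the message shapes are assumptions made for the example.

```python
import queue
import threading


def counter(mailbox):
    """A worker that owns its state and only reacts to messages."""
    total = 0  # private state; nothing outside this function touches it
    while True:
        message = mailbox.get()
        if message[0] == "add":
            total += message[1]
        elif message[0] == "get":
            message[1].put(total)  # reply on the mailbox the sender provided
        elif message[0] == "stop":
            break


def run_demo():
    mailbox = queue.Queue()
    worker = threading.Thread(target=counter, args=(mailbox,))
    worker.start()

    # Coordination happens only by sending messages.
    for n in (1, 2, 3):
        mailbox.put(("add", n))

    reply = queue.Queue()
    mailbox.put(("get", reply))
    result = reply.get()  # wait for the worker's answer

    mailbox.put(("stop",))
    worker.join()
    return result
```

Because the mailbox is processed in order by a single worker, `run_demo()` returns the sum of the three values sent before the `get` request.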

Werner: That’s the philosophy of Rails, it’s what made Rails so popular,...

Yes, it’s just a very good getting-started experience. I think it’s getting to the point that everyone is realizing that that's very important and we were able to increase awareness in functional programming in general because of it by improving the tooling. And it’s even more if you are going for something like if you want to look at Java as a programming language or .NET ones, they even do more, people are very used to the IDE and you actually just install the IDE and it’s going to install the language and everything and you don’t need to worry about it. So it’s very important to nail this experience.

6. You mentioned protocols; are they like Clojure protocols?

Yes. I was saying I was missing a couple of things, and at some point I said I want to have polymorphism; I didn’t know I wanted protocols, I knew I wanted polymorphism. So I went to see what other languages were doing, and it was similar to how interfaces work in Go, protocols in Clojure and type classes in Haskell; conceptually they are solving the same problem. I went with protocols because Elixir and Clojure are very similar in the sense that they are both dynamic languages running on existing virtual machines, so when you go down you see the similarities, and we have a lot of them; I say Clojure is one of the top three influences on the language. I also said I am going to go with the same naming as Clojure instead of trying to come up with a whole different name or using something like type class, which would cause confusion since we don’t have classes.

7. So where do your pattern matching features come from? Are they more from the Erlang side or somewhere else?

They come from the Erlang side. We haven’t implemented pattern matching ourselves; we weren’t looking at pattern matching papers to find the most efficient way of doing it, because we just compile to an internal representation of the Erlang compiler and it does everything for me, so I am just using the features that are there. It’s interesting to point out that most compilers go through many stages as they compile, so there are different representations you could target when compiling your code, and I chose a fairly high-level representation that is very close to Erlang because I wanted the semantics to be as close to Erlang as possible in many cases. I think eventually, when we have more, I am going to say confidence, that’s not exactly the word, but more confidence in our compiler and in where the language is going, we can compile to something further down the stack and get some features out of it. But at the beginning it was very important, because when you approach language design there are just so many options you can choose and directions you can go; it’s a puzzle, more like Jenga, where you need to be really careful how you take pieces out and put pieces in. I thought at the beginning that having this constraint of targeting something high level would help with productivity, because without it the number of options would just multiply. We took two and a half years to reach 1.0; if I had been working at a lower level, I probably wouldn’t have released 1.0 yet.

8. So, metaprogramming: I think you mentioned a macro system, but Elixir has a syntax, it’s not like Lisp, so that’s tricky to do; how does that work?

That was a big part of the challenge, because it was like polymorphism: I knew I wanted polymorphism but I didn’t know how. I wanted metaprogramming and I really liked Lisp macros, but I didn’t want to have a Lisp because there were already two other Lisps running on the virtual machine. So the challenge was how to add macros with a “natural syntax”. What we did is keep the syntax quite regular. It’s not a Lisp in the sense that what you see is what you get, not exactly; we have an AST, but the translation from code to AST and from AST to code is really, really straightforward, and because the syntax is regular we also have quote, quasiquote and unquote as you find in Lisp, and it just works because we kept the syntax simple. Just as an example, operators in Clojure use prefix notation, with the operator first and then the operands, so we also started with that, and it is reflected in the AST. Eventually we said let’s add some syntactic sugar so that the operator is in the usual notation between the operands. So we were adding syntactic sugar at some points to get this natural syntax; we kind of started with something very Lisp-like, then we added sugar and some convenient notations, and we ended up with what it is today.

9. Are you using macros to implement some of the syntax or is it a feature that developers can use?

If we go into the compiler, we have the tokenizer and the parser, which at this point already emits Elixir AST, so developers can write macros that receive this AST and emit new AST or transform it. This part today is implemented in Erlang and it’s for the compiler, but the macro system is there. So if you go to the Elixir codebase, just 10% of it, which is mostly the compiler and what we call special forms, which are the same as special forms in Lisp, is written in Erlang, because we need those to bootstrap everything. Everything else in the language is implemented in Elixir itself, often with macros. I said we have tagged types, tagged values, which are structs: they are implemented with macros, protocols are implemented with macros, and a bunch of the constructs you have in the language like cond, if, ... are all implemented with macros, too. So it’s good that we were able to keep most of the language written in Elixir itself, and it’s always nice to dogfood your API or your language.
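To make the idea of "code that receives an AST and emits a new AST" concrete, here is a loose analogy using Python's standard `ast` module. It is not how Elixir macros are written (those use `quote` and `unquote` and run at compile time); the transformer class and the rewrite rule, which turns additions into multiplications, are arbitrary choices made only for illustration.

```python
import ast


class SwapAddForMult(ast.NodeTransformer):
    """Receives an AST and returns a transformed AST, a bit like a macro."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


def rewrite_and_eval(source):
    tree = ast.parse(source, mode="eval")   # code -> AST
    tree = SwapAddForMult().visit(tree)     # AST -> new AST
    tree = ast.fix_missing_locations(tree)
    return eval(compile(tree, "ast-demo", "eval"))  # AST -> running code
```

For example, `rewrite_and_eval("2 + 3")` evaluates the rewritten expression `2 * 3`.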

10. Dogfooding is always essential. So, we have to get to the hipster side of functional programming. Last year I think at the excellent CodeMesh conference you were working on Iteratees and integrating them to the language and the library, so what’s the state of that?

At that point we had something that was similar to Clojure sequences or iterators, although iterators usually give the idea of being object oriented. It’s basically the idea that if I want to traverse or map over a data structure, I just call next, next, next until I am done. That had issues: if you are working with a resource, for example, it’s the responsibility of the code that is calling next, next, next to open the resource, close the resource and do something when something goes wrong or something unexpected happens. This is bad because theoretically it should be the responsibility of the data structure being iterated; if this data structure is a file or something that represents a file, that logic should live there and should not be delegated to the code that wants to do the iteration. So I knew there was this particular issue that I wanted to solve, because in many languages, if you are working with pure data structures that are only in memory, you use one library to traverse or map over them, but if you have I/O you need to use a completely different abstraction. I didn’t want to go this way; I wanted to have one abstraction that could work for everything.

So I knew that sequences were not the way to go, and this was not long after reducers came out in Clojure, which solved exactly that. Instead of asking what the next element of the structure is, the whole idea of reducers is that I pass the reduce function to the data structure and the data structure does everything. So if you need to close a file when something goes wrong, it’s always the responsibility of the data structure; the calling code doesn’t need to worry about it. That’s good because if you were implementing map or filter with next, you would always need to worry about handling resources, and now that it’s the responsibility of the data structure, things become really straightforward. The issue with reducers, though, is that they don’t allow core enumeration, they don’t allow something like zip, because I basically have a function that I give to the data structure and the data structure does everything; it doesn’t communicate back, and I don’t have a way to say start reducing, then stop, then do this and then do that. I was thinking about this for a while, and at CodeMesh I went to Jessica Kerr’s talk and she had exactly the description of what I had in my mind: what the problem is, where I am stuck right now, she said exactly that.

And then she said “So we have this thing called Iteratees” and she was talking about a Scala library, I think Scalaz in her talk, so it used Iteratees to solve those things. And I said great, that is the solution to all my problems, so I talked a lot with her to see how we could get that into Elixir, so today when you want to traverse a list or a file, an I/O resource in Elixir, you are going to use something based on Iteratees and it’s based because for those familiar with Iteratees you receive the collection, the thing you want to map over, reduce, traverse or whatever and a function and that’s it and every time you need to have a state, for example if you want to take five elements out of a collection you need to know how many elements you took so far, so what they do is all these states they are closures over these initial functions, just wrapped inside a closure, inside a closure, so on, and that’s quite expensive, so, what we did for Elixir, I am calling them Reducees because they are like Iteratees but you still have an explicit accumulator and with the accumulator you pass instructions, like I want to continue reducing or wait, I want to suspend this reducing and then you give back to me and then I am going to call back and we also implement core enumeration and everything with that, so that’s the solution we have in Elixir right now and it’s really interesting because we were able to get the good idea of reducers mixed with the good idea of Iteratees and come up with something that is efficient and we don’t lose expressive power, we can still do core enumeration, and so on.
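A simplified sketch of the idea described above, in Python rather than Elixir: the data source drives the reduction and owns its cleanup, while the reducing function returns an instruction together with an explicit accumulator. Only "continue" and "halt" are modeled here; suspension, which the answer mentions as the basis for operations like zip, is left out. The names `reduce_source` and `take`, and the `log` list standing in for opening and closing a resource, are hypothetical.

```python
def reduce_source(source, acc, fun, log):
    """The source runs the loop and is responsible for its own cleanup."""
    log.append("open")  # stand-in for acquiring a resource such as a file
    try:
        for item in source:
            instruction, acc = fun(item, acc)
            if instruction == "halt":
                break  # the reducing function asked to stop early
        return acc
    finally:
        log.append("close")  # always released, even on early halt or error


def take(n):
    """Build a reducing function that keeps the first n items (n >= 1)."""
    def step(item, acc):
        acc = acc + [item]  # state lives in the accumulator, not in a closure
        return ("halt", acc) if len(acc) == n else ("cont", acc)
    return step
```

With `take(2)` the reduction stops after two items, the rest of the source is left unconsumed, and the close step still runs, all without the calling code handling the resource itself.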

Werner: So they are actually useful. Really useful.

Yes. Anybody using Elixir is definitely relying on those, and I believe they are going to be increasingly useful. Right now you can have a file, which has side effects, as the source when you are enumerating or even streaming data, but we don’t have operations that...; so let me step back a little bit. We have enumerables: any collection that you can traverse we call an enumerable. Elixir is an eager language, so when you call map we are going to traverse that enumerable and give you a list. We also have something called streams, which is the ability to express all the computations lazily: I can say I have this collection and I want to map over it, filter it, do this and do that, and at this point no computation has been done; at the end, if you had a collection with 1000 items and you say I just want five, we are going to do those computations just for five and then we stop. So you have this whole idea of streams, and that’s really useful because the eager mode and the lazy mode both have very good use cases depending on what you are doing. But we don’t have any operation that uses side effects yet. The source, which is the data structure, your collection, can be a file that has side effects or state, but we don’t have things like merge: for example, if you have two files or two I/O streams and they are sending data at different rates, and I want to merge them, that also requires state and side effects for me to know who is going to send me the data first. This interaction is something we want to explore in the future, and it’s all being built on top of this foundation of Iteratees and Reducees. Hopefully, as we continue evolving the language and bringing in those constructs (we are actually having a lot of talks here about reactive streams, which receive data and extract computation from this data that is coming from different places), it will become more interesting and we will see how well that model expands and scales to those use cases.
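The eager versus lazy distinction described in this answer can be illustrated with a small Python sketch; it is an analogy, not Elixir's `Enum` and `Stream` modules. The `demo` function and the call counter are assumptions added to make the amount of work visible.

```python
from itertools import islice


def demo():
    calls = []

    def double(x):
        calls.append(x)  # record every time the computation actually runs
        return x * 2

    # Eager: the whole collection is traversed and a full list is built.
    eager = [double(x) for x in range(1000)]
    eager_calls = len(calls)

    # Lazy: map and filter only describe the pipeline; nothing runs yet.
    calls.clear()
    pipeline = filter(lambda y: y % 4 == 0, map(double, range(1000)))
    calls_before_take = len(calls)

    # Asking for five results pulls just enough items through the pipeline.
    first_five = list(islice(pipeline, 5))
    lazy_calls = len(calls)

    return eager_calls, calls_before_take, first_five, lazy_calls
```

The eager version calls `double` once per element of the 1000-item collection, while the lazy pipeline does no work until results are requested and then processes only a handful of elements to produce five.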

Werner: Definitely streaming data or accessing streaming information or lists, there is a lot of research and work in the community right now on Iteratees and reducers and transducers and all these things, it’s interesting where this is going to go, also with the reactive people.

My theory is, and I think everyone is realizing that, that everyone is working on the same problem, sometimes they get to it from different perspectives, when they added reducers to Clojure, at least what they wrote is that they were thinking more about parallelism, we can express this computation and then they are going to do parallelism on top of our data and then you have the Rx folks, for example, working more on the data streaming and so on and I think that what everybody has started to realize is that they can use the same abstractions and that’s what we aim with Elixir to do both the data parallelism that you see in Clojure reducers, but also the pipeline which is the task parallelism where you express the computations at different stages and have your data going through. So there is a lot of research and people talking about it and I think we can have everything on top of these abstractions that are having different words, if you want to do this you have to do this particular thing or you can put everything into the same place it would be really, really nice.

11. This leads us to the inevitable question of what is your favorite monad?

I’ll go with async workflows in F#, they don’t call them monads.

Werner: But they are.

Because I really like the idea that you can express it as a monad and it is going to figure out who depends on what, and it does a dataflow style: I am going to calculate this asynchronously, and just when I need it I am going to get the value and give it to you. I don’t know if they added it to F# or if it was just research, but they also added a kind of match inside those async expressions, so sometimes you see a pattern match with the async computation. Sometimes you make two async requests and you care about just one of them, the one that arrives first, so how can you say that inside the async workflow? They came up with really interesting constructs where you can pattern match on async workflows, and if you say I care about this one, and the other one I am using in another part, I don’t care about it, they compile that to something that knows I just care about the first async workflow, and when the other one comes and I already have the first one, it can just cancel it, and so on. So it’s very interesting what they are building in that area. Concurrency is one of my big interests, and it’s really interesting to see how they use the async monads and try to make them as expressive as possible.

12. Has that influenced you or some of your work in Elixir?

Not visibly. I thought a lot about adding something that would give me the same expressive power, so I could say do this async and so on, but we already have tasks, which are a way to say I want you to execute this code, and it is going to execute asynchronously; when I care about the result I just say “Give me that result”, and it will block until the result is available. That is good enough. In C#, and in the libraries we have for Clojure that do async/await, there is a lot of code mangling because everything needs to be rewritten internally into something more like a callback style; we don’t need to do any of that. So we have a very good model already, and at this point I don’t think we need something that is going to be super, super expressive, because what we have naturally is already very good and expressive; we don’t need to go into a specific expression that has special powers, we always have these special powers already.
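The task model described here (start some work asynchronously, then block only at the point where the result is needed) has a close analogue in Python's `concurrent.futures`, sketched below. This is not Elixir's `Task` module; the `square` and `run` functions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    return n * n


def run():
    with ThreadPoolExecutor(max_workers=2) as pool:
        task = pool.submit(square, 7)   # starts running in the background
        other = sum(range(10))          # the caller keeps doing its own work
        return other + task.result()    # blocks until the value is available
```

The calling code stays in plain sequential style: no callbacks are needed, and the only synchronization point is the call to `result()`.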

13. Well, that’s a lot of stuff to look at, for us, so where do we find Elixir?

Just google Elixir-lang; our website is Elixir-lang.org. When you get there you have a getting started guide which is going to tell you the basic features of the language, pattern matching, structs, protocols, how the I/O system works, just trying to give you an overview of what you can do there. We also have the advanced guide, which goes into building an actual application with it; if I remember correctly it’s a very simple distributed key-value store that you build in this advanced one. So it’s very interesting: we go into building the application and designing your system with supervisors, which guarantee that if something crashes a new thing is started in its place. We also have plenty of links to books and screencasts on our homepage, so if you start with the getting started guide and then you want something more in depth or structured, there are plenty of resources there, too. If people are looking for more interesting cases of Elixir in production, we had ElixirConf in July 2014, so it was not long ago, and we had very interesting talks about people using Elixir in production; if you are interested in this production side and need help designing a model or software, you can look at those talks. There is also my talk here at GOTO, which they are recording, where I talk about how, when you are using Elixir, you approach designing and building applications and systems differently, in a way that lets you leverage distribution, fault tolerance and so on. So that is also a good resource if you want a more general overview of the language and the foundation that we get from the virtual machine.

Werner: Ok. So I think we are all going to check it out, thank you, Jose.

Thank you.
