BT

Felix Klock II on Parallel JavaScript
Recorded at:

Interview with Felix Klock II by Werner Schuster on Feb 10, 2014 |
21:36

Bio Felix Klock II (@pnkfelix) is a research engineer at Mozilla and works on enhancing the SpiderMonkey JS virtual machine, and also on the development of Rust, a safe, concurrent, and practical systems language being used at Mozilla to prototype experimental browser architectures (Servo).Prior to working at Mozilla, Felix worked at Adobe Systems on Tamarin, the VM for the ActionScript language.

Code Mesh London is an annual conference dedicated to non-mainstream technologies. In 2013 it featured talks from over 50 inventors and experts in languages, libraries, operating systems and technologies that handle the programming and business challenges of today. Programming languages discussed ranged from technologies that have been around for a while such as Haskell, Clojure or Erlang to new languages such as Elixir, Rust, Go and Julia.

   

1. We are here at CodeMesh 2013 in London I’m sitting here with Felix Klock II, so Felix who are you?

I currently work for Mozilla Research, I come from a background of working in compiler technology and also in language runtime technology, I’ve run the gamut from working on C and C++, Fortran compilers for an embedded tools company and then over to for my PhD work, I worked on Scheme runtime system implementations, JITs and specifically GC work for a Scheme implementation and I went from there to work on Flash and then I worked for Adobe, I worked on the Flash runtime, I was not using Flash, and then finally I’m now at Mozilla working on two projects for Mozilla Research, one being Parallel JavaScript which is part of the SpiderMonkey JavaScript Virtual Machine and also Rust, a research language that we are prototyping now.

   

2. Parallel JavaScript intrigues me, so we obviously know that JavaScript is single threaded, so how do you put the parallel in JavaScript?

That’s a really good question, there are a couple of different strategies that people have proposed for putting the parallel in JavaScript, for example just throwing in thread semantics and living with the damage that might cause in terms of nondeterministic behavior and having some sort of locking systems etc. My strategy for Parallel JavaScript is quite a bit different, so our strategy there is that we we add a certain set of methods to arrays for certain operations like map and reduce and filter and oftentimes the higher order function you pass in to those functions map and filter and so on, it often times is pure, something that actually there is no interference in between the different invocations and so hypothetically you could just fire off those invocations on separate cores on your system, and so our strategy for Parallel JavaScript is when a programmer thinks that their higher order function is pure and doesn’t mutate globally shared state and doesn’t try to do anything in terms of communication between those distinct invocations.

If the programmer believes that and opts into using the parallel variants of map and filter and reduce and so on, once they do that Parallel JavaScript the way it works is it dynamically compiles a different version of the higher order function that you passed in, one that it’s specialized for parallel execution in particular it uses the type information that it has gathered from the already run invocations of that routine in order to inform the JIT compiler but also it inserts essentially write barriers or what we call it write guards on every write that you make that it can't prove is to local state within the functions, so anything allocated on one this invocations gets allocated to a local arena associated with that thread, and so it’s safe to mutate local state of one of these higher order function invocations because you can’t expect JavaScript programmers to just abandon mutation, it’s built into the language definition, it would be ridiculous to try to do that, so instead it’s illegal to mutate things that are allocated locally and we dynamically check because the programmer made an assertion, when they opted in to use these variants of these routines, they said: “I believe my function is pure” so we add these write guards that actually check that it doesn’t attempt to write to any global state and if you violate that guarantee we bail out of the attempted parallel execution, roll back the state and rerun sequentially, so the end effect is that if the programmer was right, their programs run fast and make use of multiple cores.

If the programmer was wrong and chose poorly, they don’t get crashes, they don’t get deadlocked, they don’t get all the kinds of hard stories that you end up with typical threading models. If they chose poorly all that happens is we might attempt to run in parallel, have to give up and rerun in sequential, so it will be slower than if they had chosen correctly to say: “Oh no, my function is impure, lets run the sequential variant”. So that is the basic strategy and how we are trying to add parallelism to JavaScript, the important part of the story there is that we want to keep the semantics on the web the way they currently are in that, yes, there are sources of nondeterminism because it’s an event based world and I/O is coming in, but the programming model in JavaScript is to make all that explicit in terms of having to register callbacks on events. We don’t want to start putting the nondeterminism into the things that look like they are running sequentially, we want to keep the metal model alive that you actually get deterministic semantics as best as we can.

Werner: The single threaded semantics are quite nice, I mean coming from Java it’s always, so it’s nice to be able to know that this variable is not going to change unless I change it, in a multi threaded world anybody can fiddle with whatever. So it’s definitely a nice model, and I like this idea of this transparent fallback, so is there also in this model, is that do we have reporting for these fallbacks, when I program how do I know that I haven’t accidentally …

That is a really, really great question, so we acknowledge upfront, we’ve know in the research team that an obvious problem with this model is if someone just starts or someone reads a blog post or a skims a blog post about this technology and just starts peppering their code with calls to the parallel routines without any thought as to whether it actually is a pure routine that they calling or not, they are going to see slowdowns and we know that we will need to have some sort of feedback mechanism that help programmers diagnose when routines have to fall back on sequential execution and more importantly you know, you are just saying: “We had to fall back to sequential execution for some invocation” that isn’t good enough on it’s own because the semantics of JavaScript are very funny in terms of what operations can actually cause some sort of impure behavior, certain kinds of String operations or even certain kinds of lookup operations on objects can cause, you know because you have getters that kind of thing.

So it can be very unintuitive about what operation you did was deemed impure by the runtime system and we also wanted to make it easy for engine implementers to adopt, there is no way that if we just put this in Firefox that we can expect it to survive on the web unless we can get buy-in from other web browser vendors. So part of the goal here is also to make sure that the technology is flexible enough to be adapted to different browser implementation strategies. We want to make sure we have something that Apple can put in Safari, Google put into Chrome and so on, and so because of that we are deliberately leaving underspecified exactly what pure means, so if you don’t want to bother making your regular expressions implementation threadsafe, so it can be invoked by different threads, by Parallel JavaScript, that is ok, just treat it like any invocation of a regular expression operation as something that is impure and it’s going to cause sequential fallback, and then the runtime system will just fall back to sequential mode, but because of that it’s going to be super important to have a good debugging support to provide the feedback about where your failure occurred and what the call stack was, if possible, and why, what did I do in this code that caused Parallel JavaScript to bail out on the attempt.

But unfortunately we don’t have that tool in place yet, we are working on it, we have three researchers at Mozilla working at this project, some part time, but there is been some effort at least try to figure out what needs to change in our debugger to better support this. We aren’t at the point yet where we provide the kind of integrated tooling that we think will be necessarily in the end, but that goal would be putting the cart before the horse trying to implement that, it’s important first to just even make sure that we can implement the technology, get the performance and prove to the other browser vendors that there is a demand for something like this in the community or that there's a desire for it.

   

3. So what data types does this work on? Is it Typed Arrays mostly, can you use anything else?

So the original implementation, this was work that was first done at Intel and we are doing this work with Intel, they provided us with a prototype to use and so originally it was designed to work on arrays and on certain variants of arrays like I think the Canvas, things that are array-like certain kinds of image structures to be able to process them, process pixels in parallel, that kind of thing, but we are working on generalizing it now because we found that at least with the implementation strategy that we chose, when you have the series, so the model here is that we are reusing existing JavaScript APIs for map and filter and reduce and so we describe a parallel computation as a series of transformations on array and thus you might have a map of a filter of a reduce or the other way around, I guess, but a pipeline of some form and the point here is that if you do that naively you end up allocating intermediate arrays for each of those intermediate values there.

And we found this was clearly a problem in terms of overheads and not in the kind of efficiency that we were hoping to see, and there is a couple of different ways of working around this problem, you can try to make your JIT compiler smarter, but we are trying to get a different approach of taking what was already out in the world in terms of Typed Arrays and generalizing it slightly there already was work being done independently of Parallel JavaScript to add something called Binary Data where they wanted a more flexible way to describe basically this structure of streams of information and make it easier for people to get the same kinds of layout control of both the objects they allocate and also they have a correspondence between their representation in memory and what it looks likes on a stream. I’m not an expert on the Binary Data specification so I don’t know if we’ve got actually got to the point of making it easy to parse these things off a stream but I have to believe that's somewhat likely, in any case the point here is that when we saw the Binary Data Spec we saw that they took Typed Arrays and said: “You know what? Let’s make something that just doesn’t have array of int and array of float and all these things, let’s actually have some structured data have arrays of these structured data or structures of arrays and so on, and basically giving you something much like a C style type system for describing how memory is laid out, and when we saw this, we saw an opportunity for Parallel JavaScript in terms of saying: “Look, they already have an array type in this type expression language that they are building up for Binary Data”, if we take the methods that we are already planning on adding to array and put them on Binary Data arrays as well, then we’ve got the benefits of having extra guarantees, the programmers have the benefit of having extra guarantees of knowing how the memory is actually laid out, you wouldn’t have to deal with dynamic interpretations of whether this word in the array is a JavaScript object or if it’s all intss, that kind of thing.

You have some guarantees in terms of saying I allocated an array of ints, is just like Typed Arrays in that way. Those benefits that the programmers see, they feed back into our JIT compiler in terms of being able to do better jobs, the better job compiling Parallel JavaScript routines that are operating on Typed Arrays and we also, really cool and exciting when we working on, is basically adding a way to, adding as part of the API a way to say: “I’ve allocated this much storage space and now I’m going to fire off a bunch of parallel threads and I’m going to give each one essentially an out pointer that points into the target storage that we are trying to get into and each one of those threads is going to run in parallel to each other and each is going to be able to write into their portion of the output array and this is something that we couldn’t have done, it wasn’t feasible to put that kind of API into normal JavaScript arrays, we considered it and tried to figure out ways to make it make sense but it really wasn’t feasible, but for things when you are working on matrices of data where you want to work on whole rows of a matrix at a time, this kind of structure makes sense in terms of providing this out pointers that represent a row of the matrix when we do computations on those.

That was a very long answer to the question, the short answer is yes, it doesn’t works on just arrays, there is a family of Data Types, there all are array-like because our interest here is on array-like computations, they are the ones that offer the most obvious low hanging fruit for parallelization, it’s a perfect match and I wouldn’t try, well I've considered trying to go beyond it and talk with my collegues, talking about going beyond that but for now we think array-like data is a great place to focus especially since there is so many potential avenues. Right now we are working on multicore compilation to target multiple cores but also there is obviously the potential for targeting a CPU if you happen to write your code the right way, so there is a lot of interesting places to go with this work in that sense.

   

4. Obviously our audience is now very interested in learning more about it or maybe even using it, so CodeMesh takes place in late 2013, where can people go, or what is the current status?

That is a great question, so the current status is that we have it, it is in Firefox on nightly builds, it’s one of these things our model for development in Firefox, is we try to do our work on trunk, for a long time in Parallel JavaScript wasn’t being done on the trunk it was working on a branch somewhere else, but we put in the effort actually get it into the Firefox Mozilla inbound tree, so that means that the nightly builds that are for beta testing have this capability already available, you can run the programs, the demo programs that are described on our website in the nightly builds. It’s not turned on for the release builds yet basically because we don’t like to add features that have not been standardized unless there is a lot of demand for them, for example if we see something that is already being implemented by browser vendors, well I can really speak for Mozilla, I know exactly what is the policy here. The point is that I don’t think we really like to go out and put something in unless we’ve got a serious reason to do so and this case we would prefer to work through the standards body of ES6, the ECMAScript committee and get everybody else on board because for this technology I don’t think it would be adopted if it were only in Firefox, I don’t see people putting conditional code into their programs saying: “If I happen to be on Firefox let’s use this other crazy system”, it doesn’t make sense to me.

There is some capabilities, certain kinds of features that you do that for but not I don’t think for this, instead our goal is to get it standardized. So there is both information on the ES6, they have specifications, strong specifications for what the API is and that is always undergoing revision, although I’m hoping that we're finally narrowing in on what we are going to actually decide on soon, so if you search for Parallel JavaScript and if you search for RiverTrail, that was the name of the project as it was known at Intel, and that might be an easier thing to google for, RiverTrail, you’ll find both the Strawman specification, that is associated with the ES6 committee and that's on a Wiki somewhere and you’ll also find a GitHub Repository, both of Intel’s prototype of this and also our own and you will also, I believe, find a number of blog posts that my colleague Niko Matsakis has been posting about Parallel JavaScript as we’ve been working on it. So it’s lots of information there especially really good in depth information both about the implementation strategy as it is now and the kinds of issues that arise and also some wonderful thought experiments about where things could go and how we might change things, so those are all sources of information.

   

5. What is your current feeling of getting this adopted or you’ve probably talked to other browser vendors or some of the ES6 people, what is your feeling there?

I think the issue there is that people aren’t convinced, so some people are excited, some people are somewhat interested but I don’t think that I would say that any browser vendor has reacted saying: “Yes, of course this is the right thing, we must adopt it”. No one said that and in fact there is plenty of people in Mozilla who think this may not be the right direction to go. So there is question marks there, so it’s not clear that it’s going to be adopted but the reason I think why it’s unclear if it'll be adopted is because we haven’t done a good enough job of coming up with demos that actually illustrate the important use cases for this.

We have demos that illustrate things like Mandelbrot set rendering that is much faster but the problem with the demo like that is that there's other technologies like WebCL that offer the same benefits but even more effectively, they provide even greater speed ups and they are already pretty widely available and that is not a good way to motivate a technology like this where the whole point of this technology is that it's far more expressive because it actually gives you the ability to read from JavaScript objects that are shared amongst those multiple threads that I mentioned before, you weren't allowed to write to them but we don’t have read guards, you can go ahead and read from them and as long as those read operations don’t have hidden mutations to the state underneath the hood, you can go ahead and read stuff and so that's to me a huge benefit of this technology and we’ve had some discussions with some developers who have recognized that advantage and we just need to do a better job of following through with them and seeing about getting some better demos made. For example we’ve talked to the developer of Adblock Plus who said he was interested in some of the things that we are working on both with Binary Data and also a little bit with Parallel JavaScript, so that is an example of the kind of thing where I think that more involvement with the users would help us as long as it didn’t also distract us too much, I’m always a little nervous about trying to act like I can just go out and talk to everybody when of course that has a cost too.

Werner: So our audience has homework, check out Parallel JavaScript and help out Felix and his team, write cool demos, come on folks and well thank you Felix for telling us about this!

Thank you very much!

General Feedback
Bugs
Advertising
Editorial
InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.
Privacy policy
BT