No Callbacks Required: StratifiedJS Returns Sequential Programming to Javascript

Posted by Werner Schuster on Dec 19, 2010

Javascript is fundamentally single-threaded, with no parallel threads in sight. Any task that might block therefore has to be handled asynchronously, so Javascript programming usually involves creating and passing around a lot of callbacks, essentially forcing the developer to manually translate sequential code into continuation-passing style.

One solution was introduced at the Emerging Languages Camp at OSCON 2010: StratifiedJS. The language is essentially Javascript with a few more keywords and concurrency constructs that let the developer write sequential code. Yet it all still runs on the normal Javascript engines that ship in today's browsers.

How's that possible? InfoQ talked to Alexander Fritze, CTO of Onilabs, to find out. Onilabs is behind Apollo, a free, MIT-licensed, browser-based implementation of StratifiedJS.

InfoQ: What is StratifiedJS?

SJS is an enhancement of the JS language for structured concurrent programming. It adds a small number of new constructs (waitfor/and, waitfor/or, waitfor()/resume, hold, spawn, using), and it extends the semantics of the constructs of the base JS language (i.e. sequencing, looping, conditionals, exceptions, etc.) to work naturally with concurrent code paths.

SJS doesn't modify the semantics of the base language, i.e. a JS program compiled with SJS will execute just as expected (with the obvious caveat that the JS program must not use any of the SJS keywords as variable names).

For JS programmers the value that SJS brings is that it allows you to write asynchronous code in a sequential-like style, where the control flow is laid bare in your source code, rather than being obscured by a maze of callbacks.

We like to think of SJS as having the same relation to asynchronous programming, as structured programming (in the sense of http://en.wikipedia.org/wiki/Structured_programming) has to GOTO-style programming.

To illustrate the difference between JS and SJS: in normal JS, asynchronous tasks, like an XMLHttpRequest waiting for a response from a server or a call to setTimeout(), require the caller to pass in a callback function which will be called when the action is complete. There is no way to synchronously suspend the current callstack until the task is complete (as this would block the whole browser UI).

In SJS, any asynchronous task can be converted into a synchronous, blocking one (with SJS's waitfor()/resume construct). You can then just take this synchronous task and use it with the normal JS constructs, e.g.:
if (detectLanguage(document) != "en")
  document = translateDocument(document);
where both detectLanguage() and translateDocument() are functions that perform asynchronous queries to some servers (like Google's translate service) under the hood.
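
For contrast, the following is a minimal sketch of how the same conditional might look in plain JS, assuming hypothetical callback-based versions of detectLanguage() and translateDocument(). The callback signatures, the `doc` object shape and the ensureEnglish() wrapper are invented for illustration; the stand-ins answer via setTimeout instead of contacting a server.

```javascript
// Hypothetical callback-based stand-ins for the two services.
// Real versions would query a server; these answer via setTimeout.
function detectLanguage(doc, callback) {
  setTimeout(function () { callback(doc.lang); }, 0);
}

function translateDocument(doc, callback) {
  setTimeout(function () {
    callback({ lang: "en", text: "[translated] " + doc.text });
  }, 0);
}

// The two-line SJS conditional, hand-translated into callbacks.
// Both branches must end by calling `done`, since nothing can be returned.
function ensureEnglish(doc, done) {
  detectLanguage(doc, function (lang) {
    if (lang != "en") {
      translateDocument(doc, done);
    } else {
      done(doc);
    }
  });
}

ensureEnglish({ lang: "de", text: "Hallo" }, function (doc) {
  console.log(doc.lang, doc.text);
});
```

Even in this small case the result can no longer be assigned to a variable; every caller of ensureEnglish() has to supply its own callback in turn.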

On top of this, SJS adds keywords to coordinate several concurrent strands of computation (we call them 'strata', hence 'stratified' Javascript) in a modular, composable manner. E.g. performing a query simultaneously on Google and Yahoo, and returning the first one that comes in, would look something like this:
var result;
waitfor {
  result = performGoogleQuery(query);
}
or {
  result = performYahooQuery(query);
}
Not only does this return as soon as the first result comes in, but it also automatically cleans up the request that is still pending (by virtue of SJS's try/retract construct). I.e. if Yahoo is still transmitting megabytes of data while Google is finished, the connection to Yahoo will be closed as soon as it is known that the result is not needed anymore.
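
To give a feel for the bookkeeping that waitfor/or takes care of, here is a rough plain-JS sketch of "first result wins, abort the rest". It is not how Apollo implements the construct; firstOf() and fakeQuery() are hypothetical helpers, and the convention that starting a task returns an abort function is an assumption made for this example.

```javascript
// Hypothetical cancellable query: calls back after `delayMs`
// and returns a function that aborts the pending request.
function fakeQuery(name, delayMs, aborted) {
  return function start(callback) {
    var timer = setTimeout(function () {
      callback(name + " result");
    }, delayMs);
    return function abort() {
      clearTimeout(timer);
      aborted.push(name);
    };
  };
}

// Start all tasks, report the first result, abort everything still pending.
function firstOf(tasks, done) {
  var finished = false;
  var aborts = [];
  tasks.forEach(function (start, i) {
    if (finished) return; // an earlier task already completed synchronously
    aborts[i] = start(function (value) {
      if (finished) return; // ignore late results
      finished = true;
      aborts.forEach(function (abort, j) {
        if (j !== i) abort();
      });
      done(value);
    });
  });
}

var aborted = [];
firstOf(
  [fakeQuery("google", 10, aborted), fakeQuery("yahoo", 50, aborted)],
  function (result) {
    console.log(result, "| aborted:", aborted.join(", "));
  }
);
```

The `finished` flag and the list of abort functions are exactly the kind of state that has to be threaded through by hand here, and that the SJS version leaves to the language.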


InfoQ: Apollo is one runtime that can run StratifiedJS code by translating it to Javascript; this works on pretty much all browser JS engines. Are there other runtimes for StratifiedJS?

Yes, Apollo is a browser-based runtime for SJS. It's a small JS file that integrates 3 pieces: a code translator that can translate SJS to JS on the fly, a runtime for the generated code, and a module system that allows you to synchronously load SJS code modules across the net.

As for other runtimes, we're in the process of porting Apollo to node.js - we'll have news about this soon on our blog (http://onilabs.com/blog).

We also have tentative plans for stratifying other languages; we've at least started thinking about what stratified versions of C++ or Go would look like.

InfoQ: How does Apollo implement blocking calls in StratifiedJS, e.g. hold, sleep, etc.?

Apollo basically takes over the management of the callstack. It rewrites the code into a form where it can 'remember' where to pick up execution again when an asynchronous task completes. On a very basic level, Apollo performs a continuation-passing transformation on the code (see e.g. http://en.wikipedia.org/wiki/Continuation-passing_style). The actual implementation is somewhat more complicated because SJS also needs to take care of things like exceptions and 'cancellation reverse-control flow' (as in the Yahoo/Google example above, where Google returning earlier cancels the pending Yahoo request).
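
The basic idea of a continuation-passing transformation can be sketched by hand in plain JS. This is only an illustration of the technique, not Apollo's generated code; fetchValue(), sum(), deliver() and the `pending` queue are hypothetical names, and a small queue stands in for the browser's event loop so that the order of events is explicit.

```javascript
// Continuations that are waiting for an asynchronous result.
var pending = [];

// Hypothetical asynchronous primitive in continuation-passing style:
// instead of returning a value, it records the continuation `k`
// so that execution can be picked up again later.
function fetchValue(key, k) {
  pending.push({ key: key, resume: k });
}

// Direct style (what one would like to write):
//   var a = fetchValue("a");
//   var b = fetchValue("b");
//   return a + b;
//
// The same logic after a manual CPS transformation:
function sum(k) {
  fetchValue("a", function (a) {
    fetchValue("b", function (b) {
      k(a + b);
    });
  });
}

// Stand-in for the event loop delivering one result at a time.
function deliver(value) {
  var waiting = pending.shift();
  waiting.resume(value);
}

var result;
sum(function (total) { result = total; });
deliver(1); // completes fetchValue("a"), which then requests "b"
deliver(2); // completes fetchValue("b"), which calls the final continuation
console.log(result);
```

Each stored `resume` function is the "rest of the computation" from the point where the code had to wait, which is what lets execution continue there once the result arrives.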

InfoQ: Does Apollo have any specific requirements of a JS engine, e.g. WebWorkers? Do you use setTimeout or setInterval - and if yes: how?

Apollo runs on any vanilla JS engine out there (maybe minus some bugs on browsers that we haven't tested). We use setTimeout, but only for hold(). Apart from that, SJS does not require any concurrency mechanism (see also my elaboration regarding WebWorkers etc. below).

InfoQ: Is it necessary to create special stratified versions of APIs in order to avoid blocking? Are there things that can't be stratified, i.e. calls that always block?

Any JS API can be used in SJS without modification. But if the API is asynchronous (i.e. takes a callback function), then it helps to put a small veneer on top to gain the full advantage of SJS. The idea is to take something that's nonblocking and make it block (yes, we explicitly want to make every asynchronous thing block in SJS). E.g. consider setTimeout(). We can make a blocking pause() function out of it like this:
function pause(t) {
  waitfor () { setTimeout(resume, t); }
}
We can now use this function in SJS like this:
console.log('foo');
pause(1000);
console.log('bar');
Here, first 'foo' will be written, followed after 1 second by 'bar'.
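
As a loose analogy only, the same wrapper can be written with Promises and async/await, features that were added to JavaScript later and are not part of SJS. In this sketch `resolve` plays the role that `resume` plays above; demo() and its parameters are hypothetical names introduced for the example.

```javascript
// Promise-based counterpart of the SJS pause():
// `resolve` takes the place of `resume`.
function pause(t) {
  return new Promise(function (resolve) {
    setTimeout(resolve, t);
  });
}

// `await` suspends only this function; the event loop keeps running.
async function demo(log, ms) {
  log('foo');
  await pause(ms);
  log('bar');
}

demo(console.log, 1000);
```

Unlike in SJS, the suspension is visible in the function's signature: demo() has to be declared async, and its callers receive a Promise rather than a plain return value.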

InfoQ: What concepts does StratifiedJS use or implement?

SJS started as a little experiment of bringing some of the concepts of the academic area of orchestration languages into JS. In particular it was heavily inspired by the 'Orc calculus' of the University of Texas, Austin (see http://orc.csres.utexas.edu/research.shtml).

SJS might have turned into something that doesn't really resemble Orc (and I think in terms of expressiveness SJS has a few features that are very difficult to express in Orc, like try/retract), but what the two share is the idea of implementing a structured way of composing concurrent programs by building on a set of concurrency combinators.

InfoQ: What are the limitations in Apollo?

There are 3 main concerns that are often raised.

Speed: We often get an initial reaction where people think that compiling on the fly must be very heavy, but it is actually very fast, to the degree that even on mobile browsers the compilation times are pretty negligible for typical SJS scripts. E.g. on my old 2.4GHz Core 2 Duo MacBook, Chrome takes about 3ms to parse the SJS code in http://code.onilabs.com/0.9.1/demo/flickrcities.html (~100 lines); on my Android phone it takes around 30ms.

Interoperability: The concern is that people already have a bunch of JS code, e.g. many people use libraries like jQuery or Prototype - what happens if you add SJS into the mix?
This is really not an issue, since, firstly, most JS code is also valid SJS code, and SJS doesn't alter its semantics (the situation is a bit like C and C++: you can take most C programs and just compile them with a C++ compiler and get the intended result). Secondly, the code generated by Apollo is fully interoperable with normal JS: you can just mix and match. SJS can call any JS function and vice versa (with the caveat that if you call a blocking SJS function from JS, the return value that you'll get is a continuation object; JS can't block - that's of course the reason for having SJS in the first place).

Debugging: A step-by-step debugger written for 'normal' JS is not particularly useful with SJS. In my opinion this is not as bad as it sounds, since normal debuggers are not very useful with asynchronous code anyway (because the stacktrace you get does not reflect the logical sequence of events). Having said that, we are working on a special SJS debugger, which will give you stacktraces that actually make sense and show you the full pending asynchronous dependency tree at every step.

InfoQ: What addition(s) to JS would help with StratifiedJS? Iterators, Coroutines, WebWorkers, Continuations, others?

Iterators, coroutines, continuations and other cooperative multitasking constructs *might* help, although we still haven't figured out how to map the full SJS semantics into them.

As for the 'true' concurrency vehicles (WebWorkers, threads, etc.), it is actually a very common misconception that these could help in the implementation of something like SJS. Let me try to clear this up (apologies, this goes off on a bit of a tangent):

There are two quite different facets of concurrency that software developers are commonly exposed to, and the two are often mixed up.

Firstly, sometimes we want to deliberately *introduce* concurrency into a program. E.g. when we have an algorithm that executes too slowly on one CPU core, we might want to spread the load across several cores. Or we might even want to spread the load across several computers (in e.g. a map-reduce architecture). Threads, WebWorkers, XMLHttpRequests are all examples of instruments for explicitly introducing concurrency.

Secondly, once we have introduced concurrency into our program, we need to coordinate or orchestrate it in some way. I.e. we need to *reduce* the concurrency down to a single consistent narrative. The instruments that introduce concurrency are not much help here.

SJS is all about making this distinction very clear. None of the operators of SJS (with the exception of hold()) introduce any concurrency into the system. They form a completely deterministic algebra concerned with orchestrating the existing concurrency in a program.

Unfortunately, historically these two concerns have not been clearly separated, and instruments for coordinating concurrency (locks, condition variables, monitors, etc.) have been built directly onto the concurrency vehicles (threads in this case) themselves. I think it has become a consensus that this is not a good idea.
E.g. Simon Peyton Jones says on the matter ("Beautiful Concurrency", in Beautiful Code, edited by A. Oram, G. Wilson, O'Reilly, 2007. ISBN 0-596-51004-7):
"To make a long story short, today's dominant technology for concurrent programming - locks and condition variables - is fundamentally flawed. [...] [T]he fundamental shortcoming of lock-based programming is that locks and condition variables do not support modular programming. By "modular programming" I mean the process of building large programs by gluing together smaller programs. Locks make this impossible."

And Edward A. Lee sums up the problem with threads ("The Problem with Threads", IEEE Computer, vol. 39, no. 5, pp. 33-42, May 2006) like this: "They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. Threads, as a model of computation, are wildly nondeterministic, and the job of the programmer becomes one of pruning that nondeterminism. [...] Rather than pruning nondeterminism, we should build from essentially deterministic, composable components. Nondeterminism should be explicitly and judiciously introduced where needed, rather than removed when not needed."

We view the operators of SJS as an example of the "deterministic, composable components" that Lee is referring to.

InfoQ: Apollo comes with support for CommonJS - what does that mean for the developer?

Apollo comes with support for the CommonJS module system (http://www.commonjs.org/specs/modules/1.0/), which has emerged as the de-facto standard for modules on server-side JS (e.g. in node.js). The problem on the client-side is that 'normal' JS can't really load modules synchronously (as mandated by CommonJS) - you always have to pass in a callback. To my knowledge, Apollo is thus the only system that implements a CommonJS-style module system on the client side.

Apollo is MIT licensed and is available at the onilabs website.

Jscex is another one by Zhao Jeffrey

Similar goal with different approach.

github.com/JeffreyZhao/jscex

CommonJS in the browser is possible by Kevin Dangoor

FWIW, I've implemented CommonJS loading in the browser myself. The CommonJS module spec says that what's in your require() calls needs to be a constant string. Given that, it's easy to load a module via XHR, pull out the dependencies and ensure that they're loaded.

Stratified JS is a neat thing, to be sure. I'm just clarifying about the CommonJS part at the end.

By the way, JavaScript 1.8 makes it possible to use generators (a la Python) to make async code look more synchronous. While it's too early to tell for sure, there's the possibility that this feature could get into ECMAScript Harmony.

Kevin

Many similar efforts by Dio Synodinos

This is a very good article on a very interesting subject.

Just wanted to add the following 2 points:

  • The majority of things you want to do in the browser (or even in Node.js) are inherently asynchronous and that's the reason for the callback voodoo. If you try to make asynchronous stuff, run in a sequential manner, they end up taking more time (bad performance). This asynchronous environment is completely different from what people learn in Universities, but in order to be effective you must embrace it!

  • On the other hand, there are many libraries out there to help JavaScript developers implement specific kinds of control flow, with simple DSL-like syntax. There must be more than 10 on GitHub, of which Step and flow-js are the most popular ones.


BTW, the spaghetti icon is unfair :p

Re: Many similar efforts by Werner Schuster

Dio,

Quote: "If you try to make asynchronous stuff, run in a sequential manner, they end up taking more time (bad performance). "

First off - I'd like to see some data backing that up (and no pointing to the overhead of threads, there are perfectly fine solutions in C# 5.0, Ruby, Python, etc all using Coroutines & friends to do their work).

See... every time you issue an async call with a callback in Javascript you _are_ writing a sequential flow of logic, except you're doing it in a verbose way by stringing together lambdas, extracting state to objects/statemachines, etc.
Why? Because most (all?) of our algorithms are sequential, i.e. we do something like

x = getA()
y = getB()
z = intermingleInInterestingWays(x,y)
z.sendTo(Foo);

That's how we think about these algorithms and that's the most concise way to write them down. A language or library that permits me to do that allows me to write better code. Sequential code can be described with a simple flowchart; manually written async code needs a stack of UML diagrams.


Another example: Every time you write something along the lines of

var promise = posix.unlink("/tmp/hello");
promise.addCallback(function () {
  sys.puts("successfully deleted /tmp/hello");
});

you actually want to write:

posix.unlink("/tmp/hello");
sys.puts("successfully deleted /tmp/hello");


See - sequential code where the individual steps are strung together by a semicolon, not a collection of callbacks, promises, etc. What's more: the Javascript code that has to emulate sequential execution like that has a cost:

  • at execution time, for the creation of closures and other overhead;

  • for the developer, who has to take a perfectly fine sequential algorithm and transform it into this, thus making the code harder to read (more code is harder to read) and harder to maintain. Not to mention that the explicitly async code has the execution mechanism baked in; compare that to languages like Erlang or Go, which allow you to write sequential code in a straightforward way but choose the best way to execute the logic behind the scenes.



It's good to hear that there are libraries out there that help to get us back to the simpler versions of writing code. The library that Jeffrey Zhao mentions above is a neat version of F#'s workflows. As a matter of fact, I'd really suggest looking at what C# 5.0 is doing:
blogs.msdn.com/b/ericlippert/archive/2010/10/29...
It gives you a very interesting concise way of writing the sequential logic while allowing the programmer to decide about the actual execution method later on; it can be async code but it's also possible to decide to run it across different threads, one thread, etc.
Underneath it's quite close to what an earlier C# version added with their Iterator support.

Re: Many similar efforts by Dio Synodinos

Hi Werner,

It has been a while since I've studied queueing theory, so I can't give you a mathematical proof, but your 2nd and 3rd code snippets make it easy to explain it in plain English. Actually these two pieces of code have similar but not identical semantics:

2nd=> Try to unlink and while you are waiting for the unlink, you can do other stuff. This is the way event loops achieve their good performance. This is also the reason why rich web apps don't freeze every few seconds. E.g. you do not want to block while you're getting data out of SQLite in your Safari browser!

3rd=> Try to unlink and until you have unlinked, don't do anything else (block).

It's pretty obvious to me that time-sharing for asynchronous (non blocking) is likely to be better (ignoring threads). Isn't it?

On the other hand, there are cases where you definitely WANT to block, even in a highly async environment. E.g. you want to call several async functions in a specific order (DB READ/WRITEs) and you want to avoid the default way that produces long nesting.
