ECMAScript 5, Caja and Retrofitting Security, with Mark S. Miller

1. Mark, would you like to introduce yourself?

I’m Mark Miller, I’m a research scientist at Google. I helped design the Caja Project, which is our current solution for securing JavaScript. It’s a complex server-side source translator that actually solves the problems of all four elements of web content: it sanitizes HTML, it sanitizes CSS, it translates from the secure subset of JavaScript into the JavaScript that runs on browsers and it wraps the browser DOM API. That’s Caja - it’s short for "Capability JavaScript", it’s also the Spanish word for "box" or "vault" and it’s currently being used at scale by Yahoo. It’s being used in the Yahoo Application Platform, their OpenSocial platform, and it’s also available on Google’s own Orkut and iGoogle and it’s available on Shindig, and obviously there is more yet to come, but that’s what I can mention.

I’m also one of the representatives of Google to the ECMAScript committee, and on the ECMAScript committee I took the insights from the Caja effort and adapted them to help in the design effort that resulted in ECMAScript 5. I’m pleased to say, as I was saying earlier in my talk, that whereas ECMAScript 3, the JavaScript that runs on today’s browsers, was a real nightmare to secure (we succeeded, and we deployed at scale), ECMAScript 5 is one of the easiest languages to secure that has ever been created. I’m currently experimenting with a very small initialization script that on a conformant ECMAScript 5 browser should result in an object-capability secure environment with no runtime overhead after verification - it just imposes a static verification, after that no runtime overhead.

2. You have stated in the past that "JavaScript is one of the leakiest languages ever". Would you like to elaborate on which parts of JavaScript you find more problematic?

The JavaScript till now, speaking prior to ECMAScript 5 where we tried to fix these problems, historically JavaScript was not even statically scoped. It had all sorts of bizarre scope leakages and part of that was flaws in the spec which some browser makers then faithfully implemented. Things that were actually in retrospect bugs in the spec, so they became deployed bugs. Then there were other problems where browser makers would do something other than what was in the spec, violated the spec in various ways. There were cases that were unspecified where browser makers just ended up with wildly different semantics.

So of all the different leaks, the scoping was the worst nightmare in JavaScript. There were places where you’d think that a variable was lexically scoped and it would actually get looked up on Object.prototype instead, and I could go on and on. There is a YouTube video of mine, "Changes to JavaScript, Part 1: ECMAScript 5", where I go through some of these scoping problems.

3. Would you like to give us a short overview on these changes - how they became in ECMAScript 5?

The most important thing in ECMAScript 5 is its strict mode. In ECMAScript 5, if you put "use strict"; at the top of your program, and if you are on an ECMAScript 5 browser (there are no non-beta shipping ECMAScript 5 browsers yet, but all browser makers are working on them), then all the code in the rest of your script will be interpreted by saner scoping rules, in particular "all scope is static". So from every use occurrence of a variable you can tell statically what the corresponding defining occurrence is. Those rules are what I’m going to call "lexical scoping", but now I need to qualify it, because when people coming from other languages hear "lexical scoping", they might think of block-level lexical scoping, which is not what you should expect here.
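As a rough illustration of the "all scope is static" rule, the sketch below shows two things strict mode rejects because they would make variable lookup depend on runtime state. The function names are hypothetical, and the directive is placed at function level here only so the snippet behaves the same wherever it is pasted; it can equally sit at the top of a script.

```javascript
// Assigning to a variable with no defining occurrence: sloppy-mode code
// would silently create a global, strict code throws instead.
function assignUndeclared() {
  "use strict";
  try {
    undeclaredCounter = 1;
    return "assigned";
  } catch (e) {
    return e.name;
  }
}

// `with` makes name lookup depend on the runtime contents of an object,
// so strict code containing it is rejected at parse time.
function parsesInStrictMode(source) {
  try {
    // Build a function whose body starts with the strict directive.
    new Function('"use strict"; ' + source);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(assignUndeclared());
console.log(parsesInStrictMode("with ({}) {}"));
console.log(parsesInStrictMode("var x = 1;"));
```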

In ECMAScript 5 var still hoists to the function level, so you still have function-level lexical scoping, but it is lexical scoping. The other really important change in ECMAScript 5 concerns mutability: in ECMAScript 3 objects were pervasively and unavoidably mutable. Anybody who had an object could go ahead and replace fields on the object, could add new fields to the object, could delete fields that were on the object and as a result nothing had integrity. Any two pieces of code that were sharing objects in common could basically arbitrarily violate each other’s assumptions, for example by what’s sometimes called "prototype poisoning". They can just modify Object.prototype and Array.prototype, install their own push method or whatever and completely disrupt their environment.
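A small sketch of the two points in this answer: function-level hoisting of var, and prototype poisoning of a shared built-in. The "victim" and "attacker" roles and all names are hypothetical; the example restores the built-in at the end so it has no lasting effect.

```javascript
// Function-level scoping: a `var` declared inside a block is hoisted
// to the enclosing function, so it is still in scope after the block.
function hoisted() {
  "use strict";
  if (true) {
    var inner = "declared in a block";
  }
  return inner;
}

// Hypothetical "victim" code that trusts the built-in push method.
function collect(items) {
  var out = [];
  for (var i = 0; i < items.length; i++) {
    out.push(items[i]);
  }
  return out;
}

// Hypothetical "attacker" code sharing the same environment: it installs
// its own push, records every value, then behaves normally.
var originalPush = Array.prototype.push;
var stolen = [];
Array.prototype.push = function () {
  for (var i = 0; i < arguments.length; i++) {
    originalPush.call(stolen, arguments[i]);
  }
  return originalPush.apply(this, arguments);
};

var result = collect(["secret-1", "secret-2"]);

// Restore the built-in so the rest of the program is unaffected.
Array.prototype.push = originalPush;

console.log(hoisted(), result, stolen);
```

The victim still gets the result it expects, which is what makes this kind of interference hard to notice.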

ECMAScript 5 brings in what we call an attribute control API. We call it that because in JavaScript an object has named properties and a property has a set of attributes. The attributes go back to ECMAScript 3 and perhaps are older (I don’t know, my history starts with ECMAScript 3). The attributes are whether the property is writable, whether it’s enumerable and whether it’s configurable. Previously there has been no way for JavaScript code to control those attributes.
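For illustration, here is a minimal sketch of the attribute control API using the standard ECMAScript 5 functions Object.defineProperty and Object.getOwnPropertyDescriptor. The object and property names are made up for the example.

```javascript
var account = {};

// Define a property with all three attributes switched off.
Object.defineProperty(account, "id", {
  value: 42,
  writable: false,     // assignment cannot change the value
  enumerable: false,   // hidden from for-in and Object.keys
  configurable: false  // cannot be deleted or have its attributes changed
});

// Read the attributes back.
var descriptor = Object.getOwnPropertyDescriptor(account, "id");
var visibleKeys = Object.keys(account);

// In strict code a write to a non-writable property throws;
// in non-strict code it is silently ignored.
function tryOverwrite(obj) {
  "use strict";
  try {
    obj.id = 7;
    return "written";
  } catch (e) {
    return e.name;
  }
}

console.log(descriptor, visibleKeys, tryOverwrite(account), account.id);
```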

4. Where are these attributes available?

These are attributes of every property, so they control for each property on each object.

5. Are they available to the programmer?

They are now available to the programmer with ECMAScript 5. With ECMAScript 3 they existed, they were necessary to explain the semantics of DOM objects, but the JavaScript programmer could not create objects that had properties that were not writable, could not create properties that were not enumerable, could not create properties that were not deletable. Now they can and, furthermore, there is a very important primitive called "Object.freeze" which will go through all of the direct properties of the object, all the properties that are on the object itself, and it will make each of them non-writable and non-configurable (non-configurable meaning that you can no longer delete the property and you can no longer change the attributes of the property).

So once all the properties are non-writable and non-configurable and then the object as a whole is made non-extensible (meaning you can’t add properties to it), now for the first time a JavaScript object has the property classically associated with an object, as Alan Kay meant an object to be: the object is in control of what the semantics are. If two other pieces of code share access to the object, how can those two pieces of code interact with each other by virtue of sharing the object? When an object is frozen, the clients of the object can no longer disrupt the object, can no longer do things to the object other than what the author intended. So we refer to this as making the object "tamper-proof".
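A short sketch of what freezing buys you, using the standard Object.freeze, Object.isFrozen and Object.isExtensible. The helper name tamper is hypothetical. Note that freezing is shallow: it covers the object’s own properties, not objects those properties refer to.

```javascript
var point = Object.freeze({ x: 1, y: 2 });

// Try the three kinds of tampering described above. Strict code reports
// each failed attempt as an exception instead of ignoring it silently.
function tamper(obj) {
  "use strict";
  var failures = [];
  try { obj.x = 99; } catch (e) { failures.push("write: " + e.name); }
  try { delete obj.y; } catch (e) { failures.push("delete: " + e.name); }
  try { obj.z = 3; } catch (e) { failures.push("extend: " + e.name); }
  return failures;
}

var failures = tamper(point);
console.log(failures, Object.isFrozen(point), Object.isExtensible(point));
```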

I should mention one other thing, going back to strict mode: strict mode also changes some other things that were encapsulation leaks, even aside from the scoping confusions. ECMAScript 5 strict mode functions are truly encapsulated. So you put all of these things together, and now you can create an object by creating a frozen record whose fields are methods that encapsulate the state of the object. Objects with lexically encapsulated state are what Douglas Crockford calls "the objects as closure pattern". Lexically encapsulated state is where you hide your instance variables.

Now you have a tamper-proof object of encapsulated methods hiding lexically scoped instance variables; now you really have a defensible object and that’s really the basis for why ECMAScript 5 is now trivially securable whereas ECMAScript 3 was a nightmare to secure.
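Putting those pieces together, a defensible object might look like the following sketch: a frozen record of strict-mode methods closing over a lexically scoped instance variable. The makeCounter name and its methods are illustrative, not taken from Caja or the specification.

```javascript
function makeCounter() {
  "use strict";
  // Instance variable: reachable only through the closures below.
  var count = 0;
  // Frozen record of methods: clients cannot replace, add or delete them.
  return Object.freeze({
    increment: function () {
      count += 1;
      return count;
    },
    current: function () {
      return count;
    }
  });
}

var counter = makeCounter();
counter.increment();
counter.increment();
console.log(counter.current(), typeof counter.count, Object.isFrozen(counter));
```

Two pieces of code that share counter can interact only through increment and current, which is the sense in which the object controls its own semantics.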

6. I suppose these changes make also JavaScript easier to debug because you don’t have problems with mixing up the scope and also probably for IDE developers it will be easier to come up with decent IDEs because most of the people I meet actually only use VI to write JavaScript.

One of the things I’m very much looking forward to is the IDEs that take advantage of the sanity of ECMAScript 5. The IDE support that we’ve seen for JavaScript so far has really been miserable, and part of that is "How can you really provide much intelligent support to the programmer when you can’t even statically analyze the scoping relationships in the program?" With ECMAScript 5 strict code you can analyze those scoping relationships accurately. With tamper-proof objects you can statically know much more about what might happen at runtime, and you have a much better basis for creating patterns with known properties that can be the subject of refactorings and of various kinds of static analysis tools.

We are working with a grad student at Stanford named Ankur Taly who is doing a very ambitious escape analysis tool to help us find when references to objects are leaking or escaping from an abstraction in ways that violate the programmer’s expectations. That kind of analysis is now possible because of the regularities of ECMAScript 5.

7. You mentioned browser support earlier. What are the current plans for the browsers? Will ECMAScript 5 make it to Firefox 4?

Firefox 4 looks like it will be the first browser with full support for ECMAScript 5. The Firefox 4 development version (I’m not sure what they call it; if you download it, it’s called "Minefield" - it’s the nightly development version, which is basically the current development version for Firefox 4) has essentially a feature-complete implementation of ECMAScript 5. There is one little nit that I know of, there are probably many nits that I haven’t been paying attention to, but I’ve been playing with it and Minefield has been good enough to support my experiments in using ECMAScript 5 for trivial security. Furthermore, Firefox 4 will contain a very important new security feature that’s post ECMAScript 5 that comes from Tom Van Cutsem and myself. Tom Van Cutsem is a professor at the VUB in Belgium and he and I worked on a system called Proxies, which he just presented at the most recent Dynamic Languages Symposium.

We worked closely with Mozilla as Mozilla proceeded to implement it. They are going to be shipping a full implementation of our Proxies proposal in Firefox 4. Now let me explain what Proxies are about. When doing security patterns, a very important kind of security pattern is an interposition pattern. These are also very useful for debugging purposes; they are useful for many purposes. Object-oriented programmers like to interpose objects between some client and some other object for many purposes, debugging being one of them. The problem with JavaScript historically is that there was no generic way to intercept a message, to intercept a property access. Some particular browsers had some specific non-standard ways; there wasn’t even a de facto standard across the browsers, and those specific ones had some bizarre, not well thought out semantics.

The Proxies proposal is a very principled, general means for creating a proxy object such that any operation you do on the proxy object gets intercepted and is then trapped, as we call it, to a handler object. The handler object is now basically operating at a meta-level and it can treat the operation in a generic fashion, performing the operation on behalf of the proxy as if the proxy were doing it directly as an object. So the big test of this is DOM intermediation. One of the things that was very difficult in the Caja Project is that, besides securing JavaScript itself, you want to be able to have untrusted code on a web page and give it a sub-tree of your DOM tree, but not give it the entire DOM tree. To do that, we want to create a fake DOM tree in front of the sub-tree of the DOM tree, where the fake DOM tree seems like it’s a DOM tree for a whole frame, but all its operations are just mapped onto this sub-tree of the real DOM tree.
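To make the trapping idea concrete, here is a sketch of a generic interposition proxy. It assumes the Proxy constructor and Reflect object as later standardized in ECMAScript 2015, whose API shape may differ from the proposal being discussed at the time of this interview; the interpose function and the log format are hypothetical.

```javascript
// Wrap any object so that property reads and writes are recorded in `log`
// and then carried out on the real target on the proxy's behalf.
function interpose(target, log) {
  return new Proxy(target, {
    get: function (t, name, receiver) {
      log.push("get " + String(name));
      return Reflect.get(t, name, receiver);
    },
    set: function (t, name, value, receiver) {
      log.push("set " + String(name));
      return Reflect.set(t, name, value, receiver);
    }
  });
}

var log = [];
var realAccount = { balance: 10 };
var watched = interpose(realAccount, log);

// One read followed by one write, both intercepted by the handler.
watched.balance = watched.balance + 5;

console.log(log, realAccount.balance);
```

The handler never mentions "balance", which is the generic, meta-level treatment described above: it works for any property name the client happens to use.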

The problem is that the real DOM objects have dynamic sets of properties and they have all sorts of crazy behavior that JavaScript objects used to be unable to emulate. Even in ECMAScript 5 you still can’t do a faithful emulation of all the bizarre behavior of DOM objects. HTML 5 makes this problem worse. ECMAScript 5 was trying to close the gap, to enable more of what DOM objects could express to be expressible in JavaScript. The two groups were loosely enough coordinated that we didn’t really notice until it was too late that while we were closing the gap, they were widening the gap; they were making more use of those special things that DOM objects could do that JavaScript could not emulate and thereby creating a larger surface area that could not be intermediated.

8. Could you give us a practical example about those things that it cannot emulate?

The HTML 5 local storage object is a dynamic set of keys and values where the keys represent basically the things stored in the storage, and the keys are exposed to JavaScript as property names, so you can’t faithfully emulate that without intercepting unanticipated property names. In ECMAScript 3 you could not do that; in ECMAScript 5 you still can’t do that. That’s an example of the widening of the gap: there were fewer such cases of the need to intercept unanticipated property names in the HTML 4 DOM (there are now more). Fortunately, the Proxies system that is going to be shipping in Firefox 4 does allow all of those interceptions. As far as we can tell, it allows all of the interceptions needed in order to faithfully emulate the DOM.
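As a sketch of what intercepting unanticipated property names means, the following emulates only the key-as-property behavior of a storage-like object. It assumes the ECMAScript 2015 Proxy constructor rather than the API available at the time of the interview, makeFakeStorage is a hypothetical name, and it makes no claim to reproduce the full behavior of the real localStorage object.

```javascript
function makeFakeStorage() {
  // Backing store with no prototype, so every key comes from the client.
  var backing = Object.create(null);
  return new Proxy({}, {
    // Any property name works, including ones nobody anticipated.
    get: function (target, name) {
      return backing[name];
    },
    set: function (target, name, value) {
      backing[name] = String(value); // stored values are strings
      return true;
    },
    has: function (target, name) {
      return name in backing;
    },
    deleteProperty: function (target, name) {
      delete backing[name];
      return true;
    }
  });
}

var storage = makeFakeStorage();
storage.someKeyChosenAtRuntime = 123;
console.log(storage.someKeyChosenAtRuntime, "someKeyChosenAtRuntime" in storage);
```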

9. Does it introduce any constructs to the language?

The Proxies proposal is a new construct. It’s not a construct in the sense of a syntactic construct, it’s a construct in the sense of a service provided by the built-in library, but it’s a fundamental service in the sense that it’s a service that cannot be written in the language. If it’s not primitively provided, there is no way without code transformation for a program to provide the service for itself and that’s this generic interception of a property access for unanticipated property names.

10. Are you aware if any of the tools that are currently available for debugging JavaScript have been getting ready for ECMAScript 5? For example Firebug.

I don’t know. It’s a fascinating question.

11. After a point it’s extremely painful to debug JavaScript using anything.

There is lots of work on debugging JavaScript and there is lots of work on implementing ECMAScript 5. What I don’t know is how much of the current debugging work, if any, is specifically targeting ECMAScript 5. I would hope that all of it is, but I don’t know. I have not been following that.

12. Maybe have you heard about what ECMAScript 5 adoption is in other engines, outside of the browser?

Outside the browser (we’ll start with that first), Rhino, which is the JavaScript interpreter that runs on top of the JVM (it can run either interpretively or can compile to bytecode, but I’m just going to call it an interpreter), has a full implementation of the new ECMAScript 5 APIs, so all of this attribute manipulation, including Object.freeze. It has, as far as I know, a full implementation of all that. It’s at least feature complete, in the sense that all of the APIs are there. I don’t know how fully tested it is because the ECMAScript 5 tests themselves are only now accumulating and they are very far from complete at the moment. But Rhino does not yet have ECMAScript 5 strict mode and therefore does not yet have the repair of scoping or the repair of encapsulation. Other than Rhino, as far as I know, all of the other users of JavaScript outside the browser are making use of JavaScript engines that were written for the browser, for example Node.js using V8.

For each of the browser engines I think there is somebody using it on the server side. I should mention here the first complete implementation of ECMAScript 5 outside the browser. It was done independently, it’s called Besen. It was done in Object Pascal and it looks great. It was tremendously early, it was way ahead of everybody else and it just sort of popped up on the list one day with the announcement that "I just quietly did this complete implementation of the ECMAScript 5 spec in Object Pascal and it seems to work". On browser adoption, let’s do the roll call; I’ll just speak for the visible development versions of each of the browsers.

For the tip development version, like I said, Firefox 4 seems to be feature complete, plus Proxies - they are doing very well. JavaScriptCore from WebKit, the engine that runs Safari, seems to have either a feature-complete or an almost feature-complete implementation of strict mode (I think it’s feature-complete), but they are not feature-complete on the APIs, last time I looked. They did not have some of the attribute control APIs; they did not have, if I recall, Object.freeze or Object.preventExtensions, etc., so you can’t make your objects tamper-proof yet; I expect they will, fairly soon. V8 is incomplete in the opposite way - they seem to have a feature-complete implementation of all of the APIs, but they do not yet have strict mode. The IE 9 Preview might have become the IE 9 Beta (I haven’t been following that in detail).

13. I think the last one was Beta 2.

It’s basically in the same situation that V8 is in. It has a feature-complete implementation of all the APIs that come in with ECMAScript 5 and it has not yet implemented strict mode. For all the other browsers - I’m speaking about their development tip because it’s visible; those are open source projects. With IE of course I can only be speaking about the latest things that they’ve made visible. Opera, last I looked, was lagging well behind both on the APIs and on strict mode. As far as I know they have not started on strict mode either. And that’s the roll call of the browsers.

14. Going back to capability based security, would you like to explain to us the basic concepts behind that? Because it’s not a widely known principle.

The basic concept is very simple: the goal is to bring the power of object-oriented programming, the expressiveness of objects, to the expression of security patterns, so that security patterns, arrangements of cooperation among mutually suspicious parties that are normally thought to be hard to express, become much easier to express. The way we do that is we start by observing that memory safety and encapsulation in object programming are already security primitives in a way. They already provide a form of access control: when an object is created, its creator has the only reference to it. If some other object does not have a reference to that object, it cannot invoke the object. You can only invoke an object you have a reference to, so references provide access.

The encapsulation means that when you have access to the object, you can only invoke the services it was designed to provide, you can’t reach in and steal the things that it’s using to provide those services. So, whatever services it’s using, those might represent more authority and the object in some way attenuates that authority providing less authority to the clients. All these are very nice primitives for building a security system from. Starting from conventional memory safety and encapsulation, there are only two further restrictions you need to impose: one is that an object cannot cause any effects other than by using object references that it’s holding. The other one is that an object is not given any powerful references by default. These restrictions together - memory safety, encapsulation, only references enable effects and no powerful references by default - give you the property that every object is born sandboxed.

Every object is born with no ability to cause effects to the world outside of itself and therefore you can deny authority to an object simply by not providing the object with the references to the other objects that would provide it that authority. These properties are compositional so that everything that I’ve just stated about restrictions on an object also applies to an arbitrary and dynamic subgraph of objects. Whatever happens within the subgraph of objects, the subgraph cannot cause effects in excess of the references out of the subgraph that are held by that subgraph as a whole. Then, when you extend this across distributed systems, using the cryptographic analogue of memory safety, you have that property between machines as well, so that a machine now can be thought of as hosting a subgraph of the overall distributed object graph.

And even if the machine is corrupt and not following the protocol correctly, not implementing the language correctly, the cryptographic constraints prevent it from claiming to have a reference to any object whose reference has not been given to any of the objects that are on that machine.
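The ideas in this answer (references as the only source of authority, and an object attenuating the authority it passes on) can be sketched in a few lines. Everything here is hypothetical illustration: makeLog, appendOnly and untrustedPlugin are invented names, and real enforcement also depends on the environment denying powerful references by default, which this snippet does not set up.

```javascript
// A log object with full authority: append and read.
function makeLog() {
  "use strict";
  var entries = [];
  return Object.freeze({
    append: function (line) { entries.push(String(line)); },
    read: function () { return entries.slice(); }
  });
}

// Attenuation: a facet that passes on only the append authority.
function appendOnly(log) {
  "use strict";
  return Object.freeze({ append: log.append });
}

// Code that is handed only the facet can append, but it has no
// reference through which it could read what others have written.
function untrustedPlugin(logger) {
  "use strict";
  logger.append("plugin started");
  return typeof logger.read;
}

var log = makeLog();
log.append("host started");
var readAbility = untrustedPlugin(appendOnly(log));
console.log(readAbility, log.read());
```

Denying the plugin the ability to read required no access check anywhere; it simply was never given a reference that provides reading.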

15. Going a step backwards, you mentioned Caja. Would you like to practically explain to us how JavaScript development is with Caja. For example at the beginning Caja was some sort of translator, it translated JavaScript, but now this has changed.

Caja remains a translator and will remain a translator because Caja has to solve a problem for all of web content in an integrated fashion - HTML, CSS, JavaScript and the DOM API. Eventually, with enough browser support maybe we can move everything that Caja does to the client, but that’s not imminent. Once ECMAScript 5 has rolled out and is available in production browsers, then Caja will do feature testing of the browser to determine "Is this an adequate ECMAScript 5 browser?" in which case we can do something much lighter weight, as we won’t have to translate the JavaScript itself anymore. We can do all the securing of JavaScript with a very lightweight client-side verifier. There are still some purposes for which you might like to translate it on the server, but now it would be an option rather than a demand, and the testing would also allow us to fall back on translation when the target browser is not ECMAScript 5.
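A very rough sketch of what such feature testing could look like follows. This is not Caja’s actual check: the probeES5 function, the list of APIs it inspects and the strategy names are all assumptions for illustration, and a real adequacy test would need to probe for many more conformance problems.

```javascript
function probeES5() {
  // A plain loop is used because ES5 array helpers may be missing
  // on exactly the browsers this probe is meant to detect.
  var apiNames = [
    "freeze", "isFrozen", "preventExtensions", "defineProperty",
    "getOwnPropertyDescriptor", "getOwnPropertyNames", "keys", "create"
  ];
  var hasApis = true;
  for (var i = 0; i < apiNames.length; i++) {
    if (typeof Object[apiNames[i]] !== "function") {
      hasApis = false;
    }
  }
  // In a strict function called without a receiver, `this` is undefined;
  // engines without strict mode supply the global object instead.
  var hasStrictMode = (function () {
    "use strict";
    return this === undefined;
  }());
  return { apis: hasApis, strictMode: hasStrictMode };
}

var support = probeES5();
// Hypothetical strategy names: verify on the client when both parts
// are present, otherwise fall back to server-side translation.
var strategy = (support.apis && support.strictMode)
  ? "verify-on-client"
  : "translate-on-server";
console.log(support, strategy);
```

Reporting the APIs and strict mode separately reflects the situation described in this interview, where some engines had one without the other.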

In anticipation of this, we’ve upgraded the Caja translator. What it used to do is translate from the secure subset that we identified in ECMAScript 3. The identification of that subset had influence on ECMAScript 5, but of course things diverged in the process of standardization. And I can genuinely say that it’s a divergence in a positive direction - the changes they made from those initial ideas were genuinely improvements. What we’ve now done is upgrade the Caja translator so that it accepts ECMAScript 5 as its input language and translates from ECMAScript 5 into ECMAScript 3 that runs on the old browsers. The result is that today, even without ECMAScript 5 on any of the browsers, you can already use Caja to program to the future standards and it will work and be secure back to IE 6 on all the major browsers.

If you do that, then as the browsers start rolling out ECMAScript 5, your same code will now be able to run full speed on those ECMAScript 5 browsers.

16. You mentioned about some of this work being done in the server as an option with mod_pagespeed, the Apache module. It was released by Google a week ago, with basically all the things you need to do on your JavaScript image files for your site to be fast, it does it while serving the content for example it combines JavaScript files, it minifies them without the developer doing anything. Would a mod_caja sound [reasonable to you]? For example I could write my own JavaScript and a module in my server could treat that as it served it [to the client]

This is the first I’ve actually heard about this Apache module from Google.

17. I think it was announced two days ago.

I’m not going to comment on that since I just don’t know anything about it. You have better information on this than I do, but as far as your abstract question is concerned, Caja currently runs as a server-side translator. It also runs as a Cajoling service. Whatever web server you are running, you don’t have to run Caja yourself as part of your web server; you can essentially call out to our Cajoling service, which is a service up on the web: post JavaScript code to us or have us fetch the JavaScript code, have us translate it and then send you back the translated one. There are a lot of different ways to run this.

18. At development time or online?

It depends on what you are using Caja for, but the scenario we built it for is for aggregators, where you have a page that’s hosting content from multiple different third parties. So at the time they get the content from the third party, they want to take that content and make it safe. The traditional term would be "sanitize", but I want to talk about the enhanced meaning for "sanitize" that we’re bringing to the table here. Sanitization as traditionally done removes all scripts, leaving us with dead data, which is a tragic loss because what we’re trying to do through sanitization is make it safe to handle media, and code is the most general form of media: media that has arbitrarily extensible interactions with the user. We would like our sanitization to get rid of the dangers of unauthorized abilities of the code while leaving all the possibilities for the code to engage in a legitimate interaction with its user as part of the media it represents.

We have this example that we’ve done to show how you can in a simple way use the Caja Cajoling service without having to run the translator yourself (our term for translation is "Cajoling"). If you go to http://caja-corkboard.appspot.com/ you will see the Caja Corkboard Demo, which is a blog-like thing. It itself does not run the Caja translator; it’s basically just a very simple blog-like application running as an Appspot application that, when you type in HTML comments to post to the blog, uses the Cajoling service as its sanitizer. And the result of the sanitization is that the posted comments can be active comments that include running programs, but the comments still cannot attack each other and they cannot attack the containing page.

We have documented on the http://code.google.com/ page for the Google-Caja Project. It’s http://code.google.com/p/google-caja/ . We have documented on the wiki there how the Corkboard is constructed and how to make use of this Cajoling service in a very lightweight way to get started.

19. What is the performance penalty for the sanitized code? Does it come with a great performance penalty? Because the assumption is that you are going to use that probably in a scenario where you have mashups and performance is crucial when you have different things running on the page. What have you seen from practically using it?

Our previous version of Caja, before we upgraded it to translate from ECMAScript 5, had some quite significant performance penalties. (I don’t have numbers off the top of my head, so I won’t quote them.) In upgrading to ECMAScript 5 we also made some changes that gave it much better performance. With the input being ECMAScript 5 and the target being an ECMAScript 3 browser, our micro benchmarks are showing between full speed (no slowdown) and 4x. There is always one micro benchmark that’s pathological because it just hits exactly the case that you bet on. The one pathological micro benchmark is, I think, a factor of 10 slowdown. We are now in the process of accumulating realistic figures for the macro benchmarks. Any time you are really concerned about application performance, only pay attention to macro benchmark figures.

We’re only now accumulating those figures, so watch the wiki at the http://code.google.com/p/google-caja/ site. We do not yet have realistic numbers that I know I can quote; I expect we will be posting them very soon.

20. You have talked in the past about the cost and the trade of retrofitting security not only in JavaScript but in other languages like Java, Python, Perl and more. Would you like to talk to us about your experience in this subject? For example JavaScript is a classical example of a language where security came later.

Any time you are retrofitting, by definition of retrofitting, security is coming later. I suppose one exception to that is Java, where security by the object-capability model came later, but Java was built to satisfy a different set of security goals. The actual history of it is more peculiar: before Java was Java, it was Oak, it was being developed for five years by Gosling and during those five years it had nothing like the security architecture or the security goals, as far as I know, associated with the early Java. The security architecture of the early Java, the Java 1 security architecture, was basically a quick sloppy retrofit in a hurry. I happened to be in a position where I was able to comment on some of those early designs and explain to them the alternative, but they were under schedule pressure, trying to get Java out quickly without reworking the legacy libraries that had already accumulated and that were written without security in mind.

They went with the security manager architecture, with the rationale being that the security manager makes it easier to pull teeth after you ship, as you discover security problems. But you have to understand that the price of pulling teeth without a sound architecture is that by the time you’ve made it safe you might have made it useless. This was in the context of Java applets, and I think the uselessness that architecture ended up with is a large part of the reason why Java applets lost and are essentially gone from the world, why people don’t use Java applets. The Joe-E Project run by Adrian Mettler, with collaboration from David Wagner at Berkeley, is a very clean retrofit onto modern Java of the object-capability architecture that should have been the security architecture of the early Java.

It’s really a thing of beauty! It was really the inspiration for the possibility that you could retrofit security purely with a verification step, without any translation, because Adrian succeeded in doing that for Java with Joe-E. The beauty of a verification-only approach with no translation is that any Java code that passes the Joe-E verifier is a valid Joe-E program, but it’s still the same program, it’s still a valid Java program with the same meaning. The result is that all of your tools that are built for processing Java, your IDEs, your profilers, your debuggers, this entire range of tools is all immediately applicable to Joe-E. You can be coding in this secure subset of Java and inherit all of these tools with no extra work. When you do a translation-based approach, like we’re doing to secure ECMAScript 3, now you’ve got the problem of "How do you debug?" because what the programmer wrote was ECMAScript 5, but what the browser is running is the compilation of that to ECMAScript 3.

Google has a project called "The Closure Inspector" which is basically a debugger for JavaScript code that is going through a source transformation. There are two clients of the Closure Inspector: one is the Closure compiler, which is a minifier for the purpose of shrinking JavaScript code, and the other one is Caja. In both cases Closure Inspector is a plug-in to Firebug, which is a debugger plug-in to Firefox, because there is no cross-browser debugging API, there is no standard way to debug cross-browser. To extend an existing debugger to debug across the source translation by basically mapping backward to the original sources is something that you have to do as separate work per browser. The beautiful thing about being able to secure ECMAScript 5 without translation, doing it only by verification, is that, like with Joe-E, the debugger can be directly debugging the source code that the programmer wrote. So that’s the Java experience.

I should say there is also another very interesting lesson from the Java experience. The claim that the Java language makes in its security architecture, which is valid for that security architecture, is "Our security only depends on the bytecode verifier. It does not depend on the compiler." The first two attempts at doing something like Joe-E were as an enhanced class loader from Chip Morningstar. The idea was to subclass the Java class loader to add an extra verification phase to verify that the bytecode satisfied these additional object-capability constraints and if so, then do a super class load to pass it to the real class loader. The problem that Adrian identified is that there was a large number of what the Java programmer would naturally think of as security checks, as security properties that were enforced by the language, that it turns out were only enforced by the compiler and not by the bytecode verifier.

Martin Abadi, a theorist at Microsoft, characterizes these kinds of divergences as failures of full abstraction, which is a large topic in itself. But the lesson was that when you’ve got a compilation step, you should look for those kinds of failures and if you have them, then the thing to do is take a more constraining form of the language, which in the case of Java was the Java compiler, and extend that with the additional checks rather than extend the bytecode verifier with the additional checks. Fortunately we didn’t have that divergence issue with JavaScript. The source language is the delivery language.

Other experiences retrofitting security: Scheme. Jonathan Rees’ PhD thesis (which we like to refer to as Rees’ thesis) is called A Security Kernel Based on the Lambda-Calculus. Jonathan Rees is the editor of many of the versions of the Scheme report, one of the main people behind the shepherding of the Scheme standard over the years. His PhD thesis was about designing an object-capability subset of Scheme and it’s a very peculiar thesis. It’s a wonderfully peculiar thesis because it records the discovery that Scheme was accidentally almost a perfect object-capability language. Whereas most theses will go through the work that the student did and explain "Why I needed to do this to accomplish the goal", "Why I needed to do that to accomplish the goal", in the case of Rees’ thesis, Jonathan Rees went through Scheme and in loving detail explained why he didn’t need to do anything or why he needed to do almost nothing, just sort of re-explaining Scheme from the security perspective to show why it was already almost a perfect object-capability language.

The lesson from that, as well as from the other references, but I think most clearly from that one, is that security, modularity and good software engineering are just very well-aligned. Scheme, at the time that it was accidentally, without anybody knowing it, already an almost perfect object-capability language, was also the language that had everyone’s respect as an almost perfect programming language, if you accepted the premises. If you were doing that kind of language, Scheme was a design peak, it had everyone’s respect. It had very good abstraction mechanisms, and the power of abstraction and modularity was pursued in a very principled fashion. Pursuing the virtues of software engineering in a very principled fashion took them almost to object capabilities without security even being a goal.

Taming the Java libraries: I should go back to explain taming. I’ve already explained in the context of JavaScript that we have to subset a language, but then we have to wrap the DOM libraries. The wrapping of the DOM libraries is an example of a more general issue of taming when you are retrofitting. When we retrofit Java, we subset the language, but then there is this large number of libraries that we’d like to use. Those libraries weren’t written to object-capability principles, so what we need to do is identify the subset of the interface of these libraries that, in retrospect, happens to follow object-capability principles. If that subset is too small, then the taming exercise fails and we have to replace these libraries with new libraries that embody object-capability principles.

We’ve now tamed the Java libraries twice: once in the context of Joe-E, the other in the context of my previous language E, which is itself not a language retrofit, but runs on the JVM and does retrofit the libraries by taming. In both cases what we found was that the better the library was from a conventional object-oriented design perspective, the easier it was to tame.
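To make the idea of taming concrete, here is a minimal sketch in ES5-style JavaScript. It is not how Caja or Joe-E actually tame anything: `legacyFiles`, its methods, and the `tame` helper are all hypothetical names invented for illustration. The sketch only shows the general shape of the technique, which is to expose a whitelisted subset of a library's interface through a frozen facade.

```javascript
'use strict';

// Hypothetical legacy library, not written to object-capability principles:
// it mixes a harmless operation with one that carries broad authority.
var legacyFiles = {
  basename: function (path) {
    return path.split('/').pop();
  },
  deleteEverything: function () {
    return 'deleted';
  }
};

// tame() is an illustrative helper, not a Caja or Joe-E API. It returns a
// frozen facade that exposes only the whitelisted methods of the target.
function tame(target, allowedNames) {
  var facade = Object.create(null); // no inherited properties to worry about
  allowedNames.forEach(function (name) {
    var method = target[name];
    if (typeof method !== 'function') {
      throw new TypeError('not a method: ' + name);
    }
    // Forward the call without ever handing out the target itself.
    facade[name] = Object.freeze(function () {
      return method.apply(target, arguments);
    });
  });
  return Object.freeze(facade);
}

// Only the harmless part of the interface survives taming.
var tamedFiles = tame(legacyFiles, ['basename']);
```

Code that receives `tamedFiles` can call `tamedFiles.basename(...)` but has no path to `deleteEverything`. A real taming layer would also have to consider what the whitelisted methods return, since a return value can itself leak authority; this sketch ignores that.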

21. In your previous answer you mentioned modularity in the context of security and that was also a theme in your presentation. You suggested that a natural product of modularity is security. Would you like to elaborate a little bit on that?

It’s more the other way around, but I would rather say they are aligned. If you pursue modularity the right way, it takes you in the direction of a more securable system, but it doesn’t necessarily take you over the threshold by itself. Scheme, for example, with all of its very principled attention to modularity, made it so that only a small step was needed to make it secure, but modularity by itself did not take it over that small step; you still had to pay attention to security in the end. Going the other way, pursuing security in the right way, in the context of the ECMAScript committee, getting the changes that I needed into ECMAScript 5 to make it securable, every one of the changes that we got into ECMAScript 5 we got in mostly for software engineering reasons.

The primary argument for every single change that I’ve explained, and every single change that we got in there for security, was always software engineering, because JavaScript is used for so many different things that if a change doesn’t serve many purposes well and doesn’t serve software engineering well, then it becomes a much harder argument to get the change into the language. The success of those arguments I also consider to be a good corroboration of this alignment between security and modularity.
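One way to picture this alignment, as a hedged illustration rather than anything taken from the interview: in the sketch below, lexical scoping (a modularity feature) already hides the counter's state, and the ES5 function `Object.freeze` supplies the small extra step that keeps a client from replacing the methods. The `makeCounter` example and its names are assumptions made for illustration.

```javascript
'use strict';

// makeCounter is a hypothetical example, not code from the interview.
function makeCounter() {
  var count = 0; // reachable only through the two closures below
  return Object.freeze({
    increment: Object.freeze(function () {
      count += 1;
      return count;
    }),
    read: Object.freeze(function () {
      return count;
    })
  });
}

var counter = makeCounter();

// Handing out only `read` grants the ability to observe, not to change.
var readOnly = counter.read;

counter.increment();
counter.increment();
```

Without the calls to `Object.freeze`, any code holding `counter` could overwrite `counter.read` and mislead every other holder, even though `count` itself stays encapsulated. That gap is the kind of small step that modularity alone does not close.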

22. Would you like to tell us a little bit about distributed resilient secure ECMAScript?

The basic idea builds on the object protections that we’ve talked about. The idea of doing distributed object computing has been around for a long time. Most previous systems that have tried to do it haven’t done it well. My E language, I believe, has done it well and so does Tom van Cutsem’s AmbientTalk language. (Tom van Cutsem was the author of the Proxies proposal that I mentioned before.) What you are doing is you are taking the reference graph among objects and stretching it over the network. You have this distributed graph of objects that is spread out and basically is an overlay network over the graph of the internet.

But now you want to preserve the analogue of memory safety, so we do that in distributed resilient secure ECMAScript by mapping an object reference to an unguessable HTTPS URL, and then a remote message gets mapped into a RESTful GET or POST message. From the object perspective, the RESTful protocol is just the transport to hook the remote address spaces together into a virtual single address space, if you will, skipping over a few details there.

23. Until now it sounds terribly similar to CORBA.

CORBA is a great example of someone that did it so wrong as to ruin distributed object programming for a generation, because anytime the idea came up, all anybody could think about was "Oh, no! We don’t want CORBA again!" It’s really amazing to me that a single very prominent bad system like that can wipe out the memory of many earlier good systems like the Eden and Emerald systems done at the University of Washington.

From the point of view of the objects, you just have a distributed object system. From the point of view of the protocol, you just have RESTful web services where you have services doing GETs and POSTs to URLs, passing payloads and URLs as arguments, where the URLs are unguessable HTTPS URLs. So access to the URL is the demonstration of the ability to invoke the object, and you have one URL per object to be invoked.
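The mapping from object references to unguessable URLs can be sketched roughly as follows. This is not the actual protocol of distributed resilient secure ECMAScript: the origin `https://example.com/obj/`, the `{ method, args }` payload shape, and the helper names `exportObject` and `handlePost` are all assumptions made for illustration, and the POST is simulated in-process instead of going over the network. The sketch assumes Node.js for its `crypto` module.

```javascript
'use strict';
var crypto = require('crypto');

// Hypothetical origin; any HTTPS host would do.
var ORIGIN = 'https://example.com/obj/';

// Secret token -> local object. Null prototype, so no inherited keys.
var exported = Object.create(null);

// Mint an unguessable URL that stands for a reference to obj.
function exportObject(obj) {
  var token = crypto.randomBytes(16).toString('hex'); // 128 random bits
  exported[token] = obj;
  return ORIGIN + token;
}

// Simulates the receiving side of a POST. Knowing the URL is the only
// thing that authorizes the call; an unknown URL gets a 404.
function handlePost(url, body) {
  if (url.indexOf(ORIGIN) !== 0) {
    return { status: 404 };
  }
  var target = exported[url.slice(ORIGIN.length)];
  if (target === undefined ||
      !Object.prototype.hasOwnProperty.call(target, body.method) ||
      typeof target[body.method] !== 'function') {
    return { status: 404 };
  }
  return {
    status: 200,
    result: target[body.method].apply(target, body.args)
  };
}

var greeter = {
  greet: function (name) {
    return 'Hello, ' + name;
  }
};
var greeterUrl = exportObject(greeter);
```

A caller that was given `greeterUrl` can invoke `handlePost(greeterUrl, { method: 'greet', args: ['Ada'] })`, while a caller that has to guess the token has a negligible chance of reaching the object. Passing a URL as an argument in a payload would correspond to passing an object reference in a message.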
