Bio Mark S. Miller is a research scientist at Google, main designer of the E and Caja secure programming languages, a pioneer of agoric (market-based secure distributed) computing, an architect of the Xanadu hypertext publishing system, and a representative to the ECMAScript committee.
QCon is a conference that is organized by the community, for the community. The result is a high quality conference experience where a tremendous amount of attention and investment has gone into having the best content on the most important topics presented by the leaders in our community. QCon is designed with the technical depth and enterprise focus of interest to technical team leads, architects, and project managers.
The most important thing in ECMAScript 5 is its strict mode. If you put "use strict"; at the top of your program and you are on an ECMAScript 5 browser (there are no non-beta shipping ECMAScript 5 browsers yet, but all browser makers are working on them), then the rest of your script will be interpreted under saner scoping rules; in particular, all scope is static. From every use occurrence of a variable you can tell statically what the corresponding defining occurrence is. I’m going to call those rules "lexical scoping", but I need to qualify that, because when people hear "lexical scoping" coming from other languages, they might think of block-level lexical scoping, which is what you would naturally expect.
In ECMAScript 5 var still hoists to the function level, so you still have function-level lexical scoping, but it is lexical scoping. The other really important change in ECMAScript 5 concerns mutability: in ECMAScript 3, objects were pervasively and unavoidably mutable. Anybody who had an object could replace fields on the object, add new fields to it, or delete fields that were on it, and as a result nothing had integrity. Any two pieces of code that shared objects could arbitrarily violate each other’s assumptions, for example by what’s sometimes called "prototype poisoning": they can just modify Object.prototype and Array.prototype, install their own push method or whatever, and completely disrupt their environment.
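The two scoping points above can be illustrated with a minimal sketch. The function names and the variable `totallyUndeclaredName` are hypothetical, chosen only for this example; the "use strict" directive is placed inside each function so the sketch does not depend on where it is pasted.

```javascript
// Strict mode: every use of a variable must have a defining occurrence.
function assignUndeclared() {
  "use strict";
  try {
    // No declaration of this name exists anywhere, so strict mode
    // refuses to create a global implicitly.
    totallyUndeclaredName = 1;
    return "assigned";
  } catch (e) {
    return e.name; // "ReferenceError"
  }
}

// var is still scoped to the enclosing function, not to the block.
function hoisting() {
  "use strict";
  if (true) {
    var inner = "visible";
  }
  return inner; // the if-block did not create a new scope for var
}
```

Without the directive, the assignment in `assignUndeclared` would silently create a global variable, which is exactly the kind of non-static scoping that strict mode removes.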
These are attributes of every property, so they control for each property on each object.
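As a rough illustration of per-property attributes, the sketch below uses the standard ECMAScript 5 calls `Object.defineProperty`, `Object.getOwnPropertyDescriptor` and `Object.freeze`. The function names and the `point` and `record` objects are hypothetical.

```javascript
// Define one property with explicit attributes and try to overwrite it.
function attributeDemo() {
  "use strict";
  var point = {};
  Object.defineProperty(point, "x", {
    value: 1,
    writable: false,     // assignment is rejected
    enumerable: true,    // still shows up in for-in and Object.keys
    configurable: false  // cannot be deleted or redefined
  });
  var outcome;
  try {
    point.x = 2;
    outcome = "overwritten";
  } catch (e) {
    outcome = e.name; // strict mode reports the failed write as a TypeError
  }
  return {
    value: point.x,
    outcome: outcome,
    descriptor: Object.getOwnPropertyDescriptor(point, "x")
  };
}

// Object.freeze makes every own property non-writable and non-configurable,
// and prevents new properties from being added.
function freezeDemo() {
  "use strict";
  var record = Object.freeze({ greet: function () { return "hi"; } });
  var added;
  try {
    record.extra = 1;
    added = true;
  } catch (e) {
    added = false;
  }
  return { frozen: Object.isFrozen(record), added: added };
}
```

Outside strict mode the same failed writes are silently ignored rather than thrown, so the protection holds either way; strict mode just makes the failure visible.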
I should mention one other thing, going back to strict mode: strict mode also changes some other things that were encapsulation leaks, even aside from the scoping confusions. ECMAScript 5 strict mode functions are truly encapsulated. Put all of these things together and you can create an object as a frozen record whose fields are methods, where the methods encapsulate the state of the object as lexically scoped variables. That is what Douglas Crockford calls "the objects as closure pattern". Lexically encapsulated state is where you hide your instance variables.
Now you have a tamper-proof object of encapsulated methods hiding lexically scoped instance variables. Now you really have a defensible object, and that’s really the basis for why ECMAScript 5 is now trivially securable whereas ECMAScript 3 was a nightmare to secure.
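A minimal sketch of such a defensible object follows. The name `makeCounter` and its methods are hypothetical; the pattern is a frozen record of strict-mode methods that close over a lexically scoped variable.

```javascript
function makeCounter() {
  "use strict";
  // Instance variable: reachable only through the methods below.
  var count = 0;
  // The record is frozen, so clients cannot replace or add methods.
  return Object.freeze({
    increment: function () {
      count += 1;
      return count;
    },
    current: function () {
      return count;
    }
  });
}
```

A client holding the counter can call `increment` and `current`, but it has no way to reach `count` directly and cannot swap in its own `increment` for other clients that share the same counter.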
We are working with a grad student at Stanford named Ankur Taly who is doing a very ambitious escape analysis tool to help us find when references to objects are leaking or escaping from an abstraction in ways that violate the programmer’s expectations. That kind of analysis is now possible because of the regularities of ECMAScript 5.
Firefox 4 looks like it will be the first browser with full support for ECMAScript 5. The Firefox 4 development version (I’m not sure what they call it; if you download it, it’s called "Minefield" - it’s the nightly development version, basically the current development version for Firefox 4) has essentially a feature-complete implementation of ECMAScript 5. There is one little nit that I know of, and there are probably many nits that I haven’t been paying attention to, but I’ve been playing with it and Minefield has been good enough to support my experiments in using ECMAScript 5 for trivial security. Furthermore, Firefox 4 will contain a very important new security feature that’s post ECMAScript 5 and that comes from Tom Van Cutsem and myself. Tom Van Cutsem is a professor at the VUB in Belgium, and he and I worked on a system called Proxies, which he just presented at the most recent Dynamic Languages Symposium.
The Proxies proposal is a new construct. It’s not a construct in the sense of a syntactic construct, it’s a construct in the sense of a service provided by the built-in library, but it’s a fundamental service in the sense that it’s a service that cannot be written in the language. If it’s not primitively provided, there is no way without code transformation for a program to provide the service for itself and that’s this generic interception of a property access for unanticipated property names.
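To make "generic interception of property access for unanticipated property names" concrete, here is a small sketch. It assumes the `Proxy` constructor and `Reflect.get` as later standardized in ECMAScript, which is not necessarily the API of the proposal as it stood at the time of the interview. The helper name `makeLoggingProxy` is hypothetical.

```javascript
// Wrap a target so that every property read is recorded, including reads
// of names that neither the target nor the wrapper anticipated.
function makeLoggingProxy(target, log) {
  return new Proxy(target, {
    get: function (obj, name, receiver) {
      log.push(String(name));                // record the requested name
      return Reflect.get(obj, name, receiver); // then behave like the target
    }
  });
}
```

For example, reading `p.neverDefined` on a proxy `p` still reaches the `get` handler even though no such property exists anywhere, which ordinary getters defined per property name cannot do.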
I don’t know. It’s a fascinating question.
For each of the browser engines I think there is somebody using it on the server side. I should mention here the first complete implementation of ECMAScript 5 outside the browser. It was done independently, it’s called Besen. It was done in Object Pascal and it looks great. It was tremendously early, it was way ahead of everybody else and it just sort of popped up on the list one day with the announcement that "I just quietly did this complete implementation of the ECMAScript 5 spec in Object Pascal and it seems to work".
On browser adoption I was just sort of "Let’s do the roll call", so I’ll just speak for the visible development versions of each of the browsers.
It’s basically in the same situation that V8 is in. It has a feature-complete implementation of all the APIs that come in with ECMAScript 5 and it has not yet implemented strict mode. For all the other browsers - I’m speaking about their development tip because it’s visible; those are open source projects. With IE of course I can only be speaking about the latest things that they’ve made visible. Opera, last I looked, was lagging well behind both on the APIs and on strict mode. As far as I know they have not started on strict mode either. And that’s the roll call of the browsers.
The basic concept is very simple: the goal is to bring the expressive power of object-oriented programming to the expression of security patterns. Security patterns are arrangements of cooperation among mutually suspicious parties; they are normally thought to be hard to express, and they can become much easier to express with the expressive power of object programming. The way we do that is we start by observing that memory safety and encapsulation in object programming are already security primitives in a way. They already provide a form of access control: when an object is created, its creator has the only reference to it. If some other object does not have a reference to that object, it cannot invoke the object. You can only invoke an object you have a reference to, so references provide access.
The encapsulation means that when you have access to the object, you can only invoke the services it was designed to provide, you can’t reach in and steal the things that it’s using to provide those services. So, whatever services it’s using, those might represent more authority and the object in some way attenuates that authority providing less authority to the clients. All these are very nice primitives for building a security system from. Starting from conventional memory safety and encapsulation, there are only two further restrictions you need to impose: one is that an object cannot cause any effects other than by using object references that it’s holding. The other one is that an object is not given any powerful references by default. These restrictions together - memory safety, encapsulation, only references enable effects and no powerful references by default - give you the property that every object is born sandboxed.
Every object is born with no ability to cause effects to the world outside of itself and therefore you can deny authority to an object simply by not providing the object with the references to the other objects that would provide it that authority. These properties are compositional, so that everything that I’ve just stated about restrictions on an object also applies to an arbitrary and dynamic subgraph of objects. Whatever happens within the subgraph of objects, the subgraph cannot cause effects in excess of the references out of the subgraph that are held by that subgraph as a whole. Then, when you extend this across distributed systems, using the cryptographic analogue of memory safety, you have that property between machines as well, so that a machine can now be thought of as hosting a subgraph of the overall distributed object graph.
And even if the machine is corrupt and not following the protocol correctly, not implementing the language correctly, the cryptographic constraints prevent it from claiming to have a reference to any object that has not been given to any of the objects that are on that machine.
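The idea of denying authority by withholding references, and of attenuating authority with a wrapper, can be sketched as follows. All names here (`makeLog`, `makeAppendOnly`, `untrustedPlugin`) are hypothetical. The sketch shows only the reference discipline: plain JavaScript still exposes ambient globals, so this is not a sandbox by itself.

```javascript
// A log object with full authority: it can be appended to and read.
function makeLog() {
  "use strict";
  var entries = [];
  return Object.freeze({
    append: function (line) { entries.push(String(line)); },
    read: function () { return entries.slice(); } // copy, not the array itself
  });
}

// An attenuating wrapper: it holds the full log but passes on only append.
function makeAppendOnly(log) {
  "use strict";
  return Object.freeze({
    append: function (line) { log.append(line); }
  });
}

// Code we do not fully trust receives only the attenuated reference.
function untrustedPlugin(logger) {
  "use strict";
  logger.append("plugin started");
  return typeof logger.read; // the plugin was never given a way to read
}
```

Calling `untrustedPlugin(makeAppendOnly(log))` lets the plugin write to the log without ever holding a reference that can read it back. The same reasoning applies if the plugin is a whole subgraph of objects rather than a single function.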
In anticipation of this, we’ve upgraded the Caja translator. What it used to do is translate from the secure subset that we identified in ECMAScript 3. The identification of that subset had influence on ECMAScript 5, but of course things diverged in the process of standardization. And I can genuinely say that it’s a divergence in a positive direction - the changes they made from those initial ideas were genuinely improvements. What we’ve now done is upgrade the Caja translator so that it accepts ECMAScript 5 as its input language and translates it into ECMAScript 3 that runs on the old browsers. The result is that today, even without ECMAScript 5 on any of the browsers, you can already use Caja to program to the future standard and it will work and be secure back to IE 6 on all the major browsers.
If you do that, then as the browsers start rolling out ECMAScript 5, your same code will now be able to run full speed on those ECMAScript 5 browsers.
This is the first I’ve actually heard about this Apache module from Google.
It depends on what you are using Caja for, but the scenario we built it for is aggregators, where you have a page that’s posting content from multiple different third parties. At the time they get the content from the third party, they want to take that content and make it safe. The traditional term would be "sanitize", but I want to talk about the enhanced meaning for "sanitize" that we’re bringing to the table here. Sanitization as traditionally done removes all scripts, leaving us with dead data, which is a tragic loss, because what we’re trying to do through sanitization is make it safe to handle media, and code is the most general form of media: media that has arbitrarily extensible interactions with the user. We would like our sanitization to get rid of the dangers of unauthorized abilities of the code while leaving all the possibilities for the code to engage in a legitimate interaction with its user as part of the media it represents.
We have this example that we’ve done to show how you can, in a simple way, use the Caja Cajoling service without having to run the translator yourself (our term for translation is "Cajoling"). If you go to http://caja-corkboard.appspot.com/ you will see the Caja Corkboard Demo, which is a blog-like thing. It does not itself run the Caja translator; it’s basically just a very simple blog-like application running as an Appspot application, and when you type in HTML comments to post to the blog, it uses the Cajoling service as its sanitizer. The result of the sanitization is that the posted comments can be active comments that include running programs, but the comments still cannot attack each other and they cannot attack the containing page.
The documentation is on the http://code.google.com/ page for the Google-Caja Project, at http://code.google.com/p/google-caja/ . We have documented on the wiki there how the Corkboard is constructed and how to make use of this Cajoling service in a very lightweight way to get started.
19. What is the performance penalty for the sanitized code? Does it come with a great performance penalty? Because the assumption is that you are going to use that probably in a scenario where you have mashups and performance is crucial when you have different things running on the page. What have you seen from practically using it?
Our previous version of Caja, before we upgraded it to translate from ECMAScript 5, had some quite significant performance penalties. (I don’t have numbers off the top of my head, so I won’t quote them.) In upgrading to ECMAScript 5 we also made some changes that gave it much better performance. With the input being ECMAScript 5 and the target being an ECMAScript 3 browser, our micro benchmarks are showing between full speed (no slowdown) and 4x. There is always one micro benchmark that’s pathological because it hits exactly the case that you bet on; that one pathological micro benchmark shows, I think, a factor of 10 slowdown. We are now in the process of accumulating realistic figures for the macro benchmarks. Any time you are really concerned about application performance, only pay attention to macro benchmark figures.
We’re only now accumulating those figures, so watch the wiki at the http://code.google.com/p/google-caja/ site. We do not yet have realistic numbers that I know I can quote; I expect we will be posting them very soon.
Any time you are retrofitting, by definition of retrofitting, security is coming later. I suppose one exception to that is Java, where security by the object-capability model came later, but Java was built to satisfy a different set of security goals. The actual history of it is more peculiar: before Java was Java, it was Oak. It was being developed for five years by Gosling, and during those five years it had nothing like the security architecture or the security goals, as far as I know, associated with the early Java. The security architecture of the early Java, the Java 1 security architecture, was basically a quick sloppy retrofit in a hurry. I happened to be in a position where I was able to comment on some of those early designs and explain the alternative to them. Primarily because of schedule pressure, of trying to get Java out quickly without reworking the legacy of libraries that had already accumulated and had been written without security in mind, they went with the security manager architecture.
The rationale was that the security manager makes it easier to pull teeth after you ship, as you discover security problems. But you have to understand that the price of pulling teeth without a sound architecture is that by the time you’ve made it safe you might have made it useless. This was in the context of Java applets, and I think the uselessness of Java applets that this architecture ended up with is a large part of the reason why Java applets lost and are essentially gone from the world, why people don’t use Java applets. The Joe-E Project, run by Adrian Mettler with collaboration from David Wagner at Berkeley, is a very clean retrofit onto modern Java of the object-capability architecture that should have been the security architecture of the early Java.
It’s really a thing of beauty! It was really the inspiration for the possibility that you could retrofit security purely with a verification step, without any translation, because Adrian succeeded in doing that for Java with Joe-E. The beauty of the verification-only approach with no translation is that any Java code that passes the Joe-E verifier is a valid Joe-E program, but it’s still the same program; it’s still a valid Java program with the same meaning. The result is that all of your tools that are built for processing Java, your IDEs, your profilers, your debuggers, this entire range of tools is immediately applicable to Joe-E. You can be coding in this secure subset of Java and inherit all of these tools with no extra work. When you do a translation-based approach, like we’re doing to secure ECMAScript 3, now you’ve got the problem of "How do you debug?", because what the programmer wrote was ECMAScript 5, but what the browser is running is the compilation of that into ECMAScript 3.
I should also say there is another very interesting lesson from the Java experience. The claim that the Java language makes for its security architecture, which is valid for that architecture, is "Our security only depends on the bytecode verifier. It does not depend on the compiler." The first two attempts at doing something like Joe-E were built as an enhanced class loader from Chip Morningstar. The idea was to subclass the Java class loader to add an extra verification phase that verified that the bytecode satisfied these additional object-capability constraints and, if so, do a super class load to pass it to the real class loader. The problem that Adrian identified is that there was a large number of what the Java programmer would naturally think of as security checks, as security properties enforced by the language, that it turns out were only enforced by the compiler and not by the bytecode verifier.
Other experiences retrofitting security: for Scheme, there is Jonathan Rees’ PhD thesis (which we like to refer to as Rees’ thesis), called A Security Kernel Based on the Lambda-Calculus. Jonathan Rees is the editor of many of the versions of the Scheme report, one of the main people behind the shepherding of the Scheme standard over the years. His PhD thesis was on designing an object-capability subset of Scheme, and it’s a very peculiar thesis. It’s a wonderfully peculiar thesis because it records the discovery that Scheme was accidentally almost a perfect object-capability language. Whereas most theses will go through the work that the student did and explain "Why I needed to do this to accomplish the goal", "Why I needed to do that to accomplish the goal", in the case of Rees’ thesis, Jonathan Rees went through Scheme and in loving detail explained why he didn’t need to do anything, or why he needed to do almost nothing, just sort of re-explaining Scheme from the security perspective to show why it was already almost a perfect object-capability language.
The lesson from that, as well as from the other examples, though I think it shows most clearly there, is that security, modularity and good software engineering are just very well-aligned. Scheme, at the time that it was accidentally, without anybody knowing it, already an almost perfect object-capability language, was also the language that had everyone’s respect as an almost perfect programming language, if you accepted the premises. If you were doing that kind of language, Scheme was a design peak; it had everyone’s respect. It had very good abstraction mechanisms, and the power of abstraction and modularity was pursued in a very principled fashion. Pursuing the virtues of software engineering in a very principled fashion took them almost to object capabilities without security even being a goal.
We’ve now tamed the Java libraries twice: once in the context of Joe-E, the other in the context of my previous language E, which is itself not a language retrofit, but runs on the JVM and does retrofit the libraries by taming. In both cases what we found was that the better the library was from a conventional object-oriented design perspective, the easier it was to tame.
21. In your previous answer you mentioned modularity in the context of security and that was also a theme in your presentation. You suggested that a natural product of modularity is security. Would you like to elaborate a little bit on that?
It’s more the other way around, but I would rather say they are aligned. If you pursue modularity the right way, it takes you in the direction of a more securable system, but it doesn’t necessarily take you over the threshold by itself. Scheme, for example, with all of its very principled attention to modularity, got to the point where only a small step was needed to make it secure, but modularity by itself did not take it over that small step; you still had to pay attention to security in the end. Going the other way, pursuing security in the right way, in the context of the ECMAScript committee, getting the changes that I needed into ECMAScript 5 to make it securable: every one of the changes that we got into ECMAScript we got in mostly for software engineering reasons.
The basic idea builds on these object protections that we’ve talked about. Doing distributed object computing is an idea that’s been around for a long time. Most previous systems that have tried to do it haven’t done it well. My E language, I believe, has done it well and so does Tom Van Cutsem’s AmbientTalk language. (Tom Van Cutsem is the co-author of the Proxies proposal that I mentioned before.) What you are doing is taking the reference graph among objects and stretching it over the network. You have this distributed graph of objects that is spread out and basically is an overlay network over the graph of the internet.
But now you want to preserve the analogue of memory safety, so we do that in distributed resilient secure ECMAScript by mapping an object reference to an unguessable HTTPS URL, and then a remote message gets mapped into a RESTful GET or POST message. From the object perspective, the RESTful protocol is just the transport that hooks the remote address spaces together into a virtual single address space, if you will, skipping over a few details there.
CORBA is a great example of something done so wrong as to ruin distributed object programming for a generation, because any time the idea came up, all anybody could think about was "Oh, no! We don’t want CORBA again!" It’s really amazing to me that a single very prominent bad system like that can wipe out the memory of many earlier good systems, like the Eden and Emerald systems done at the University of Washington.
From the point of view of the objects, you just have a distributed object system. From the point of view of the protocol, you just have RESTful web services where you have services doing GETs and POSTs to URLs passing payloads and URLs as arguments where the URLs are HTTPS unguessable URLs. So the access to the URL is the demonstration of ability to invoke the object and you have the URLs per object to be invoked.
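The mapping from object references to unguessable URLs can be sketched in-process, without any real networking. The sketch assumes Node.js and its built-in `crypto` module. The names `makeExporter`, `exportObject` and `deliver`, and the example base URL, are hypothetical and do not come from the system described in the interview. A real deployment would put `deliver` behind an HTTPS server that handles GET and POST.

```javascript
var crypto = require("crypto");

function makeExporter(baseUrl) {
  "use strict";
  var table = new Map(); // unguessable token -> exported object
  var prefix = baseUrl + "/";
  return Object.freeze({
    // Give out a reference: a URL whose path is 256 bits of randomness.
    exportObject: function (obj) {
      var token = crypto.randomBytes(32).toString("hex");
      table.set(token, obj);
      return prefix + token;
    },
    // Stand-in for an incoming request: knowing the URL is the only
    // demonstration of the right to invoke the object.
    deliver: function (url, method, args) {
      if (typeof url !== "string" || url.indexOf(prefix) !== 0) {
        throw new Error("unknown reference");
      }
      var target = table.get(url.slice(prefix.length));
      if (target === undefined) {
        throw new Error("unknown reference");
      }
      // Only own methods are invocable, not inherited ones like toString.
      if (!Object.prototype.hasOwnProperty.call(target, method) ||
          typeof target[method] !== "function") {
        throw new Error("unknown method");
      }
      return target[method].apply(target, args);
    }
  });
}
```

A caller who was handed the URL returned by `exportObject` can invoke the object through `deliver`; a caller who tries to guess a URL gets "unknown reference", which is the distributed analogue of not being able to forge a memory-safe reference.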