WebAssembly. Neither Web Nor Assembly, All Revolutionary

Summary

Jay Phelps talks about WebAssembly, a bytecode designed and maintained by some of the major players in tech: Google, Microsoft, Apple, Mozilla, Intel, LG, and many others. He talks about what WebAssembly is and what it isn’t.

Bio

Jay Phelps is the Chief Software Architect and Co-founder at This Dot, where they provide support, training, mentoring, and software design. Previously, he worked as a Senior Software Engineer at Netflix.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

WebAssembly: neither web nor assembly, but still revolutionary. I'm going to talk about that today and what I mean by that. I'm going to introduce WebAssembly to you if you're not familiar with it, and explain the different aspects of it, the pros and cons, give you a little bit of history, talk about use cases, things like that. But then also look at the future somewhat, because that’s, to me, the most exciting part; that this is actually a revolution and it's very early in this revolution, right? You've probably heard some buzz, but maybe you haven't been using this in your work, maybe you don't even know anyone who's actually using it yet, but I'm just so, so excited to talk about this. And QCon is such a diverse crowd of different types of people. So to get a better understanding, kind of choose your own adventure, and tailor the talk to different people, I want to get a feel for my audience today.

If you could just raise your hand if this applies to you. Do you write Java full time? Okay, do you write C++ or C full time? Okay, great. Do you write JavaScript full time, I guess would be another question? Okay, good. A lot of people over there. Do you write all of those part time, somewhere in the middle, like you're just all over the place? Lots of different languages, great. Okay, yeah. A lot of people are like that. And I'm that way, too. So I love languages. I love learning new programming languages, which is funny, because I'm really terrible at speaking languages. But I love programming languages.

Who am I? I am Jay Phelps, and I used to work at Netflix, but back in March, I started a company with my friend Tracy called This Dot. And we do a whole bunch of different things, which we'll talk about here in a second. If you're interested in any of this stuff and you want to talk to me or follow me or anything like that, I'm on Twitter @_jayphelps. A little bit about me, I'm a Google Developer expert. I obviously used to work at Netflix. I’m also on the WebAssembly community group, which is the group that does the specification process, and new proposals, and stuff like that.

So as far as This Dot, the company I founded, we do a whole lot of things. We're still trying to figure out exactly what we want to do. But right now, we're doing a lot of support contracts, developer relations, staff augmentation, mentorship, all sorts of things across the gamut from Java, JavaScript, WebAssembly, React, Angular, all sorts of different stuff. If any of that's interesting to you, you're welcome to reach out to me.

What is WebAssembly?

Let's just dive right in. The question, the ultimate question: what is WebAssembly? Well, it's also known as Wasm, and how you pronounce that depends on where you live. And the quick spiel is that it's an efficient and safe low-level byte code for the web. And that sounds great, but we kind of need to unpack that. Really, what do I mean by efficient? Well, I mean the goal of it is to be fast to load and fast to execute. So fast to load meaning fast to send over the internet, because the primary purpose of WebAssembly is for the browser, but we'll talk about that a little bit later. So fast to load over the internet, small binary sizes are critical, compact format is critical, fast to load meaning fast to parse. And then fast to execute, meaning fast to actually run once it's compiled. So it's going to be just-in-time compiled.

A critical component of WebAssembly is something called streaming compilation. And this is pretty novel for a lot of people. This is not something a lot of technologies can do. And the idea behind it is that as the bytes are coming over the wire into the browser, the browser can actually compile those bytes to native machine code as they're being downloaded. It does not have to wait for the entire file to finish downloading; it can parse and compile that WebAssembly to machine code as it arrives, and that's critical because that means in a lot of cases, your internet connection is actually the bottleneck now. So if you're on a mobile device, your startup time is going to be directly correlated to how fast your internet connection is, rather than just your CPU.

I made a little simple visualization to demonstrate what I mean by that. So I've got a mobile device here in the middle, and we've got a WebAssembly file on the left. Let's imagine that's on a server somewhere and we're going to download that file into the mobile browser. As it's being downloaded, on the right we're going to have the resulting machine code; it's going to get compiled by the browser to machine code. We'll take a chunk and that chunk will start to get downloaded through HTTP. And you can see that it's being compiled as the chunk arrives, and before that chunk has finished downloading, it's already been compiled to machine code. By the time the last chunk comes over, we're almost done. All we needed to do was finish compiling that last chunk. And that's critical for startup times.

This is a fairly new feature in the browsers, so it wasn't supported at first when WebAssembly was released. It's very new in some of the browsers. So if you've played with WebAssembly in the past, and you're like, "Oh, the startup times weren't as great as promised," they just hadn't got there yet with the streaming compilation. I'll play this one more time for anyone who missed it. We've got the WebAssembly on the left, and you don't need to be able to read it. The idea is just that the chunk is actually being compiled as it's coming along.
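To make the streaming idea concrete, here is a minimal TypeScript sketch of how a page might ask the engine to compile while the download is still in progress. `WebAssembly.instantiateStreaming` is the standard JavaScript API for this; the helper names `supportsStreaming` and `loadWasm`, and the fallback path, are illustrative assumptions and were not shown in the talk.

```typescript
// Sketch only: `supportsStreaming` and `loadWasm` are hypothetical helper names.

// Feature-detect streaming compilation, since older engines lack it.
function supportsStreaming(): boolean {
  return typeof WebAssembly === "object" &&
    typeof WebAssembly.instantiateStreaming === "function";
}

async function loadWasm(
  source: Promise<Response>,
  imports: WebAssembly.Imports = {},
): Promise<WebAssembly.Instance> {
  if (supportsStreaming()) {
    // The engine compiles chunks as they come off the network,
    // instead of waiting for the whole file.
    const { instance } = await WebAssembly.instantiateStreaming(source, imports);
    return instance;
  }
  // Fallback: download everything first, then compile.
  const response = await source;
  const bytes = await response.arrayBuffer();
  const { instance } = await WebAssembly.instantiate(bytes, imports);
  return instance;
}
```

A call might look like `loadWasm(fetch("/app.wasm"))`, where the URL is made up for illustration. The streaming path expects the server to send the file with the `application/wasm` content type.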

The next part was “safe”. I talk about safety. And the web has enjoyed a really great precedent where we have incrementally over the years, very carefully added new features and new capabilities. What the web can do today, versus what it could do 10 years ago, is dramatically different, right? But we've been very critical about security. This has actually been kind of a way to step back and start over from scratch, whereas the operating systems of the 80s and 90s gave you complete power, and then they kind of tried to take that power back. The browsers have given us that opportunity to do it right from the very beginning. Be overly cautious on security.

WebAssembly is going to follow in that tradition. It's going to be sandboxed just like JavaScript is. It's not arbitrary code on your computer, it can't make system calls, and it can't directly access the file system. In fact, running in a browser, the only way it can interact with those things is using the same APIs JavaScript does. So you call through JavaScript and you can use those file APIs, you can make those HTTP requests, but using the same APIs that we've standardized and we know are secure. Not just that, the sandboxing and stuff; WebAssembly itself has been designed with security in mind. You can see on the bottom there, there are things like control-flow integrity checks, stack protections, dynamic dispatch; these are getting into really low-level details, and I wanted to point out if you're not familiar with these, it's totally okay. But if you are, I think these are really exciting things about WebAssembly.
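As a rough illustration of "calling through JavaScript", the TypeScript sketch below instantiates a tiny hand-assembled module whose only connection to the outside world is a single imported function. The module bytes, the import name `env.log`, and the export name `run` are all assumptions made up for this example; the point is only that the module can reach nothing the host did not explicitly hand it.

```typescript
// Hand-assembled module (hypothetical, for illustration): imports `env.log`
// of type (i32) -> () and exports `run`, which calls log(42).
const sandboxBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // "\0asm", version 1
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import module "env"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                         //   field "log", func, type 0
  0x03, 0x02, 0x01, 0x01,                                     // one local function, type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,       // export "run" = function 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b, // body: i32.const 42; call 0
]);

// Everything the module is allowed to touch is listed here by the host.
const received: number[] = [];
const sandboxModule = new WebAssembly.Module(sandboxBytes);
const sandboxInstance = new WebAssembly.Instance(sandboxModule, {
  env: { log: (value: number) => { received.push(value); } },
});

// Calling the export runs the WebAssembly code, which calls back into `log`.
(sandboxInstance.exports.run as () => void)();
```

The synchronous `WebAssembly.Module` constructor is used here only because the module is a few dozen bytes; real applications would normally use the asynchronous APIs.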

If you're familiar with how traditional native programs work, compare that to WebAssembly: in WebAssembly, there is no arbitrary executable memory block. So a whole class of exploits cannot happen. Even if you compile your C or C++ to WebAssembly, you can still overrun a buffer, because C itself does nothing to stop you from doing that. But if you overrun that buffer, the attacker can't stuff arbitrary executable code in there and have it executed. The actual instructions themselves were designed to prevent this from happening.
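One way to see the separation between data and code is that a module's data lives in a linear memory, and reads outside that memory trap instead of reaching anything else. The TypeScript sketch below uses another hand-assembled module, with one 64 KiB page of memory and a single exported `load` function; the bytes and names are assumptions for illustration only.

```typescript
// Hand-assembled module (hypothetical): one page of linear memory and an
// exported `load(address)` that performs a 4-byte i32.load at that address.
const memoryBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // "\0asm", version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f,                   // type: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                           // one function, type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                                     // memory: 1 page, no max
  0x07, 0x08, 0x01, 0x04, 0x6c, 0x6f, 0x61, 0x64, 0x00, 0x00,       // export "load"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x28, 0x02, 0x00, 0x0b, // local.get 0; i32.load
]);

const memInstance = new WebAssembly.Instance(new WebAssembly.Module(memoryBytes));
const load = memInstance.exports.load as (address: number) => number;

// Returns true when reading at `address` traps with a RuntimeError.
function trapsAt(address: number): boolean {
  try {
    load(address);
    return false;
  } catch (err) {
    return err instanceof WebAssembly.RuntimeError;
  }
}

const PAGE_SIZE = 65536;                             // bytes in one WebAssembly page
const lastValidRead = !trapsAt(PAGE_SIZE - 4);       // last full 4-byte slot is in bounds
const pastTheEndTraps = trapsAt(PAGE_SIZE);          // one past the end traps
```

Inside the linear memory a compiled C program can still scribble over its own data, which matches the point above; what it cannot do is read or write past the memory's bounds or turn those bytes into executable code.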

Now, that doesn't mean that every exploit is impossible on WebAssembly. I want to make it very clear that there's a class of ones that are prevented, but there's also a whole other class that is almost impossible to prevent, because you need to be able to get real work done. Code reuse attacks, side channel attacks, and race conditions are all things that are absolutely still possible with WebAssembly. And as far as how possible they are, it ultimately comes down to the programming language. How easy is it for an attacker to do these types of things in the programming language that you're writing?

If some of this has been a little confusing, there's one thing I need to clarify. It was that low-level byte code. This is not something that you're going to be writing by hand. Now, you could write it by hand, just like you could write machine code by hand. But let's be honest, you probably won't. So WebAssembly is a portable binary instruction set for a virtual machine. So it's like a CPU instruction set, except that this is a virtualized machine, the goal being that you don't have to care what CPU the user actually has, and if you're familiar with the JVM, with Java byte code, it's very similar to the idea behind that. So you've got these instructions which are binary; this is the hexadecimal representation of that. And then the binary below, this is the add instruction, adding two 32-bit integers together. So this is intended as a compilation target, not something you typically write by hand. Instead, you take something like this, this is C or C++, and you'd compile that to the WebAssembly byte code on the right. And you'll notice that because it's binary, it's really not readable.
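Since the slide is not reproduced in the transcript, here is a small TypeScript sketch of what such a binary looks like in practice: a complete hand-assembled module whose one function adds two 32-bit integers. The export name `add` and the variable names are assumptions for illustration; the byte `0x6a` is the `i32.add` instruction.

```typescript
// A complete WebAssembly module, byte by byte (hand-assembled for illustration).
const addBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,                               // magic number: "\0asm"
  0x01, 0x00, 0x00, 0x00,                               // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export it as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body, no locals
  0x20, 0x00,                                           //   local.get 0
  0x20, 0x01,                                           //   local.get 1
  0x6a,                                                 //   i32.add
  0x0b,                                                 //   end
]);

// Compile and instantiate synchronously; fine for a module this small.
const addModule = new WebAssembly.Module(addBytes);
const addInstance = new WebAssembly.Instance(addModule);
const add = addInstance.exports.add as (a: number, b: number) => number;

const five = add(2, 3);
// 32-bit integer arithmetic wraps around on overflow.
const wrapped = add(2147483647, 1);
```

In normal use a compiler emits these bytes from C, C++, or Rust source; writing them out by hand is only useful for seeing how compact the format is.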

Primary Goals

So WebAssembly is that efficient and safe low-level byte code for the web. I want to take a step back and talk about how we got here. Why do we even have this WebAssembly thing? Well, for as long as we've been building web apps, real complex apps for the browser, for the web, there have been two primary goals that a lot of people have wanted. They've wanted to be able to use a language other than JavaScript. And ideally, they would like to have some improved performance when they do that. They want to use their C or C++. They want the advantages of static types, which JavaScript doesn't have. You can get some better performance when you use a typed language, typically; not always, but typically. Over the years it's funny that we've been working towards this goal with various approaches, and JavaScript has kind of been making this less and less of a big issue. The JavaScript performance of eight years ago, compared to today, is light years different. JavaScript is really, really fast now. So the whole improved performance thing is actually less of an issue than it was.

But let's take a look at what we've done to get here. There's Java Applets. When I talk about WebAssembly for the first time, a lot of people bring this up, they say, "Isn't this Java Applets all over again?" And to some extent, I see there's absolutely a parallel, right? It's very similar in theory. In practice, there's a lot of reasons why Java Applets didn't take off, didn't do well. The biggest one being it wasn't actually integrated into the environment. It was basically this black box where you could just run some Java code and render into some sort of arbitrary Canvas thing. But you couldn't take advantage of the web's APIs and it had its own class of security exploits. It was kind of a second-class citizen, and it never got really dedicated attention because the browsers were not its primary focus.

But you may come back and say, "That's fine. That was Java Applets, but the JVM is great. Why not just use the JVM or the CLR, the Common Language Runtime for .NET?" So this one's really complex and because of time, I'm not going to get really into it because I don't want to bore you, but really, it comes down to mostly misaligned goals. The CLR and the JVM are both actually really, really great virtual machines for what they're intended for. And what they're intended for was not what WebAssembly is intended for. Particularly, the biggest thing is the uniqueness of the web. We'll talk about that here in a second, that there are some unique challenges for the web that the JVM and the CLR were not designed for, and in a lot of ways it's great to be able to just restart and create some new thing. But the big things are about validation, so being able to do single-pass validation and compiling, the speed at which we can get that just-in-time compiled. Because that's so critical for the web.

Another thing is an experiment by Google. I wouldn't say it's an experiment, actually; it was a full-fledged effort by Google. Starting around 2011, and then again in 2015, with some additional work, there's something called Portable Native Client. There was Native Client, or NaCl, before that and then the portable variant came. The idea behind this was to take LLVM, if you're familiar with that, that LLVM IR, standardize a subset of that IR and make that a target for the web. This was adopted by a lot of people within the Chrome ecosystem. But the other browsers saw some of the gaps and saw some of the things that they wanted to do differently, and they decided not to adopt this standard that Google open sourced. So it wasn't proprietary, but effectively, Google was the only one who ended up actually implementing it.

Next up is asm.js. And this one, I think, got the most traction. This was spearheaded initially by Mozilla. And the idea behind this was to take your C or C++ and compile that down to a very strict subset of JavaScript. A subset of JavaScript that might look kind of quirky, you're kind of like, “Huh?” We see that C on the left and this asm.js, this subset of JavaScript, on the right. This is perfectly valid JavaScript; this will run in IE6 probably, or at least run in IE7, I know that.

But there's some quirkiness to it, right? What does the bitwise OR with zero, the | 0, do? That's kind of taking advantage of the fact that in JavaScript, the way the specification works, that's saying that the value is actually an integer instead of being a double. So this is basically a type annotation, taking advantage of some of the uniqueness that JavaScript provides. And this got us a lot of the way there; this actually did show us some very cool performance benefits. There were a lot of learnings behind asm.js. I think, really, I give asm.js the biggest credit for spearheading the initiative of WebAssembly.
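The slide's code is not in the transcript, so here is a small TypeScript sketch of the `| 0` convention described above. It is written in the asm.js style but is not a validated asm.js module (there is no `"use asm"` wrapper), and the function name `asmStyleAdd` is made up for illustration.

```typescript
// asm.js-style integer addition. The `| 0` coercions are ordinary JavaScript,
// but an engine that recognizes the pattern can treat them as int32 annotations.
function asmStyleAdd(a: number, b: number): number {
  a = a | 0;          // "a is a 32-bit integer"
  b = b | 0;          // "b is a 32-bit integer"
  return (a + b) | 0; // "the result is a 32-bit integer"
}

const plainSum = asmStyleAdd(2, 3);
// Fractions are truncated by the coercion: 1.9 becomes 1 and 2.9 becomes 2.
const truncatedSum = asmStyleAdd(1.9, 2.9);
// Overflow wraps around, exactly like a 32-bit integer would.
const overflowedSum = asmStyleAdd(2147483647, 1);
```

Because the coercion is plain JavaScript, the same function runs unchanged in engines that know nothing about asm.js; they just run it without the extra optimizations.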

But there are a couple of fundamental problems with it. It's still JavaScript, so we have some caveats to the fact that it's still JavaScript. It's still a textual representation. So it's not a binary encoding, so the file sizes actually were disappointing, to say the least. We can get much better if we're able to make it a machine readable binary rather than something that's intended to be human readable. The other thing is that we can't add new things easily because this is supposed to be just JavaScript, we can't add new cool things like SIMD or multithreading, all these things, without adding them to JavaScript which we did not want to do. This was primarily adopted by Mozilla, but Chrome and Edge also did some experimentation in their browser engines as well. They never formally supported asm.js, but they supported the pattern in general and did some optimizations around this.

WebAssembly is an Unprecedented Collaboration

But with all this effort, basically, we've primarily got Google and Mozilla having somewhat competing efforts. They all started to align around asm.js. And they're like, you know what? What would it take to do it right? What would that look like? And that essentially is what WebAssembly is. They said, "Hey, Apple, hey, Microsoft, can't we all just come together and create something brand new that we know is right and start from the ground up?" And I really want to drive this home. This is critical and I don't think enough people talk about this. WebAssembly is really unprecedented. All of the browser vendors getting together and creating - not just browser vendors, major companies - these major corporations, Microsoft, Google, Apple, Mozilla, even Intel has had a say in this. And Samsung, all these companies, they've come together and they've all created the first byte code standardized by all of these major companies. And it's free. It's not a proprietary program. There's no question that this is completely open. And it's not encumbered by patents and all those sorts of things.

But moving along, natural progression. So we've got this WebAssembly thing sitting next to JavaScript, right? They're able to talk to each other, but the question that I get, I'd say the most often out of all of them, is: is this going to kill JavaScript? Is that the goal? Am I just going to use another language? And by the way, I made this because I thought it was hilarious. I don't know if you're a big Daffy Duck fan, but I thought it was hilarious. But the answer to that question is no. I think that if you stop and take a look, JavaScript is a wonderful language. It has quirks; every language has quirks, especially as it starts to show its age. But JavaScript is a wonderful language and people love it. Not just because it has a monopoly.

So WebAssembly is not going to kill JavaScript. Now, I'm not a fortune teller. Of course, years from now, maybe WebAssembly will enable another language to compete against JavaScript neck and neck for popularity. But it's really hard to compete with built-in. You don't need to download a compiler; you just open up a text editor, add a script tag and start learning JavaScript. That's so easy and so great. It's really hard to compete against that.

The next question that's kind of related to this is then, what about compiling our JavaScript to WebAssembly? If WebAssembly is so great, it has all these benefits, why wouldn't we just do that? Why not? And this one is really complex as well, but I'll kind of break it down a little bit. JavaScript is just an extremely dynamic language. Now, you may not use all of the dynamic features of JavaScript on a given day, but if you look at the specification, and you say, “I want to compile JavaScript, any arbitrary JavaScript to WebAssembly,” then that means all of those dynamic features. Even the ones you may not use.

Case in point, I wanted to point out one thing. This isn't necessarily a knock on JavaScript; this is intended behavior, it's in the specification. JavaScript uses prototypes instead of class-based inheritance: new arrays don't get their own copy of the methods, they all delegate to the same live Array.prototype object. The methods on that prototype are active, they're real; you can call them on the prototype itself, even though you usually would not want to do so. And so you can actually push a value into Array.prototype, and every subsequent array that you create will appear to have that value at index zero.

However, the array length is zero. So it kind of puts things into a weird state. But this is perfectly valid JavaScript. Love JavaScript or hate it, it is JavaScript. And so if you say, "I can compile JavaScript to WebAssembly," you have to be able to compile this as well. While it's possible to do that, the point I'm trying to drive across is that fully spec-compliant JavaScript compiled to WebAssembly would be slower overall. The browsers have phenomenal virtual machines that have been hardened, and hardened, and hardened. There's no reason for you to ship your own virtual machine to the browser to be able to do these things.
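The behavior described above can be reproduced in a few lines. This TypeScript sketch wraps it in a function, `demoPrototypePush` (a made-up name), that undoes the change afterwards, since leaving a value on `Array.prototype` would affect every array in the program.

```typescript
// Demonstrates pushing onto Array.prototype, then restores the prototype.
function demoPrototypePush(): { length: number; first: unknown } {
  // Array.prototype is itself an array object, so `push` works on it.
  Array.prototype.push("surprise");
  try {
    const fresh: unknown[] = [];
    // `fresh` has no element of its own at index 0, so the lookup falls
    // through to the prototype, while its own length is still 0.
    return { length: fresh.length, first: fresh[0] };
  } finally {
    // Truncating the prototype removes the pushed element again.
    Array.prototype.length = 0;
  }
}

const prototypeDemo = demoPrototypePush();
```

A compiler that claims to handle arbitrary JavaScript has to preserve exactly this lookup behavior, which is part of why a full implementation ends up carrying most of a JavaScript engine with it.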

To this point, a strict subset of JavaScript could absolutely be compiled to WebAssembly and could be fast, potentially faster than JavaScript. So you really have to make that distinction, that nuance; you can't paint with one color or the other. There's a gentleman, Sebastian Markbåge, he works at Facebook, and he works on a project called Prepack, and he posted kind of a little bit of a troll or click-baity tweet: what if you could ahead-of-time compile your JavaScript to WebAssembly? And what he's referring to is an experiment they're doing with Prepack. And Prepack is designed primarily to optimize JavaScript by doing ahead-of-time evaluation of JavaScript. But one of the ideas they had was, “Well, sometimes we can detect certain statically typed patterns. Even though JavaScript isn't statically typed, we can detect at compile time, oh, we know exactly what this function is going to do, every value it can possibly take.” In those certain cases, they've been experimenting with compiling it to WebAssembly. So you'll kind of have a mix of JavaScript and WebAssembly, things that it couldn't compile and things that it could.

And the idea is to do that automatically and transparently. It remains to be seen whether in practice that's going to be useful, because you do have to call across the bridge from JavaScript to WebAssembly, and you're loading a different file, and there are caveats to this. Compiling a single function where all it does is add two numbers, that's not going to be worth it, right? JavaScript can run that just as fast as WebAssembly can.

WebAssembly v1 MVP is Best Suited for Languages like C/C++ and Rust

A lot of these caveats that I keep giving you have to do - not exclusively, but a lot of them do have to do - with the fact that we're taking an incremental approach to WebAssembly. The V1, or the MVP, as we call it, is designed to work best for fairly low-level languages. Languages like C, C++, and Rust. So it's ideal for those low-level systems languages, languages with very little, if any, dynamic runtime behavior, no garbage collection, things like that. It's not to say you can't do these things in WebAssembly, it's just the cost of doing them probably isn't worth it at this point.

I did want to point out, because some people have mentioned that I missed this before, that C++ these days, especially with the very latest versions, is no longer really just a systems language. It is a full-blown application language if you choose to use those new language features. It's true that some of those new language features don't compile to efficient WebAssembly currently. And a feature that's not new in terms of years, but relatively new depending on what circles you're actually programming in, is Exceptions. At least I was surprised at the number of people who still don't use Exceptions in C++, primarily because they have a lot of C and interop between that. But exceptions are one example where they don't compile to efficient WebAssembly. They absolutely can compile to WebAssembly, but they're not efficient yet. We don't have zero-cost exceptions under WebAssembly currently.

I also don't want you to glaze over though. If you're not a C, C++, or Rust programmer, this still can apply to you. Other languages are already planned or in the works. In fact, if you use Go, Go has already released very early support for WebAssembly. Now, Go has some dynamic features, or some features that are, I guess, higher level I think would be more accurate to describe, than languages like C or Rust. It has garbage collection. And how the garbage collector has to work currently in WebAssembly isn't ideal performance-wise, so it's kind of an early experiment, but the proof is that people are really excited about this future and they're already experimenting. It's a stated goal of the WebAssembly community group to support these languages at some point in the coming years as we work towards this.

How WebAssembly Will Impact Language Design and Implementation

Because of how much this has been able to drum up excitement and how many people are already experimenting with WebAssembly from a language compiler perspective, I want to talk a little bit about how WebAssembly will impact language design and compiler implementation; the actual compilation of your language into WebAssembly. How will that impact it? The biggest thing is that the web has very unique considerations, considerations that we've just really never had to think about in traditional languages. And I wanted to really, really call out - even though Ashley is here - the Rust team about this, because they have just been so phenomenal about already adopting WebAssembly. They had it as a stated goal in 2018 to really double down on making WebAssembly a first-class citizen of the Rust programming language and it shows. They've done a ton, a ton, a ton of work. And thank you Ashley and team for doing that to really make Rust a beautiful language to compile to WebAssembly.

What sort of thing do you have to care about? What does the Rust team care about? What considerations do you have to make on the web? The biggest one by far is file sizes. You know, if you've written an iOS app, or an Android app, or you've done really any native programming, how often did you look at your bundle size, your resulting size? You probably didn't, to be honest. Because it's not really a concern, within reason, right? Once you start to get over 500 megabytes, maybe you'll get on a blog post somewhere where people are complaining about you next to Facebook, where they're like, "Oh, look at your 500 megabyte bundle." Or you get up to a gig or something like that. But for the most part, people don't really, you know, care or even think about those sorts of things. But imagine trying to ship an app to the web that's 500 megabytes. No way, everyone would tell you, "Not going to happen.” Deal breaker immediately; your bounce rate would just be near 100%.

That's a really unique consideration for the web. People are very used to that instant gratification. I can go to your site, and I can use your app immediately without needing to wait, or download, or install, or anything like that. And this is all possible. It just requires being really, very conscious of that, both at the compiler level in the tool chains, but also as an actual developer. So the languages have to be really careful; you can't just link in an entire runtime library of code that you don't use, which is a very common thing. It's common to go into a native binary and see massive libraries where only a single function is used. And that's not a big deal. What do you do? But on the web, that's a deal breaker.

But there's even further things that you'll want to do if you want to get that real good experience. Things like lazy loading where you do code splitting. You split actual sections of your code base and load the code as you need it just in time, and that's really more complex. I would not be surprised if more languages, as the WebAssembly starts to get more popular, decide “We should do things like this at a language level. We should help you out. We should maybe statically analyze your things, give you primitives to give us hints or make things easier on you.” I'm really excited about experimentation with this. If you're on the web doing JavaScript type of stuff, Webpack and things like that have already experimented with providing those primitives and tricks.
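As a rough sketch of the lazy-loading idea, the TypeScript helper below defers loading a piece of code until the first time it is needed and then reuses the result. The helper name `lazy` is an assumption for illustration and is not taken from any particular bundler or toolchain.

```typescript
// Wraps a loader so the expensive work happens once, on first use.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      // First call: kick off the load and remember the pending promise,
      // so concurrent callers share a single download.
      cached = loader();
    }
    return cached;
  };
}

// Example with a stand-in loader that just counts how often it runs.
let loadCount = 0;
const getFeature = lazy(async () => {
  loadCount += 1;
  return "feature code";
});
```

In an application the loader would typically be a dynamic `import()` of a split-out chunk, or a fetch of a separate `.wasm` file, so that code only comes over the network when a user reaches the feature that needs it.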

The next thing is shared libraries; it's related to the file size thing. But not just that, not just the fact that you are linking in these giant libraries, but also the web doesn't really have the full-fledged UI toolkit that nearly every other platform is going to have. You don't have that. You have the DOM, and you have the host APIs, but they're far from as robust as iOS, or Android, or something like that. So that's an open question. If you have a popular library like Bootstrap or something like that, if there's some sort of equivalent where you're writing in some other higher-level language and compiling that to WebAssembly, do we really want every website to have to download that over, and over, and over again? Or can we come up with some way where the most popular libraries can be dynamically linked and cached locally? We have your browser caches and stuff, but they weren't really designed for this. These are all very open questions. How can we make this a better environment for developers and for code reuse between applications?

The next one is offline. If you do do that lazy loading, and that code splitting, and you're caring about all those things, now, you have to deal with the opposite problem. What happens if you lose that internet connection? You have to handle that gracefully. And that's another consideration. Normally, for most applications, you either choose on iOS or Android. You're either assuming a cell connection or you're assuming not a cell connection. And some people have gotten better at that, but this is another problem that's pretty unique to the web.

The next one is interop with JavaScript. This one won't apply to everyone, because some people are going to say, “I want to write some other language and I'm not even going to touch JavaScript.” But for the most part, JavaScript is great. And there's going to be a lot of code reuse that you're going to want to do. So languages that are able to interop with JavaScript, I think, are going to have a major leg up. And what do I mean by interop? I mean, calling across the boundary, being able to represent their objects and represent those things in a graceful manner; not perfect, obviously. But some languages that were designed for the web already are languages like Dart, or Elm, or Reason. And these are languages that all have a very good story about how you interop with JavaScript. I think that these are actually pretty promising languages for the future of WebAssembly. None of them currently compile to WebAssembly, because they're actually pretty high level, but they all have expressed interest in doing so once WebAssembly has added the necessary features to make it efficient for them.

A Typescript-like Language?

As a person who's kind of an armchair language designer myself, I'm most excited about these languages and potentially brand new ones. New languages that you have probably never even heard of, and a couple things that are exciting are the TypeScript-like languages, right? If you know JavaScript, which is the majority of people, even if you're not an expert, you know JavaScript, right? It's a pretty easy language to get the fundamentals down. Well, what about TypeScript? A TypeScript-like language, if we make it a subset of that. And there's already been early examples of this. There's a language called AssemblyScript, which, if you know TypeScript, is essentially TypeScript, except you'll notice that the type annotation is an i32, a 32-bit integer, rather than being a number, because it's not JavaScript.

But this is something that I would bet you within five minutes, everyone in this room would be able to write AssemblyScript and be able to be productive and write very performant code. Now, AssemblyScript currently is very early, and so it is really great for sort of algorithmic-type of functions that you would want to write, but it's not great for, "I want to write my entire application and I want to use the DOM," those type of things. It's not currently ready for that type of situation.
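To give a feel for the kind of algorithmic function meant here, the sketch below is written in the TypeScript-like style described above, with `i32` annotations instead of `number`. So that it also runs as ordinary TypeScript, it declares `i32` as an alias for `number`; that alias is an assumption for this example, since in AssemblyScript itself `i32` is a built-in type and the function would be marked `export` to expose it to the host.

```typescript
// Stand-in so the snippet compiles as plain TypeScript. Under this alias the
// arithmetic is ordinary JavaScript number arithmetic, not true 32-bit math.
type i32 = number;

// Iterative Fibonacci using only integer locals and a loop, the sort of
// number-crunching code that maps directly onto WebAssembly instructions.
function fib(n: i32): i32 {
  let a: i32 = 0;
  let b: i32 = 1;
  for (let i: i32 = 0; i < n; i++) {
    const next: i32 = a + b;
    a = b;
    b = next;
  }
  return a;
}

const tenthFib = fib(10);
```

Nothing in the function touches objects, strings, or the DOM, which is consistent with the point that this style currently suits self-contained computations rather than whole applications.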

When Should I Target WebAssembly Right Now?

To touch on that caveat a little bit more, when should you target WebAssembly right now? There's when you should target in the future, but then when should you target right now? Well, right now, it really comes down to heavily CPU-bound computations on numbers. That might sound like, “Well, I never do that.” But I think you'd be surprised what ultimately boils down to that, especially when you deal with compiled … most of the C, and C++, and Rust, and stuff like that, that gets compiled down mostly to computations of numbers. Loads, and stores, and memory, and stuff like that.

A couple examples I want to touch on. So games are the most obvious; these are things that were touted very early in WebAssembly's initial launch. Both Unity and Unreal Engine have support for WebAssembly. You can compile your game to WebAssembly today. There's still an open question for games on the distribution model. In my opinion, why games haven't taken off on the web is that game developers are still very cognizant of piracy. How do they get games to people and stuff? Because people aren't just going to come to their random website and download their game, right? So I think that's still an open question, but the technology is there now.

The next one is porting existing code. And I don't think you should discount this too much. There is a plethora of code that's been written over the last 20 or 30 years that is highly valuable and easily portable. It's written in C, it's written in C++. That can be compiled to WebAssembly and just used in your JavaScript apps as is, and get great performance and a great implementation. You don't have to port that to JavaScript. You don't have pointer arithmetic in JavaScript, so there are certain things you can't do as performantly, and things like that.

Case in point, these are just a small sample of libraries that I know of that have been compiled to WebAssembly. So if any of these are familiar, I imagine a number of them might actually be to you. But I think some of the coolest visual type of things are the video and the audio decoders, those type of algorithms that are really popular. In fact, case in point, Zoom, if you've ever used that video conferencing app, they have a web client that you can use in Chrome. You don't have to download anything, no extensions or anything. And it uses WebAssembly under the hood. It's running inside web workers. They have a web worker for the audio and a different worker for the video. It does the decoding in real time as it's coming over through the WebSocket using WebAssembly. I don't know what language they actually wrote it in, because I assume they did not write the WebAssembly by hand. But if I had to guess, it's probably C++ or Rust, but I don't actually know.

Next up that's kind of exciting for me is React Native DOM. If you've heard of React Native, it's a way of taking the React UI library, which is a JavaScript library for building UI applications, and running it on devices. React is made for the web primarily. React Native is made for running it on iOS and Android devices. However, there's a project called React Native DOM that lets you run React Native apps on the web.

So it kind of goes full circle. I know that's kind of mind bending, but it's like React was made for the web, then React Native was made for the devices, the mobile devices, and then React Native DOM made it come back to the DOM. So it's kind of going full circle, but the idea behind this is being able to reuse code, because right now, React and React Native can't really reuse a whole lot of their actual React code. They can reuse maybe some business logic stuff, because it's just JavaScript, but the actual view itself can't be reused. And that's what React Native DOM does: it lets you reuse the actual view logic that you create. The representation of the view.

With this implementation, there was a project called Yoga, which is the custom layout engine that React Native uses. It's written in C++, and it's pretty portable. So that's exactly what they did: they compiled Yoga to WebAssembly and boom, there you go. You can run React Native in your browser. Pretty cool project.

But if you are a web developer, if you are doing JavaScript today, chances are you're actually using WebAssembly and you don't even know it, which I think is pretty exciting to hear. I know I was surprised. That's because of the source-map package. If you use Babel, if you use LESS, if you use Webpack, if you use Firefox, if you use pretty much any kind of compiler or thing that deals with languages whatsoever with JavaScript, you're using source-map, because it's by far the de facto standard. It was originally written in JavaScript, and what the Mozilla folks did is they went and ported it to just normal, idiomatic Rust, because they love Rust so much because they created it. And they saw source map generation get almost 11 times faster.

I don't know about you, but on some of the apps I've worked on, the source maps have actually been the worst culprit of my terrible build times. I would very often disable source maps just because it was taking way too long. So that's a very, very welcome improvement in my build times. So pretty exciting. But those other use cases that I've been kind of hinting at: you want to write Elm, you want to write Dart, you want to write Reason, you want to write Java, you want to write .NET, whatever it is, and you want to target the web. Those use cases are coming and they're a stated goal.

Binary Stuff

Let's take a look at that binary stuff. I was just showing that example before. We got that factorial function. I compile that to binary. You might be looking at that and you might be just saying, "What the heck? This is too scary. How am I actually going to be able to target for the web? I'm so used to being able to view my source code and do I need to learn binary to use WebAssembly?" And I absolutely agree. Binary can be a little intimidating. None of the people I know who deal with binary all day long, like these type of things, we don't know what the binary instructions are. We don't memorize the instructions. I'm sure there's some people who are very proud of doing that, but I'm not one of them. I don't look at that and go, “It's like the matrix, I don't even see the code. I just see a function in this.” It doesn't work like that.

The pro tip is, don't worry about that. If you've written Java, if you've written C++, if you've written any of these higher level languages, how often did you actually decompile your binaries and start looking at that code? Hopefully, not often, if ever, but the idea is that tooling will make this a non-issue. That tooling will save you from all the issues with debugging and whatnot. Now, it is true that we're pretty early. So it is helpful to have at least a high level understanding of WebAssembly itself. You don't have to memorize the instructions, but there's a textual representation that's going to save you on this. So instead of looking at actual binary opcodes, we can instead look at a textual representation, an assembly language, if you will.

This is that factorial function written in the textual representation of WebAssembly. You may be looking at that now going, "Okay, I see if, I see return, I see the word factorial, which is nice, but I still don't actually understand how this works.” It actually is kind of a little convoluted.

Learning the Fundamentals

Let's actually learn the fundamentals right now. I think it's easier than you would think. We need to start by understanding that WebAssembly is a stack machine language. So what's a stack machine? Well, this might be CS101, but I think it's really critical to understand this point. A stack is a data structure in which there are two operations. And that's it, you can push or you can pop. Every time you push, it goes on top of the stack. You can't push into an arbitrary position and to remove something, you can pop it. Anytime you pop, it's always from the top of the stack. You can't arbitrarily insert or remove anywhere from the bottom, in the middle, or anything like that. So a stack machine is a machine in which the instructions are evaluated on an implicit stack. It's actually part of the machine itself. It's not some separate thing that the language has invented. It's stack machine.

The next question, if you're familiar with stack machines: why would they choose a stack machine? Well, the history there was, first, it started off as an AST machine, which is kind of a weird thing. And then there's SSA, static single assignment, I believe it is. And then there's register machines. Register machines are the typical ones. Your x86, your ARM. That's how a normal machine works.

And this one is kind of a talk on its own. You could probably write a paper on this about why stack machine versus register machine and all these sorts of things. And ultimately, I personally believe, because it all comes down to personal opinion, that it depends. There are different use cases, there are pros and cons to both. Ultimately, you have to decide on one. Virtual machines are typically done as stack machines, and some of the things you can talk about are things like smaller binary encodings. It is pretty easy to prove that stack machines can give you a smaller binary encoding than a register machine.

But in the case of WebAssembly, there were also some gains that they were able to see when they were finalizing the decision to go stack machine: it was going to be easier and faster to do a single pass of verification, and then also the compilation to the target machine code, and to be able to implement that virtual machine. If this is something you want to talk more about, if you really want to know more about that, absolutely come talk to me after because I'm happy to discuss it. But let's break down an example. We've got one plus two. We want to add two numbers together. How would we do this in WebAssembly? Well, there is the add instruction that we looked at earlier, which adds two 32-bit integers together. On the left is the mnemonic. This is a text representation of that opcode. And then on the right is the actual hexadecimal of that binary. So how would we write one plus two in WebAssembly? Just like this. We've got one, two, and then add.
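To make the mnemonic-to-hexadecimal pairing concrete, here is a small sketch that turns the three text instructions for one plus two into their raw bytes. The opcode values 0x41 for `i32.const` and 0x6a for `i32.add` come from the WebAssembly binary format; the `assemble` helper and the `TextInstr` type are hypothetical names invented for this illustration, and the helper only handles immediates from 0 to 63, which fit in a single byte.

```typescript
// Opcode bytes from the WebAssembly binary format.
const OPCODES = {
  "i32.const": 0x41, // followed by an immediate value
  "i32.add": 0x6a,   // no immediate; its operands come from the stack
};

// One instruction in text form: a mnemonic plus an optional immediate.
type TextInstr =
  | { op: "i32.const"; value: number }
  | { op: "i32.add" };

// Hypothetical helper: translate text instructions into raw bytes.
// Assumes immediates in 0..63, which encode as exactly one LEB128 byte.
function assemble(program: TextInstr[]): number[] {
  const bytes: number[] = [];
  for (const instr of program) {
    bytes.push(OPCODES[instr.op]);
    if (instr.op === "i32.const") {
      const v = instr.value;
      if (v < 0 || v > 63 || Math.floor(v) !== v) {
        throw new Error("this sketch only encodes integer immediates from 0 to 63");
      }
      bytes.push(v);
    }
  }
  return bytes;
}

// one, two, add
const onePlusTwo = assemble([
  { op: "i32.const", value: 1 },
  { op: "i32.const", value: 2 },
  { op: "i32.add" },
]);
```

Five bytes for the whole expression gives a feel for why compact encoding was one of the arguments for a stack machine: the add carries no operand names or register numbers at all.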

But let's evaluate this. Let's create our own virtual machine here with a stack and evaluate this one by one. So what the machine is going to do is, it's going to evaluate the first instruction, it sees it's a 32-bit integer, a one, it'll put that on the stack. It will see the second one, and it'll put that on top of the stack. Notice that it went on top. So here's our stack machine being evaluated. And then it's going to come to the next instruction. That next instruction is the add. Then implicitly, the machine knows somehow how to add. We don't care how it does add, but it knows that what it's going to do is take the top two operands, pop them off the stack, somehow magically add them together, and then put the result back on the stack, that three back on the stack. See how that's evaluating as a stack implicitly? That's a stack machine.

How would you then log this out? Let's say you've got a function that takes a 32 bit integer and you just want to log this out. Where would we put that? We don't have parentheses or anything like that to group these things together. Well, we would just put it right at the bottom, because it's a stack machine. So what was on the stack right before that call is the result of that, that number three, and that result would get passed to the call of the log, because that's its argument.
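The evaluation walked through above, including the final call to log, can be sketched as a tiny interpreter. This is a toy model for illustration, not how a real WebAssembly engine is implemented; the `Instr` and `HostFunctions` types, the `evaluate` function, and the one-argument host functions are all assumptions made for this sketch.

```typescript
// The three kinds of instruction used in the example.
type Instr =
  | { op: "i32.const"; value: number }
  | { op: "i32.add" }
  | { op: "call"; name: string };

// Functions the program may call. Each takes one i32 (simplifying assumption).
type HostFunctions = { [name: string]: (arg: number) => void };

// Evaluate instructions one by one against an implicit stack.
// Returns whatever is left on the stack at the end.
function evaluate(program: Instr[], host: HostFunctions): number[] {
  const stack: number[] = [];
  for (const instr of program) {
    switch (instr.op) {
      case "i32.const":
        // Push the constant on top of the stack.
        stack.push(instr.value | 0);
        break;
      case "i32.add": {
        // Pop the top two operands, add them, push the result back.
        const right = stack.pop();
        const left = stack.pop();
        if (left === undefined || right === undefined) {
          throw new Error("stack underflow in i32.add");
        }
        stack.push((left + right) | 0);
        break;
      }
      case "call": {
        // The argument is whatever sits on top of the stack right before the call.
        const arg = stack.pop();
        if (arg === undefined) {
          throw new Error("stack underflow in call");
        }
        const fn = host[instr.name];
        if (!fn) {
          throw new Error("unknown function: " + instr.name);
        }
        fn(arg);
        break;
      }
    }
  }
  return stack;
}

// one, two, add, then call log with the result.
const printed: number[] = [];
const remaining = evaluate(
  [
    { op: "i32.const", value: 1 },
    { op: "i32.const", value: 2 },
    { op: "i32.add" },
    { op: "call", name: "log" },
  ],
  { log: (x) => { printed.push(x); } }
);
```

Notice that the call needs no parentheses or argument list: placing it last is enough, because the three is already on top of the stack when the call is reached.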

So that's the very basics of WebAssembly. It really doesn't take too much to go from here to fully understanding all the instructions. Because really, it's just like, “Okay, what's the instruction for that? I want to multiply, I want to load from memory, what's the instruction for that?” So it's actually pretty easy if you do want to take a day and learn, “I want to write some WebAssembly by hand.” So that's one thing you can put on your LinkedIn profile.

What’s Missing?

But what's missing, right? So there are caveats, it is early. I did make a point to point that out. Well, the biggest thing I think right now for general adoption for everyone is direct access to the host APIs. In the browser, that means the DOM, being able to call the DOM, being able to make Ajax requests. Right now, WebAssembly itself is kind of a world on its own, where it sits next to JavaScript. And if you want to do anything useful with the outside world, if you want to make HTTP requests, you want to log to the console or access the DOM, you have to go through JavaScript. So when you load up the WebAssembly module from JavaScript, you basically go, "Here are the functions you can call. Call those functions." JavaScript has to do kind of the heavy lifting with the outside world. And that was done intentionally, because interoperating between JavaScript and WebAssembly with complex garbage collected objects is nontrivial. And so the community group really wanted to get this right, so we kind of punted on the problem and decided we'll just defer that to JavaScript for now.
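Here is a minimal sketch of that "here are the functions you can call" handshake, using the standard `WebAssembly.Module` and `WebAssembly.Instance` constructors. The module bytes are hand-assembled for this illustration: they import one function, `env.log`, and export one function, `run`, which computes one plus two and passes the result to the import. The names `env`, `log`, and `run` are arbitrary choices for the sketch, and the `any` typing is only there so it compiles in setups without WebAssembly type definitions.

```typescript
// A hand-assembled module: imports env.log (i32) -> (), exports run () -> ().
const moduleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // version 1
  0x01, 0x08, 0x02,       // type section, 8 bytes, 2 types
  0x60, 0x01, 0x7f, 0x00, //   type 0: (i32) -> ()
  0x60, 0x00, 0x00,       //   type 1: () -> ()
  0x02, 0x0b, 0x01,       // import section, 11 bytes, 1 import
  0x03, 0x65, 0x6e, 0x76, //   module name "env"
  0x03, 0x6c, 0x6f, 0x67, //   field name "log"
  0x00, 0x00,             //   kind: function, type 0
  0x03, 0x02, 0x01, 0x01, // function section: 1 function, type 1
  0x07, 0x07, 0x01,       // export section, 7 bytes, 1 export
  0x03, 0x72, 0x75, 0x6e, //   export name "run"
  0x00, 0x01,             //   kind: function, index 1 (index 0 is the import)
  0x0a, 0x0b, 0x01,       // code section, 11 bytes, 1 function body
  0x09, 0x00,             //   body is 9 bytes, no locals
  0x41, 0x01,             //   i32.const 1
  0x41, 0x02,             //   i32.const 2
  0x6a,                   //   i32.add
  0x10, 0x00,             //   call 0 (the imported log)
  0x0b,                   //   end
]);

// Typed as any so the sketch does not depend on WebAssembly type definitions (assumption).
const wasm: any = (globalThis as any).WebAssembly;

// The host side: JavaScript decides which functions the module may call.
const received: number[] = [];
const importObject = {
  env: {
    log: (value: number): void => {
      received.push(value);
    },
  },
};

// Synchronous compile and instantiate; fine for a module this small.
const wasmModule = new wasm.Module(moduleBytes);
const wasmInstance = new wasm.Instance(wasmModule, importObject);

// Calling the export runs the code, which calls back into the host's log.
wasmInstance.exports.run();
```

The module has no way to reach the console, the DOM, or the network on its own. The only thing it can touch outside itself is the single function handed to it in the import object.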

But the issue with that is that means you have to ship JavaScript, right? If you're doing C++ and you're using something like Emscripten, which is like a Clang compiler for WebAssembly, it comes with a runtime that includes a bunch of JavaScript glue code, essentially, and that's not ideal. You want to be able to ship an entire WebAssembly binary without any glue code. There's also a performance benefit, or there would be a performance benefit if you're able to just call the host bindings directly rather than going through JavaScript, because there's a bridge. It's a foreign function interface.

Related to that, though, is the garbage collection. I was talking about the complexities around this; if you're calling host objects, you're getting a reference to the DOM, you're making Ajax requests, what have you from WebAssembly, it has to be garbage collected, because you could give that to JavaScript, JavaScript can give it back to you, and vice versa. So we need to expose that garbage collection somehow to WebAssembly. And there have been proposals on both of these things; actually, for everything I'm talking about today, there are already proposals that are pretty far along. So we have a general idea of how this is going to work, how the garbage collection is going to work.

There's a benefit to this as well. The benefit is that if you have a higher level language like Go, Elm, Dart, etc., they won't have to ship their own garbage collector if they don't want to. They can use the built-in garbage collector and save a bunch of bytes and have a really great garbage collector, in fact. Now, some languages will choose not to do that, because maybe the garbage collector doesn't fit with their programming style, because most browsers use a generational garbage collector, a multi-generational garbage collector, in fact. But yeah, I think a lot of languages will end up using the built-in one for a number of reasons.

The next one is multithreading. This is another reason why people are really excited about WebAssembly. We're going to get real multithreading. So you'll be able to take advantage of that, unlike in JavaScript, where the only thing you have is message passing between workers and serialization. So that's pretty exciting. There's something called SharedArrayBuffer that is required, though, to do this, under the current way that projects like Emscripten can do it. And that was actually disabled back in March or April, I think it was, because of the Spectre and Meltdown exploits. But it's actually making a comeback, thankfully. Chrome 68, already out in the wild, has support for SharedArrayBuffer, and Mozilla, Edge, and Safari are not far behind. So we'll see on that. But eventually, we're going to get full multithreading in any arbitrary language, which is really exciting.

The next one I wanted to talk about is SIMD, single instruction multiple data. And this is basically hardware parallelization of vector computations, when you’ve got a vector and you want to do a computation on the entire vector. Maybe it's adding a number to, or shifting a bit across, an entire vector. There are instructions built into most modern CPUs, these SIMD instructions, that we'll be able to take advantage of from WebAssembly, which is pretty unique.
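As a rough illustration of what "one instruction, multiple data" means, the sketch below models a 128-bit vector as four 32-bit integer lanes and shows what a lane-wise add computes. The function names echo the `i32x4.add` and `i32x4.splat` instruction names from the WebAssembly SIMD work, but this is plain TypeScript emulating the semantics one lane at a time, so it demonstrates the result and not the speed; the `I32x4` type is an assumption for this sketch.

```typescript
// A 128-bit vector viewed as four signed 32-bit integer lanes (illustrative type).
type I32x4 = [number, number, number, number];

// What a single lane-wise add computes: each lane is added independently,
// wrapping on 32-bit overflow. Real SIMD hardware does all four at once.
function i32x4Add(a: I32x4, b: I32x4): I32x4 {
  return [
    (a[0] + b[0]) | 0,
    (a[1] + b[1]) | 0,
    (a[2] + b[2]) | 0,
    (a[3] + b[3]) | 0,
  ];
}

// Copy one scalar into every lane, so a number can be added to a whole vector.
function i32x4Splat(value: number): I32x4 {
  const v = value | 0;
  return [v, v, v, v];
}

// "Adding a number to an entire vector": one splat, one add.
const shifted = i32x4Add([1, 2, 3, 4], i32x4Splat(10));
```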

Moving along, there's the zero-cost exceptions that I was talking about before. That's something that we're also going to get; we're going to have zero-cost exceptions for any of those languages. And I wanted to point out one thing that's really exciting. I'm hoping, since we have a panel later in the afternoon about languages in general, that we'll get a chance to talk about it. There's something called algebraic effects, and it's a terrible name because it sounds super snooty and super academic, but it's a really cool feature that some languages like OCaml have been toying with, and I'm pretty excited about that. There's actually even been talk of supporting algebraic effects at the WebAssembly level so that you could have that in some of your higher level languages. There's a bunch more proposals that are advancing pretty quickly.

How do I get started with WebAssembly? Well, you can start at webassembly.org, but this is pretty low level stuff. This is going to talk about the specification mostly. Instead, if you want to get started with WebAssembly as a consumer, you're probably going to want to go to Awesome Wasm. I think it's a great repo that has resources for everything: links to all the different languages, tutorials, blog posts, all that stuff. I think it's really great. It's supported in all modern browsers, I wanted to point out. So except for Internet Explorer 11, obviously; it will never be supported there, unfortunately.

The Revolution is Just Beginning

But moving along, because I'm kind of running out of time. I just wanted to point out that the revolution is just beginning. This is a revolution and there's kind of a trick. I've been kind of lying to you; lying to you in the sense that I've been trying to simplify things, and that's that last part. I've been saying it's an efficient and safe low level byte code for the web. However, this is the first open and standardized byte code. I didn't say open and standardized byte code for the web.

If we go into the actual WebAssembly specification, and we search for the word browser, or web, or anything like that, we're going to find one spot where it says platform independent, can be embedded in browsers, but it could also be run as a standalone virtual machine, or integrated into other environments. That's the most exciting thing to me about why it can be so revolutionary, and that's that WebAssembly is not just for the web. The community group when they first started this, and I wasn't involved at the beginning, but they knew this, they saw that this is unprecedented. We got all these companies to finally agree to create this byte code. Can we make this generic? Can we make this something that's not just for the web, that other people can reuse? There's an inside joke in the community. And that's that WebAssembly is neither web nor assembly, hence the title of the talk. That's because it's not just for the web, and it's actually technically not an assembly language. It's a byte code.

There are already cases where WebAssembly is being used outside the web. There's something called ewasm, or Ethereum flavored WebAssembly, for those cryptocurrency smart contract things. I actually have no idea about blockchain stuff. Don't ask me about this, but I can point you in the direction of people who know all about it. In fact, I think there's a couple people here, but it's really cool because they have this custom virtual machine. I understand the high level of the technical side of ewasm. They have their own custom virtual machine to run these smart contracts, and one beautiful thing about WebAssembly is that it has very minimal non-determinism. It's only in a couple places and those places are very specifically called out. So what the ewasm folks did is they basically said, okay, we'll just not allow those specific sources of non-determinism and then add a couple instructions of our own for this concept of gas, or fuel, for their smart contracts. But then they know they can deterministically run code and know what it's going to do and how long it's going to take to do it, so that they can charge essentially for those contracts. That's my understanding of it.
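The idea of charging for deterministic execution can be sketched with a toy stack machine that deducts a fee per instruction and stops when the budget runs out. This is only an illustration of the concept and makes no claim about how ewasm actually implements metering; the `GAS_COST` price list, the `runWithGas` function, and the result shape are all invented for this sketch.

```typescript
type MeteredInstr =
  | { op: "i32.const"; value: number }
  | { op: "i32.add" };

// Hypothetical price list: every instruction has a fixed, known cost.
const GAS_COST = {
  "i32.const": 1,
  "i32.add": 3,
};

type MeteredResult = { ok: boolean; gasUsed: number; stack: number[] };

// Run the program, charging gas before each instruction executes.
// Because every step is deterministic, the same program always costs the same.
function runWithGas(program: MeteredInstr[], gasLimit: number): MeteredResult {
  const stack: number[] = [];
  let gasUsed = 0;
  for (const instr of program) {
    const cost = GAS_COST[instr.op];
    if (gasUsed + cost > gasLimit) {
      // Out of gas: stop before executing the instruction.
      return { ok: false, gasUsed: gasUsed, stack: stack };
    }
    gasUsed += cost;
    if (instr.op === "i32.const") {
      stack.push(instr.value | 0);
    } else {
      const right = stack.pop();
      const left = stack.pop();
      if (left === undefined || right === undefined) {
        throw new Error("stack underflow in i32.add");
      }
      stack.push((left + right) | 0);
    }
  }
  return { ok: true, gasUsed: gasUsed, stack: stack };
}

const contract: MeteredInstr[] = [
  { op: "i32.const", value: 1 },
  { op: "i32.const", value: 2 },
  { op: "i32.add" },
];

const funded = runWithGas(contract, 10);     // enough gas to finish
const underfunded = runWithGas(contract, 4); // runs out before the add
```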

The next one is wasmjit, and this one is just wild and blows my mind. If you're familiar with how operating system security works, there are different rings of security. One of the cool things about the fact that WebAssembly is so sandboxed is that theoretically, you could run it in ring zero, you could actually run it in the kernel itself, and theoretically, be safe. Now, it remains to be seen if this is a good idea or not, because we've got decades telling us this is a bad idea. But rethinking best practices, I think, has given us some of the best things in software engineering, period. So I think it's really cool that people are already experimenting. Rian Hunter is actually an engineer at Dropbox. So pretty exciting. He's interested in running web servers in WebAssembly at ring zero, because if you run it in ring zero, you can double the performance if your application is syscall-bound. If it does a lot of IO, like a web server does, you can get significant performance benefits at the risk of potentially letting HTTP requests hijack your entire server.

So it remains to be seen whether this is a good idea or not. But case in point, another person has taken this even a step further. They've made a microkernel that runs WebAssembly. That's the only thing it does: it runs WebAssembly, and it does it well. It won a contract for Google Summer of Code. He worked on that last summer. Pretty cool. Again, it remains to be seen whether this is a good idea or not. But we already have lots of people who are experimenting with this thing. So, efficient, safe, and low level byte code for the web: strike out that last part. I hope that this excited at least some of you. If you have questions, I can absolutely make myself available.

Questions & Answers

Phelps: One thing I want to point out is, these talks are tough because I want to be technical, so if there's a technical question that's burning you, you're just like, "I really want to know nitty-gritty," absolutely, come see me. I nerd out about this stuff. And I would love to talk to you about it.

Participant 1: I'm old enough to have worked in a language called Forth.

Phelps: Oh, yes, the stack machine. If you're learning stack machines, you write a Forth machine.

Participant 1: That's what I meant.

Phelps: Actually it's funny, we should talk after actually because it looks even more like that if you saw the full thing.

Participant 2: I actually have written WebAssembly in the textual thing to experiment with speed versus JavaScript. And it's [inaudible 00:51:09] and I can't get it to go faster than JavaScript. There's too much bit twiddling going on and too much communication across. What is the best possible way to speed up an application like that?

Phelps: I don't know if I mentioned this, because sometimes I miss this, but there are actually going to be cases right now where WebAssembly actually will be slower than JavaScript. Because right now, most of the virtual machines for WebAssembly don't do continuous just-in-time compilation, like learning, “Okay, you told me it was this shape, but it actually is this shape,” and these sorts of things. There are certain tricks you can do with JavaScript, after you've gotten it hot, where you've detected this is the pattern and this is the shape and you can compile those things away. As far as specifics, it depends on the browser. Mozilla's Firefox is by far the best in performance on WebAssembly right now.

It's actually pretty easy to make a benchmark in there where WebAssembly beats JavaScript in most things. If there's a particular thing that you're doing, like what type of thing you're doing, we can talk afterwards and I can give you some tips on improving that performance. The biggest issue that people come across is the bridge. So if you're calling from JavaScript to WebAssembly, if that's the primary thing you do, if you make a micro-benchmark and on every iteration of the loop you call in from JavaScript to WebAssembly, your benchmark's going to show that WebAssembly is slower, because what's actually slower is the bridge. Firefox has done a ton of work to improve that, but there's still work that needs to be done there as well.

 


Recorded at:

Mar 05, 2019
