
Fine-Grained Sandboxing with V8 Isolates



Kenton Varda explains how Cloudflare built a compute platform using V8 isolates instead of containers or VMs, achieving 10x-100x faster cold starts and lower memory footprints. He goes through technical details of embedding V8, distributing code, scheduling isolates, resource management, and security risks.


Kenton Varda is the architect of Cloudflare Workers, a "serverless" compute platform which distributes code to 165+ locations globally so that it always runs as close to the client as possible. Prior to joining Cloudflare, he created Cap'n Proto. Further back, while at Google, he wrote Protobuf v2 and open sourced it.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.


Varda: I'm Kenton [Varda]. I'm going to tell you today about how we built a cloud compute platform designed for massive multi-tenancy without using virtual machines or containers, but instead using V8 isolates. Let me start by explaining what it is we were trying to solve here. Cloudflare operates a network of servers located in 165 locations worldwide and growing all the time. Historically, if you put these servers in front of your website, they'd act as a glorified HTTP proxy. They could give you HTTP caching, they could detect and block security threats, and famously they give you DDoS mitigation.

But, a few years ago Cloudflare thought, well, it would be really cool if you could just run code on these machines, give us a piece of code to distribute, and have it run all those places and handle HTTP requests for your site directly there. This could be either to augment the functionality of Cloudflare itself, to implement new features that we hadn't implemented yet, or it could be to actually take entire applications and run them across the whole network, such that you're not really thinking about where it runs anymore. It just runs everywhere, kind of like what I feel like the cloud was supposed to be all along, rather than you choosing one or two locations.

But there's a challenge in this, a scalability challenge. It's not the scalability challenge that we normally think about. Normally we think about scaling to traffic, the number of requests that we can handle per second. Scaling to traffic is actually really easy for Cloudflare, because every time we add a new location, we can handle more traffic; add more servers to a location, and we can handle more traffic. Traffic is evenly distributed across the whole network, so our architecture scales out very nicely.

But the kind of scalability I'm talking about here is scalability in the number of tenants, the number of applications that we can be hosting at one time. The challenge with this is that, again, we don't want people to choose one or two or five locations where their software runs; we want it to run everywhere. We want everyone's code running in every one of our locations. And some of our locations are not that big. Some have lots and lots of computers, but others have maybe a dozen machines. How do you fit our 10 million customers on a dozen machines? It turns out that the existing server-side technologies aren't up to the task. What we really need is basically a 100x gain in the number of tenants we can host, and a 100x decrease in how many resources each one uses.

Quick intro on me. I worked for Google for a long time, where I was best known for open-sourcing protocol buffers. I created version 2, which is the open-source version. After I left Google, I created Cap'n Proto, as I mentioned, which is an alternative serialization and RPC framework. Then I founded a company called Sandstorm that was doing interesting things with decentralized hosting of web applications and security around that. The company made some cool tech, but I kind of failed on the business side in classic fashion. A couple of years ago, I was looking for something new. I talked to Cloudflare and they said, "Hey, well, we have this idea. We want people to run code on our edge. We're not sure how we want to do it. Do you want to take this on?" I said, "Yes."

A couple of warnings: this is actually the first time I've spoken at a conference, I'm not an experienced speaker, and I am also not a graphics designer. My slides are going to get worse and worse as the talk goes on, and you're going to want to avert your eyes at the end.


Getting back to efficiency, what kind of efficiency do we need here? Well, first of all, we need apps to be very small. We can't be shipping around even 100-megabyte app images because we can't fit enough of those on one machine to host the number of apps that we want. We want apps that are more like a megabyte or less in code size. We want the baseline memory usage, that is the amount of memory the app uses when it has just started up, and it's not doing anything in particular, has to be very low so that we can fit many of them.

Context switching, this is interesting. In our environment, because we have requests distributed across the whole world, although we need to host a lot of tenants in each location, each one of them is not going to be getting very much traffic, because they're only going to be getting the traffic that originates in that local part of the world. What that ends up meaning is that we are actually diffusing our traffic across all of the tenants on the machine: a machine might be handling tens of thousands of requests per second, with a different tenant handling each request. That means context switching overhead becomes a problem for us. It isn't for basically anyone else. A big VM cloud provider will usually pin your VM to a core, and it just runs on that core and doesn't switch at all. We're at the other end of the spectrum, to the point where even switching processes can be too much of a problem because of all the caches that get flushed in the CPU. So we need to potentially put lots of tenants in a single process.

Then finally, startup time. If we can get startup time to be really, really fast then we don't have to worry as much about memory usage, because we can just kick out the tenants who aren't currently handling traffic and start them back up again when they're needed. Ideally, we'd like something that's just a couple of milliseconds, so it's not even perceptible that we've initiated a cold start when the request comes in.

Other Use Cases

Now, we're not the only ones who need this stuff, just to give you an idea of some other use cases. If you have an API that you're exposing to the world over the web, especially like a server-to-server kind of thing, the clients of that API might not like the latency incurred by going over the internet to talk to it. They might want to host their code directly on your servers in some way. If you wanted to give them the ability to do that, you probably aren't going to give each of them a virtual machine or even anything heavyweight. You would like something very cheap.

If you're doing big data processing, say you have a gigantic dataset and you have multiple untrusted third parties that want to do MapReduces over this. In big data processing, you cannot bring the data to the software, you have to bring the software to the data. So you need a very efficient way of taking some code from someone and spreading it across all of your machines where the data actually lives.

Another use case is something like web browsers, where people are browsing the internet, they download code from every site that they go to so that the site can be more interactive running locally. But, don't we already have that? We've actually had that for quite some time, about 20 years now. So that's interesting because we've been looking at the server technology and it's too inefficient to work for this use case. But could it be that web browsers have already developed the technology that we need to solve our problem? It turns out that indeed they have.

Web browsers are optimized to start up code really, really fast because the user is sitting there waiting for it to start. They're optimized to allow for application code to be relatively small so that it can download quickly. They're optimized to have lots of separate sandboxes at the same time, not just for separate tabs but each iframe within a tab is potentially a different website and needs a sandbox. It may be an ad network, it may be the Facebook Like button. Those are all iframes. You don't see it, but they're all whole separate JavaScript contexts. And, of course, web browsers have been probably the most hostile security environment that exists for quite some time. If you can hack somebody's web browser, you can do a lot of damage. All you have to do is convince someone to click on a bad link and potentially you can get into all their other websites and so on.

V8: Isolates and APIs

So, this led us to the conclusion that what we want is the technology from web browsers. In particular, we chose V8, which is the JavaScript execution engine from Google Chrome. It seems to have the most resources behind it, basically, which is why we chose it. Though some of the others might work well too. We found that this works great as an engine for extreme multi-tenancy.

Let’s go into the details a little bit. We've been using this word isolate instead of VMs or containers. So what is an isolate? It actually comes from the V8 embedder's API. When you build around V8, you're using the C++ interface to V8 as a library. It has a class called Isolate, and what an isolate represents is one JavaScript execution environment. It's what we used to call a virtual machine, in the sense of the JVM, the Java Virtual Machine. But the term virtual machine now has two meanings, and most people mean something entirely different, so we use the word isolate instead.

Now, here's why, or one reason why, isolates turn out to be so much more efficient. With virtual machines, the application traditionally brings its own kernel and its own operating system, so you get huge images. Containers got so much more efficient because the operating system kernel is shared between all of the tenants. The applications only need to bring their own code, plus any libraries and maybe language environments that they build on top of. So they got a lot smaller and less resource-intensive.

With isolates, we can go further. Now there are all these traditionally user-space things that we can share between all of the tenants of our system. We have the JavaScript runtime, which includes a garbage collector and a JIT compiler, some very complicated pieces of code. If we can have only one copy of that code instead of several, that helps a lot. We can also provide high-level APIs: in containers, your API is the system call API, which is pretty low-level. If we do something much higher level, we can have, for example, the same HTTP implementation shared between all of the tenants. Hopefully, they then only need to bring their own business logic, and not a big pile of dependencies.

But we don't want to just go start inventing a bunch of our own new APIs for this. It turns out there are standards. The browser, as we know, has APIs for things like HTTP requests, traditionally XMLHttpRequest, though these days it's better to use the fetch API. What you might not know is that the browser also has standardized APIs for acting as an HTTP server, in what's called the Service Worker standard, which lets you run scripts on the browser side that intercept HTTP requests. This turns out to be exactly what we want for our use case, so we didn't have to develop any new APIs of our own. This is great because it means that code that runs on Cloudflare Workers is potentially portable to other environments, especially if some of the other serverless providers decide to also support standard APIs at some point.

This is an example of a complete HTTP proxy server written in about 10 lines of code, and it actually does something useful. This server checks for incoming requests whose URLs end with .jpg, and it sends those requests to a different back-end than everything else, something you might see all the time. What's interesting is that there are no imports, no require statements here. This is just using the built-in APIs of the platform, and in 10 lines of code we get something useful. That's how we make the code footprint so much smaller.
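The slide's code isn't captured in this transcript, but a worker along the lines described might look like the following sketch. The backend hostnames are placeholders, and the routing decision is factored into a small helper so it can be read on its own:

```javascript
// Sketch of a Workers-style proxy: .jpg requests go to an image backend,
// everything else goes to the default origin. Hostnames are illustrative.
// (Query strings are omitted for brevity.)
function pickBackend(url) {
  const u = new URL(url);
  return u.pathname.endsWith('.jpg')
    ? 'https://images.example.com' + u.pathname
    : 'https://origin.example.com' + u.pathname;
}

async function handleRequest(request) {
  // fetch() is the standard Fetch API provided by the runtime.
  return fetch(pickBackend(request.url), request);
}

// In the Workers runtime, the handler is registered service-worker style.
// The guard keeps this sketch runnable outside that environment too.
if (typeof addEventListener === 'function') {
  addEventListener('fetch', event => event.respondWith(handleRequest(event.request)));
}
```

Note how there is nothing here but routing logic: the HTTP parsing, TLS, and connection handling all live in the shared runtime.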

A lot of people lately have been talking about WebAssembly. With V8, we get WebAssembly for free; it's part of V8. We've actually enabled this in Workers, so with WebAssembly the promise is that you can now write in any language, not just JavaScript. There's just a little problem with this currently, which is that now you're back to shipping your own language runtime, because potentially every tenant has their own language that they want to use. So you see people who want to use Go, and now they're shipping the Go garbage collector, and the Go green threads implementation, and all of the Go standard libraries. It's very large, and it goes over the various limits.

This is not solved yet, but it will be. What we essentially need here is a way to do dynamic linking of WebAssembly modules, so that the Go 1.11 runtime could be something we share across multiple isolates, and each one brings its own application code on top of that. The good news is we're going to be working on that. If you go to Ashley Williams's talk tomorrow, she'll tell you all about what we're going to be building to fix this.

Resource Management

You can start to see why this is part of the operating systems track; it's looking a bit like an operating system, and it's about to look even more like one. Another thing that we have to do is figure out when to start up apps, make sure they don't use too many resources, and so on. In a traditional operating system, you have a bunch of processes using memory. They allocate the amount of memory that they want, and the operating system has to live with that and hope that everything fits in memory, because if it doesn't, it has to take drastic measures. In Linux, there's something called the OOM killer, the out-of-memory killer, that kicks in when you run out of memory and tries to choose the least important process and kill it. It doesn't always choose correctly, and it's a problem because these processes have state.

In our environment, these isolates are essentially stateless. When they're not actively handling a request, they don't have any other state that's important. We can kick them out at any time. So we end up with a completely different memory management strategy, which is we say, "Okay, we can set by configuration that we're going to use eight gigabytes of memory. We'll fill that up until it's full, and then we'll evict the least recently used isolate to make sure that we stay under that eight gigabytes." It's pretty neat to know exactly how much memory your server needs to use. Makes a lot of things easier.
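The eviction strategy described above can be sketched as a tiny LRU loop. The class name, the per-isolate sizes, and the budget below are illustrative only, not Cloudflare's actual implementation; a Map's insertion order doubles as a recency list:

```javascript
// Minimal sketch of LRU isolate eviction under a fixed memory budget.
class IsolatePool {
  constructor(budgetBytes) {
    this.budget = budgetBytes;
    this.isolates = new Map(); // Map preserves insertion order = recency order
    this.used = 0;
  }
  // Called whenever an isolate handles a request (or is first loaded).
  touch(id, sizeBytes) {
    if (this.isolates.has(id)) {
      // Move to most-recently-used position by re-inserting.
      this.used -= this.isolates.get(id);
      this.isolates.delete(id);
    }
    this.isolates.set(id, sizeBytes);
    this.used += sizeBytes;
    // Evict least-recently-used isolates until we fit in the budget.
    while (this.used > this.budget) {
      const [oldest, size] = this.isolates.entries().next().value;
      this.isolates.delete(oldest);
      this.used -= size;
    }
  }
}
```

Because isolates are stateless between requests, an evicted tenant loses nothing; its next request simply pays a few milliseconds of cold start.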

Now we have this trade-off basically, between memory and CPU, because if we have too many customers cycling through too often, then we'll be restarting isolates that we recently evicted too often. But it's a sliding scale and we can monitor it over time. There's not going to be an emergency where all of a sudden we're out of space, and then we can bump up the memory when we see that there's too much churn happening.

We need to make sure that an isolate can't consume all of the resources on a system. There are a couple of ways that we do that. For CPU time, we limit each isolate to 50 milliseconds of CPU execution per request. The way we do that is that the Linux timer_create system call lets you set up to receive a signal when a certain amount of CPU time has gone by. Then from that signal handler, we can call a V8 function called TerminateExecution, which will cancel execution wherever it is. If you have just a while (true) {} infinite loop, it can still cancel that. It essentially throws an uncatchable exception, and then we regain control and we can error out that request.

An isolate in JavaScript is a single-threaded thing; JavaScript is inherently a single-threaded, event-driven language. So an isolate is only running on one thread at a time, while other isolates can be on other threads. We don't technically have to, but in our design, we never run more than one isolate on a thread at a time. We could have multiple isolates assigned to one thread and handle the events as they come in, but what we don't want is for one isolate to be able to block another with a long computation and create latency for someone else, so we put them each on different threads.

Memory is interesting. V8 has a way for you to say, "I don't want this isolate to use more than this amount of memory, please stop it at that point." The problem is that when you hit that limit, it aborts the process, which means we've aborted all the other isolates on the machine as well. That's not what we want. Instead, we end up taking more of a monitoring approach. Each time we call into JavaScript, when it returns we check how much heap space the isolate is now using. If it's gone a little bit over its limit, then we do a soft eviction: it can continue handling in-flight requests, but for any new requests we just start up another isolate. If it goes way over, then we kill it and cancel all the requests. This works in conjunction with the CPU time limit, because generally you can't allocate a whole lot of data without spending some CPU time on it, at least not JavaScript objects. Typed arrays are something different, but that's a long story.
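That post-call check amounts to a small policy function. In this sketch, heapUsed() is a stand-in for whatever heap-statistics accessor the embedder actually exposes, and the two limits are illustrative, not Cloudflare's real numbers:

```javascript
// Sketch of the post-call memory policy: checked after each return from
// JavaScript, never enforced by aborting the shared process.
function checkIsolateMemory(isolate, softLimit, hardLimit) {
  const used = isolate.heapUsed(); // assumed accessor, for illustration
  if (used > hardLimit) return 'kill';       // cancel in-flight requests too
  if (used > softLimit) return 'soft-evict'; // drain in-flight, send new requests to a fresh isolate
  return 'ok';
}
```

The soft limit gives well-behaved tenants a graceful exit; the hard limit is the backstop for runaway allocation between checks.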

Another problem is we need to get our code, or the user's code, to all the machines that run that code. It sure would be sad if we had achieved our 5 millisecond startup time only to spend 200 milliseconds waiting for some storage server to return the code to us before we could even execute it. So what we're doing right now is actually we distribute the code to all of the machines in our fleet up front. We already had technology for this to distribute configuration changes to the edge, and we just said code is another kind of configuration, and threw it in there and it works. It takes about three seconds between when you upload your code and when it's on every machine in our fleet.

And because the code footprint of each of these is so small, this is basically fine so far. We have enough disk space. Now, it may come to the point where we don't at some point, and then we'll have to make the tradeoff of deciding who gets slower startups because we need to store their code in a more central location. But it probably would be a per [inaudible 00:22:39] thing instead of every single machine thing, and so shouldn't add too much latency.


Let me get to the thing that everyone wants to ask me about, which is security. There's a question as to whether V8 is secure enough for servers. You'll see some security experts saying that it isn't, and surprisingly enough, some people at Google saying that it isn't. What do they mean by this? Well, here's basically the problem. I said my slides were going to get ugly; they've gotten ugly, and not just visually but also in content. V8 has these bugs. In this particular case, these are two lines of code from deep in the V8 optimizer concerning the function Math.expm1, which calculates e to the power of x, minus 1. I'm not good at math so I don't know why you'd want that, but I'm sure there's a reason. This line tells the optimizer that the function returns either a plain number or NaN, not-a-number.

It turns out, though, that it can also return negative zero, and for some reason, negative zero is not a plain number in V8's type system. As a result, people were able to exploit this one little error to completely break out of the V8 sandbox, basically by tricking the system into thinking something was a different type, or triggering an optimization that shouldn't have happened. The details are complicated and really interesting. If you want to know more, check out the blog post Andrea Biondo wrote about it. It's very understandable; you don't need to know V8 internals.

So that sounds pretty bad. You can imagine that there are going to be lots of bugs like this in V8; it's a big, complicated system. The assertion is that because of this, V8 is not trustworthy enough, whereas, say, virtual machines and maybe containers are more trustworthy because they have a smaller attack surface. Well, here's the thing: nothing is secure. Security is not an on-or-off thing. Everything has bugs. Virtual machines have bugs, kernels have bugs, hardware has bugs. We really need to be thinking about risk management, about ways to account for the fact that there are going to be bugs and make sure they have minimum impact.

As for the frequency of bug reports in V8, there are two ways to look at it, one bad and one good. V8 has relatively more bugs reported against it than virtual machines do. That's bad because it shows there's a larger attack surface, more things to attack. But there's also a good side, which is that a lot of research is being done. We actually have access to V8 bug reports before the rest of the world. I look at them, and almost every single one is found by V8's own fuzzing infrastructure; it's found by Google essentially. They've put an amazing amount of effort into this. I actually just learned recently that not only does V8 have a bug bounty, where if you find a sandbox breakout Google will pay you $15,000, maybe more (if you're going to use it to exploit someone, you need to be getting more than that out of it, right?), but they also have a bounty for fuzzers. If you write a new fuzzer, basically a new test case, and add it to their infrastructure and it finds bugs, they will pay you for those bugs.

That was really interesting to me, and people do this. Every now and then someone will submit a new fuzzer, it'll find a bunch of new things, and they'll get paid out, and this is awesome. That's how much has gone into this. On the other hand, if you're looking at a security solution and it has no bugs ever reported against it, you don't want to use that, because what that means is that no one has looked. No one writes bug-free code. So this is why I'm feeling fairly comfortable about this.

Now let's talk about risk management. How can we limit the damage caused when a bug happens? There are things that you may do in your browser today to protect yourself against browser bugs, and some of them apply to the server as well. An obvious one: you probably install Chrome updates as soon as they become available. Well, we can do something on the server that's even better. We can see when the commit lands in the V8 repository, which happens before the Chrome update, and automate our build system so that we get that fix into production within hours, automatically. We don't even need someone to click.

Something that probably fewer of you do on the browser, but I'm sure a few of you do, is use separate browser profiles for visiting suspicious sites versus visiting your important sites. This is actually really easy to do in Chrome; there's great user management in other browsers as well, or some people prefer to just use separate browsers. We can do something similar on the server. We don't have the ability to spin up a process for every single tenant, but we can spin up one process for enterprise users, one for established users who have been paying for a while, and one for free users, if we were to have a free plan in the future (we don't currently have one). Then we can put additional isolation around those; we can put them in a container or in a VM or whatever else we want. That makes it pretty hard for an attacker to just sign up and get something good.

There are some risk management things we can do on the server that we cannot do so easily on the browser. One of them is that we store every single piece of code that executes on our platform, because we do not allow you to call eval to evaluate code at runtime. You have to upload your code to us and then we distribute it. What that means is that if anyone tries to upload an attack, we now have a record of that attack. If they attacked with a zero-day, they have now burned that zero-day, because we will take a look at that code. We'll submit it to Google, and then the person who uploaded it won't get their $15,000.

We can do a lot of monitoring. For example, we can watch for segfaults anywhere on any of our servers. They are rare, and when one happens, we raise an alert and look at it. The crash report says what script was running, so we can immediately look at that script, which we have available. Now, Chrome can't really do this, because it can't just upload any script it sees; that's potentially a privacy violation. And they can't investigate every crash report they get, because the browser runs on so many different pieces of hardware, some of which are just terrible, so they get a constant stream of these crash reports. It can be terrible hardware, or it could be that the user has already installed malicious software that's trying to modify Chrome. That happens a lot, and it causes a bunch of crash reports and all these other things. So they have a much harder time actually looking for the attacks.

What about Spectre, speculative execution side channels? A couple of weeks ago, the V8 team at Google put out a paper that basically said that they cannot solve Spectre, and so, therefore, Chrome is moving towards isolating every site in its own process instead of doing anything internally. In particular, they said timer mitigations are useless. When this came out, we started getting a lot of people asking us, doesn't that apply to Cloudflare? Are you totally vulnerable to Spectre?

Well, here's the thing. You have to be careful when you read this paper to understand what it is actually saying. It is saying that the V8 team has not been able to find anything else they can do except rely on process isolation. It is not saying that process isolation solves the problem. The problem is, if you look at the Intel side, there have been a bunch of different variants of Spectre found already. Each one requires a custom fix, and miraculously, so far, they've always been able to somehow fix it through some crazy thing they do in microcode or whatnot. Usually there's a gigantic performance penalty; sometimes people say the performance penalty isn't worth it.

But they're not done. There are going to be more bugs; they're just not found yet. We don't know for sure whether all of these bugs will have mitigations that are easy, mitigations that are easier than buying new hardware. It's kind of scary. When you talk to the people who have been researching a lot of this stuff, they say, "I don't know. We could see a bug that breaks out of virtual machines and not have anything we can do about it."

But in Cloudflare's case, we actually have some things we can do that basically nobody else can do. We're taking an entirely different approach here: we have removed all timers from our API. We can do that because we don't have any backwards compatibility legacy that we need to support, and because our API is at a much higher level, to the point where applications don't usually need to time things. If you're implementing, say, a new pthreads library, you need access to a high-precision timer to do it well, or a new garbage collector, you need a high-precision timer for that. But we provide those things in the platform; the application only does business logic.

The application can still ask what time it is, but the value that's returned does not advance during execution. It essentially tells you when the last network message was received. If you check the time, then run a Spectre attack in a loop, and then check the time again, it returns the same value. The difference is zero, so it looks as if your attack ran at infinite speed.
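As a minimal sketch of that design (the function names here are made up for illustration, not the runtime's actual internals): the real clock is consulted only when a network event arrives, and guest-visible time is simply that stored value.

```javascript
// Sketch of a frozen clock: guest-visible time is the arrival time of the
// last network event, so it never advances while guest code is executing.
let lastEventTime = 0;

function onNetworkEvent() {
  lastEventTime = Date.now(); // real clock is read only between executions
}

function guestDateNow() {
  return lastEventTime; // what the guest's Date.now() would return
}
```

Any CPU work done between two calls to guestDateNow() measures as zero elapsed time, which is exactly what denies a Spectre gadget its local stopwatch.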

We also don't provide any concurrency primitives because any kind of concurrency can usually be used to build a timer by comparing your own execution against whatever happens in the other thread. That's another thing that browsers can't do. They have this platform they need to support that already has explicit concurrency in it, and also has implicit concurrency in rendering. You start some rendering and then you do something, and then you check how much the rendering has progressed. In our platform, we can eliminate all of those.

Now, it is of course still possible to do remote timing. The client can send a request to their worker and see how long it takes to reply. That's over the internet, so there is noise in it, but noise doesn't make an attack impossible. No amount of noise will prevent a Spectre attack; the attack just has to amplify itself until the difference between a one bit and a zero bit is larger than the noise window. But the noise does lower the bandwidth of the attack, and it can lower it far enough that we have an opportunity to come in and notice that something fishy is going on.

We can look for things like high cache misses or other telltale signs that someone is doing something fishy. Then we have this other superpower which is that because the isolates are stateless, we can just move them at that point to another process or to another machine, let them keep running. So if we have a false positive, that's fine, the worker will continue to do its job. But now, we've taken the attackers and moved them away and everyone else is potentially safe. But this, as I said, it hasn't been tried before. There's a lot of research to do and we're going to be working with some of the foremost researchers in speculative side channels to check our work here. There'll be announcements about that soon, once we have the details worked out.

Big Picture

But I'm backing up a bit. We could just say, "Oh, there are challenges here, it doesn't work, let's not do this." But we can't, because there's too much value here. The history of server-side computing in particular has been one of getting to finer and finer granularity. When virtual machines started being used, they weren't thought of as secure, but virtual machines enabled the public cloud, which is clearly incredibly valuable. Containers have had their naysayers, but they enable microservices, which are incredibly valuable, as we saw in the keynote this morning. We can't just say it doesn't work. We have to solve the problems.

With isolate computing, we have the potential to handle every single event in the best place for that one event to be handled, whether that's close to a user or close to the data that it's operating on. That's going to change everything about how we develop servers. You're not going to think about where your code runs anymore; that's a lot less to think about, and everything is going to be faster, too. Imagine you have an app that uses some third-party API that's also built on this infrastructure, and that API is built on other APIs, so you've got this whole stack of infrastructure. Imagine that can all actually run on the same machine, a machine which itself is located in the cell tower closest to the user. That would be amazing. That's what we're going for here.

Questions & Answers

Participant 1: Great talk, thank you. I have a question about utilization, I mean CPU utilization. If we talk about scenarios like proxies, then probably the isolates don't do anything most of the time, just waiting for a response from a remote system. So do you run thousands of these threads with isolates in parallel, or do you have a thread per core, so the CPU is almost free? The second question is related: do you have any SLAs related to latency, like a minimal or maximal latency before your script will be up and running?

Varda: As I said earlier, we have different isolates running on different threads. We actually start a thread for each incoming HTTP connection; those connections come in from an nginx server on the same machine. This is kind of a neat trick, because nginx will only send one HTTP request on that connection at a time, so this is how we know that we only have one isolate executing at a time per thread. But we can potentially have as many threads as are needed to handle the concurrent requests. The workers will usually be spending most of their time waiting for some back end, so not actually executing that whole time. Does that sort of answer your first question?
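The scheduling model he describes (one thread per incoming connection, with the upstream proxy guaranteeing at most one request in flight per connection) can be sketched conceptually in Python. This is illustrative only, not the actual C++/V8 embedding; all names here are invented:

```python
import threading

def serve_connection(handler, requests, responses):
    # One thread per connection. Because the proxy (nginx, in the talk)
    # sends only one request per connection at a time, this thread runs
    # at most one request at once; concurrency comes from other threads.
    for req in requests:
        responses.append(handler(req))

def serve(handler, connections):
    """Spawn one thread per incoming connection and collect responses.

    `connections` is a list of request sequences, one per connection.
    """
    results = [[] for _ in connections]
    threads = [
        threading.Thread(target=serve_connection, args=(handler, reqs, out))
        for reqs, out in zip(connections, results)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Concurrency across connections comes from the threads, while the per-connection ordering guarantee means a single isolate never has to execute two requests simultaneously.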

Participant 1: But in this case, you could get, I don't know, 10 requests which consume all the CPU, and then other requests will just be waiting, and you'll have high latency. I mean, how do you do CPU sharing between thousands of requests?

Varda: Right. First of all, you're limited to 50 milliseconds of CPU time per request, and we cancel requests after that. But it's still possible, if there are enough isolates and enough requests, that you could run out of total CPU. Basically that's a provisioning problem. We need to make sure that we have plenty of CPU capacity in all of our locations. When we don't, when one location gets a little overloaded, what we do is we actually shift traffic. Usually we just shift the free users to other locations, meaning free users of our general service; there isn't a free tier of Workers yet. That offloads CPU to other places without affecting any of the paying users in any way. We've been doing that for a long time, and it works pretty well. And then the second part of your question?
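The per-request CPU budget he mentions can be illustrated with a cooperative sketch in Python. This is hypothetical and not Cloudflare's implementation; a real engine embedding can interrupt a script mid-execution (V8, for instance, exposes a termination API), whereas this sketch only checks the budget between work steps:

```python
import time

CPU_LIMIT_SECONDS = 0.050  # the 50 ms CPU budget mentioned in the talk

class CpuLimitExceeded(Exception):
    """Raised when a request exceeds its CPU-time budget."""

def run_with_cpu_limit(step_fn, steps):
    # Run the work in small steps, checking consumed CPU time (not wall
    # time) between steps and cancelling once the budget is spent.
    start = time.process_time()
    for i in range(steps):
        if time.process_time() - start > CPU_LIMIT_SECONDS:
            raise CpuLimitExceeded(f"cancelled after step {i}")
        step_fn(i)
```

Using `time.process_time()` rather than wall-clock time matters here: a worker blocked waiting on a back end consumes no CPU and so does not burn its budget, which matches the answer above about workers mostly waiting on back ends.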

Participant 1: It was about latency, SLAs.

Varda: So yes, SLAs. I don't think we have a specific SLA for latency. Well, actually, I don't know, that might be someone else's department. Okay, we don't have one, but it's usually pretty good.

Participant 2: You mentioned in the beginning that customers can use this to augment Cloudflare's functionality. You also mentioned that you store and inspect users' code. What kind of protections do you have to allay customers' fears that you will just steal that code, essentially?

Varda: We look at code only for debugging and incident response purposes. We don't dig through code to see what people are doing for fun; that's not what we want to do. We have something called the Cloudflare app store, which actually lets you publish a worker for other people to install on their own sites. Being able to do that with Workers is in beta right now, so it's something that will ramp up soon. You can then sell that worker to other users, and we'd much rather have people selling the neat features they've built on Cloudflare to each other in this marketplace than have us just build everything ourselves. There's so much more we can do that way. We'd rather focus on the core network platform and on building more servers than try to come up with everything under the sun that people can build on it.

Participant 3: All these tools that you're creating, are they going to remain proprietary Cloudflare things for your platform? Or are you going to actually start to maybe open-source some of these tools for other people to use them to do similar things or to benefit from?

Varda: We don't have specific plans yet, but I can tell you that, personally, I would very much like to start open-sourcing parts of this, probably in stages. We have this great glue layer that we use for binding APIs written in native code into JavaScript so they can be called from there. I would like to open-source that, but I can't make any announcements right now.

Participant 4: Is there any ability or thought about being able to store some sort of state on the edge? Because you're basically just processing data as it passes through. Is there a future where you can do some sort of fancier processing right there?

Varda: Storing state on the edge. We have a number of projects we're working on with the goal that, eventually, if you build an application on Cloudflare storage, every user of your application should have their data stored at the location closest to them. I have this thought experiment I like to think about: when people go to Mars, is the internet still going to work? Can you use web apps from Mars? On today's model, no, because you're going to wait for a half-hour round trip on every page load. But if we send a Cloudflare PoP to Mars, and an application were written on Cloudflare storage, would people then be able to use it, as long as they're only collaborating with other people on Mars?

If we solve that problem, then we've also solved the problem of slow internet in New Zealand, so it's important here too. There are a number of efforts underway. One of the first, already in beta, is called Workers KV. It's fairly simple right now: it's a key-value store, but it's optimized for read-heavy workloads, not really for lots of writes from the edge. But there are things we're working on that I'm very excited about, though not ready to talk about yet, that will allow whole databases to be built on the edge.

Participant 5: Next question. Considering that there's no free tier at the moment, what are the ways to get hands-on with the technology and experiment with it a little bit?

Varda: Great question. In the playground, you can actually play around with it just in your web browser. You write some code and it immediately runs, and it shows you what the result would be. That's free. Then, when you want to actually deploy it on your site, the cost is $5 per month minimum, and then it's 50 cents per million requests. You get the first 10 million free. So it's less expensive than Lambda for a lot of people.
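The pricing quoted above can be expressed as a small worked example. This is a hypothetical helper reflecting only the numbers stated in the talk ($5 monthly minimum, first 10 million requests included, then 50 cents per million), not an official billing formula:

```python
def monthly_cost_usd(requests):
    """Estimate a monthly bill from the figures quoted in the talk."""
    BASE = 5.00                  # $5/month minimum
    INCLUDED = 10_000_000        # first 10 million requests included
    PER_MILLION = 0.50           # 50 cents per additional million
    extra = max(0, requests - INCLUDED)
    return BASE + PER_MILLION * (extra / 1_000_000)
```

So, for example, a site serving 12 million requests in a month would pay the $5 base plus $1 for the 2 million requests beyond the included 10 million.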

Participant 6: Like you said, one of the awesome things about Cloudflare is its DDoS protection, handling some of the most hardcore traffic patterns on the internet. Now that we're running JavaScript at the edge in this controlled computing environment, does your DDoS strategy change at all when you get tons, and tons, and tons of load?

Varda: The DDoS protection happens before Workers, so your Worker is protected. That's one part. There is, of course, an interesting new question here, which is: could you use Workers to launch a DDoS on someone else? "Oh, now you've got 165 well-connected locations that can run your code and send lots of requests at someone." Yes, we don't let you do that. When people try, they get shut down really quickly. That's all I'll say about that, because I have to stop.

Recorded at:

Mar 29, 2019