The Most Secure Program Is One That Doesn’t Exist

Summary

Diane Hosfelt gives an overview of how Rust’s design gives security guarantees and discusses goals and visions for the future.

Bio

Diane Hosfelt is the security lead for the Mixed Reality team at Mozilla Research and works closely with the Rust Project to improve security with formal methods and unsafe code guidelines.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

On April 7th, 2014, OpenSSL released version 1.0.1g, which fixed a buffer over-read vulnerability that had been introduced accidentally two years earlier. This was a result of a missing bounds check in the TLS heartbeat extension, which allowed attackers to read arbitrary memory, memory that could contain passwords, it could contain private keys, maybe some social security numbers. And if you haven't quite yet figured out what I'm talking about, I'm talking about Heartbleed. It was one of the most catastrophic software vulnerabilities ever. Estimates for the cost of Heartbleed start at about half a billion dollars, but some impacts can't really be measured. 4.5 million patient records were stolen from a U.S. hospital chain and these people can't ever regain their privacy. That's lost. Their medical records are gone out into the world. Afterwards, experts asked how could this happen and how can we prevent it from happening again? And there are a few key points that stand out: lack of robustness testing, valuing performance over security, and having a complex code base with not a lot of resources to maintain it.

The Safest Program is the Program That Doesn’t Compile

So as software developers and architects, we often talk about designing systems with security in mind. I see politicians and educators and executives calling for more cybersecurity professionals. And I'm not here to say that that's wrong, but I don't think it's the right approach.

I am Diane Hosfelt. I work at Mozilla Research and I believe that privacy should be a fundamental right. Instead of building up a siloed group of security professionals, all programmers need to know something about security. You know, when you're in programming courses, when you're first learning, when you're interviewing, everyone asks you, "Well, what's the performance of this algorithm? What's its big O?" Well, why aren't we also asked to analyze the security of our software? Nobody says, "Oh, well, you're missing a bounds check here. What's going to happen?"

Security is equally as important as performance and we need to not think of a security performance trade off. We need to embrace both. So before diving in, [I'm not doing the raising hands, it's after lunch, we need to exercise] stand up if you are a C or C++ programmer. So stay standing if you can write a nontrivial C program that doesn't have a memory error. I'm just going to sit down over here because I am not one of those people. Exactly. Who is familiar with Rust? Stand up. Remember we're exercising, post lunch exercise. Stay standing if you would consider yourself a Rust programmer. And stay standing if you write Rust in your full time job. Just stay standing. It's fine. We'll give it to you.

Participant: I've cut my paycheck into running Rust a couple times.

Hosfelt: That works for me. You can all sit down now. Exercise portion is over. Nap time starts now. So I am going to lay out the guarantees that you get in a Rust code base and then I'm going to backtrack and discuss what mechanisms we use in Rust to actually achieve these safety guarantees that I'm going to very boldly state. And then, finally, I'm going to chat about some ongoing work to make Rust code more secure in 40 minutes. My goal is for you to walk out of this room with a better understanding of how Rust enables writing secure code and when you should or maybe shouldn't consider incorporating it into your projects.

A “Safe” Language

What can we say about a Rust program that compiles? Well, it's memory safe, it's type safe and guess what? It's thread safe. It sounds good. Let's go home. Like, we're great, we're done. But first, what does it actually mean to be safe? For today's purposes, a program is memory safe if memory access errors never occur. And I know that if anyone out there is a PL wonk, this is not your preferred definition, but it's the one I'm going for today. So that means we aren't going to have buffer overflows, we're not dereferencing null pointers. We don't have use after frees. We definitely don't use uninitialized memory and we avoid any illegal frees, like a double free or freeing something that hasn't been allocated.
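To make the buffer over-read case concrete, here is a minimal sketch of a heartbeat-style echo in safe Rust. The function name `echo_payload` and its parameters are hypothetical and are not taken from OpenSSL or from the talk; the point is only that slice access is bounds-checked, so a lying length yields `None` instead of adjacent memory.

```rust
/// Hypothetical heartbeat-style echo: the peer claims a payload length and
/// we hand back that many bytes of what it actually sent.
fn echo_payload(received: &[u8], claimed_len: usize) -> Option<&[u8]> {
    // `get` performs the bounds check and returns None rather than reading
    // past the end of `received`.
    received.get(..claimed_len)
}

fn main() {
    let received = b"bird";

    // An honest length returns the payload.
    assert_eq!(echo_payload(received, 4), Some(&b"bird"[..]));

    // A length larger than the buffer is rejected instead of leaking memory.
    assert_eq!(echo_payload(received, 64), None);

    println!("bounds checks held");
}
```

Indexing with `received[..claimed_len]` would also be checked, but it would panic on a bad length; returning an `Option` lets the caller decide how to respond.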

But why do we care about memory safety so much? Why are we always talking about this? Well, there's a huge overlap between memory safety violations and security related code. I recently looked at the potential impact of rewriting Firefox's style component in Rust. Last year, Mozilla shipped quantum CSS in Firefox. This is the part of the browser that applies all of the CSS rules to a page and it's a top down process on the DOM tree. If you're given the parent style, then you can calculate the child styles independently, which sounds like a perfect use case for writing concurrent parallel code, and it is. Well, Mozilla tried to do this a few times in C, well, C++ rather, and it did not succeed.

So Rust succeeded where C++ didn't because it's designed to make writing concurrent code painless. I never thought that I could write parallel code. I was really, really bad at it. But now, I can actually do it, only in Rust though. I really can't in other languages, still. And the key point here is that quantum CSS resulted from a need to improve page performance, but, as a very happy byproduct, we also improved security. So it's kind of like a win-win across the board, right? Over the course of its lifetime, there have been 69 security bugs in Firefox's styling component. (Sorry, which side is that? I'm really bad with right and left.)

The filled in pie chart is a breakdown of the categories of bugs and we'll get to that in a minute. And then the one on the other side that has a hole in it (sorry), is a breakdown of the bugs that have official security ratings and whether or not Rust fixes them.

What do we see here when we look at this graph for the first time? Well, almost 50% of our bugs were memory-related and then we have another 25% that were bounds-related. Well, that's exactly what Rust is designed to prevent. So I'm going to run through the actual numbers here. We have 32 bugs out of 69 that were memory problems like use after free. This happened all the time - well, not all the time because there weren't that many security bugs, but it was a huge proportion, as we can see. There were 12 bugs that were related to bounds errors. Another two were integer overflows. Three stack overflows, seven bugs related to dereferencing a null pointer or using uninitialized memory and one miscellaneous one that I'm going to ignore. And then we have this other chunk here that I call implementation and that's correctness bugs, things where they are the very classic case like, “Oh, if very important security check, then we're going to do this thing.” Otherwise, you know, let's continue on and I'll revisit this shortly.

But we can look at this and we see that 51 of the bugs would have been prevented by default in a Rust code base. It's almost 75%. And that is a lot of your time as a developer that you don't have to spend fixing these things. And it's also a lot of time that you, as a person who uses the internet, don't have to worry about crashes and remote code execution. It's a great situation, right?

Let's talk about the security critical bugs. They're rated from low to high or critical. Security-high bugs are generally critical bugs with a mitigating factor, or something like a history stealing bug, and critical bugs tend to be vulnerabilities that would allow things like remote code execution. So we have 43 of them that have an official security rating, and this is assigned by the Firefox platform security engineers. And of the 34 bugs that were rated as critical or high, again, out of a total of 43, Rust prevents 32 of those by default. So that's roughly 94% of the worst bugs that are related to things that Rust statically prevents, or at least prevents with minimal runtime cost.

Anyway, so that's why we care about memory safety so much. That's a huge number of things that we no longer have to worry about. If you love types as I do, then you've probably heard this. Now, because there's such a strong correspondence between memory safety violations and security vulnerabilities, we sometimes forget the powerful consequences of type safety. A number of vulnerabilities like SQL injection, cross-site request forgery, cross-site scripting, and buffer overflows are caused by unvalidated, untrusted input. And I'm going to revisit this shortly, but the Rust paradigm of enforcing input validity using the type system can minimize these problems. And that's what we mean by "a well-typed program can't go wrong," as long as you properly define what your type system is preventing. If your types check, then you're good, you're gold, go home. Not quite. Let's return to what we mean by safety. It turns out that the mechanism that Rust uses to enforce memory safety also enforces thread safety. If you like your concurrency paradigm, you can keep it. Supposed to be a laugh.

Thank you. Rust enables message passing, shared state, lock free, purely functional, whatever your preferred paradigm is for concurrency, you can keep it and the compiler is there for you, watching for your every mistake, but in a non-creepy way. It's in a helpful way. And so the way that we do this is we distinguish data types as thread safe or not. And again, more on that soon because we're backtracking to the mechanisms. Usually, languages rely on careful documentation to identify types as thread safe or not, but there's no semantic difference. Rust is all about semantics, because that's what a compiler understands. So we annotate types so that the compiler can differentiate between a type that's allowed to be safely passed between threads. And if you send a type that you haven't appropriately annotated as “Hey, you can send this between threads”, then the compiler is going to error. You aren't going to build and you're not going to have a terrible mistake in your code. That's great.

Rust 101

Now we get into the good stuff. You're saying to yourself, “all right, Rust is magical. There's puppies, there's kittens, we definitely have kittens. Some chocolate. How do I get there?” So we're going to have a very brief overview of how Rust achieves these guarantees. All of my examples come from "Rust by Example," and "The Rust Book," which you can buy in both dead tree or just look it up online. Documentation in Rust is great and will cover many things that I could not possibly cover even if you gave me all day. (I love transitions.)

Our crash course in Rust today is going to cover ownership, lifetimes, traits and concurrency. And then you'll probably see my arrays and null slides, but I highly doubt we have time for them. In fact, I'm sure that we don't. Okay. Rust uses what's called an affine type system. And that's a very fancy word for a concept that is much better explained through my artistic abilities here. At the core, memory safety and concurrency bugs are both caused by code inappropriately accessing data. Turns out this is also a problem with the security vulnerabilities. So to achieve both performance and memory safety, Rust uses a concept called ownership. It's not garbage collected. Rule number one, each value has a variable. This is the owner. There can only be one owner at a time. When the owner goes out of scope, the value is dropped.
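The three ownership rules can be observed directly. The sketch below is illustrative only: `Tracked`, `DROPS`, and `drops` are hypothetical names, and the counter exists purely so the moment of the drop is visible.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many `Tracked` values have been dropped so far.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked;

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn drops() -> usize {
    DROPS.load(Ordering::SeqCst)
}

fn main() {
    let before = drops();
    {
        let a = Tracked; // rule 1: `a` is the owner of this value
        let _b = a;      // rule 2: ownership moves, so `_b` is now the only owner

        // Nothing has been dropped while an owner is still in scope.
        assert_eq!(drops(), before);
    } // rule 3: `_b` goes out of scope here and the value is dropped

    // The value was dropped exactly once, even though two names referred to it.
    assert_eq!(drops(), before + 1);
    println!("dropped exactly once");
}
```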

Let's see what this looks like. We have a string, s1. We have the value "hello," which is a string and that's owned by the string, s1. Now we assign s1 to s2 and we say that we've moved s1 into s2, and then we decide that we're going to print out s1. Well, s1 doesn't own anything anymore. So that's not going to work. What's one option? (Sorry, I really like option puns.) Well, what we can do is we can clone, and what clone does is it creates a deep copy of the data. So now we have two distinct instances of this hello string, two different values in memory. s1 points to one of them. s2 points to the other. So we can still use s1.
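The slides are not part of the transcript, so the following is a reconstruction in the spirit of the corresponding example in "The Rust Book," not the exact slide. The rejected line is left as a comment so the block still compiles.

```rust
fn main() {
    let s1 = String::from("hello");
    let s2 = s1; // the String is moved: s2 is now the owner

    // println!("{}", s1); // rejected: error[E0382], borrow of moved value `s1`

    // `clone` makes a deep copy, so both names own their own heap data.
    let s3 = s2.clone();
    assert_eq!(s2, s3);                    // same contents
    assert_ne!(s2.as_ptr(), s3.as_ptr());  // different allocations

    println!("s2 = {}, s3 = {}", s2, s3);
}
```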

Now, you've probably heard that Rust creates really ergonomic, easy to read code and it's performant. So sounds like we're cloning everything all the time, right? No, we aren't doing that. What we do instead is we borrow and what that means is we use a reference. Here, we have s1 and we want to pass it to this function, calculate length, but we don't want to have to copy it because there's no need to and we don't want to have to constantly return things all over the place. So we pass a reference to it. And so s1 owns this string for the entire duration of this execution. It lets calculate length borrow it. And while calculate length has the borrow - I've got my water bottle and I give it to Ashley. Well, I can't drink from my water bottle right now. That's mine. Give it back. And that is ownership and borrowing. And as you see, when calculate length is over, this s goes out of scope, which means that the value is then returned to its actual owner, s1, and we can use s1 as we'd like to.
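For readers without the slide, this is the borrowing example as it appears in "The Rust Book," which the speaker cites as her source; `calculate_length` takes a reference, so `s1` keeps ownership throughout.

```rust
// Borrows the String instead of taking ownership of it.
fn calculate_length(s: &String) -> usize {
    s.len()
} // `s` goes out of scope here, but it never owned the data, so nothing is dropped

fn main() {
    let s1 = String::from("hello");

    // `&s1` lends the value to the function for the duration of the call.
    let len = calculate_length(&s1);

    // s1 is still valid because it was only borrowed, never moved.
    println!("The length of '{}' is {}.", s1, len);
}
```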

Hands up if you think that this is going to compile. Yes, it doesn't. We also explicitly control mutability. Mutability is explicit. It's checked by the borrow checker. You're not going to accidentally mutate something that you didn't want to, although you might end up putting muts everywhere in your code when you're trying to figure things out. I know I still do sometimes.

Borrowing

So this is just putting in the mut keyword and now our code works. We're very happy. And we need to talk about what the rules of borrowing are. Once a value is moved out of the variable, it's invalid. If you try to use it, the compiler rejects it. So we use a reference, we borrow it, and we have two rules that matter and one of these is going to become very important later on. The first one, your references always need to be valid. Second one, you can have either one mutable reference or you can have many immutable references. In this way, we prevent having - say we have two vecs, right? We have L1 and we have L2. Now L1 and L2 happen to point to the same memory. So we modify L1 and we don't think that we've done anything to L2 over there, but we actually have a situation where we've mutably aliased these lists together and now we are incidentally changing L2.
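A sketch of the mutable-borrow version, following the `change` example from "The Rust Book" (the exact slide is not in the transcript). The second half illustrates the "one mutable or many immutable" rule, with the rejected line left as a comment.

```rust
// Takes a mutable borrow, so it may modify the caller's String in place.
fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

fn main() {
    let mut s = String::from("hello"); // mutability is opted into explicitly
    change(&mut s);
    assert_eq!(s, "hello, world");

    // Any number of shared (immutable) borrows may coexist.
    let r1 = &s;
    let r2 = &s;

    // let r3 = &mut s; // rejected while r1 and r2 are still in use:
    //                  // cannot borrow `s` as mutable because it is
    //                  // also borrowed as immutable

    println!("{} and {}", r1, r2);
}
```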

And it turns out that this is very confusing whenever you try to reason mathematically about your program. So it's like mut doesn't like it when you change things for no reason.

Lifetimes

The other thing to discuss is lifetime. I'm going to return to this example here. I said before, when a variable goes out of scope, Rust frees that memory. We already know that this won't compile because we've moved the value. If it did compile, what would happen when s1 and s2 went out of scope? They'd both be freeing the same memory. So this is how the ownership and borrowing rules prevent things like a double free.

Sorry, spoilers. We also have code like this. So the brackets perform scoping and we have this variable R and then inside this inner scope, we create X. Now, we want to have R borrow this and spoilers, it's not going to work because we're going to free the memory that we want in the outside scope. That creates a dangling pointer, which is bad. If we removed these brackets and we didn't have an inner scope, what would happen? We would then be using R before it's been initialized, which is also a no-no. So this does not work at all.
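The scoping example described here matches the dangling-reference example in "The Rust Book." In the sketch below the rejected version is kept as a comment, followed by a version that compiles because the borrowed value outlives the reference.

```rust
fn main() {
    // Rejected version: `x` is dropped at the end of the inner scope while
    // `r` still refers to it.
    //
    // let r;
    // {
    //     let x = 5;
    //     r = &x;          // error[E0597]: `x` does not live long enough
    // }
    // println!("r: {}", r);

    // Accepted version: `x` lives at least as long as the reference `r`.
    let x = 5;
    let r = &x;
    println!("r: {}", r);
}
```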

Lifetimes get a lot harder. They're actually the thing in Rust that I struggle with the most. I'm really, really bad at them, which is why there's only two slides.

Types and Traits

Let's talk about the type system. There is far too much to cover, so I'm just going to show you an example and then we're going to talk a little bit about how we can use a type system to enforce invariants. Sometimes we talk about Postel's robustness principle, and it can be summarized as be conservative in what you send and be liberal in what you accept. And I'm not going to say what my opinion of this is, but Rust APIs generally don't follow this principle. We prefer to enforce validity via the type system. This code here shows a type, which is the Tweet, and it shows a trait, Summary. Now at the bottom, we see how we get that function, summarize, to be implemented for objects of type Tweet.
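The slide is not reproduced in the transcript; the sketch below follows the `Summary` and `Tweet` example from "The Rust Book." The `notify` function is adapted from the same chapter and returns a `String` here (the book's version prints) so the result is easy to check; treat that change as an assumption of this sketch.

```rust
pub trait Summary {
    fn summarize(&self) -> String;
}

pub struct Tweet {
    pub username: String,
    pub content: String,
}

// Implementing the trait is what makes a Tweet "summarizable".
impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

// Accepts only types that implement Summary; anything else fails to compile.
pub fn notify(item: &impl Summary) -> String {
    format!("Breaking news! {}", item.summarize())
}

fn main() {
    let tweet = Tweet {
        username: String::from("rustlang"),
        content: String::from("ownership is the gift that keeps on giving"),
    };
    println!("{}", notify(&tweet));

    // notify(&42); // rejected: the trait `Summary` is not implemented for integers
}
```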

And why am I showing you this? Well, we can enforce an invariant like this. We can only notify a type that implements Summary. And this is how I tend to think of traits, but they're a powerful and flexible way to control and specify type behavior. We also use them to implement things like Debug, which can control how a custom type is output when you're testing. Another example that I like is, let's talk about an ASCII wrapper. So we create a type, ASCII, and it wraps u8. What this does is it guarantees that any u8s that you put in this are valid ASCII. Functions will then take inputs of type ASCII and the type system statically enforces this requirement, which minimizes our runtime cost. What this also does is it means that any time we try to convert a u8 to an ASCII, we know where that could potentially break down. It's the point where we put it into the ASCII. We don't have to hunt down all of these things in our code thinking, "Well, where did my input go wrong?" It's very obvious. So we isolate our errors and we provide assurances that data is appropriately encapsulated.
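A minimal sketch of such an ASCII wrapper. The names `Ascii`, `new`, `as_char`, and `to_upper` are hypothetical, chosen for illustration rather than taken from the talk or from any particular crate. In a real code base the type would live in its own module so that the inner field stays private and `new` is the only way in.

```rust
/// A byte that is known to be valid ASCII.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ascii(u8);

impl Ascii {
    /// The single validation point: every failure surfaces here.
    pub fn new(byte: u8) -> Option<Ascii> {
        if byte.is_ascii() {
            Some(Ascii(byte))
        } else {
            None
        }
    }

    pub fn as_char(self) -> char {
        self.0 as char
    }
}

/// Takes `Ascii`, not `u8`, so it never needs to re-validate its input.
pub fn to_upper(c: Ascii) -> Ascii {
    Ascii(c.0.to_ascii_uppercase())
}

fn main() {
    let a = Ascii::new(b'a').expect("'a' is ASCII");
    assert_eq!(to_upper(a).as_char(), 'A');

    // 0xC3 is the first byte of a multi-byte UTF-8 sequence, not ASCII.
    assert!(Ascii::new(0xC3).is_none());
    println!("validated at the boundary");
}
```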

Concurrency

We also use traits for a very important application in concurrency. Thread safety is enforced through the use of two traits, send and sync. While we can use them to define methods to be implemented, they're also used as markers. Send indicates that a type can be moved between threads and sync indicates that it can be shared. Here's the catch: you, as the programmer, have to ensure that send and sync are used appropriately. The compiler cannot figure this out because it has no way to semantically deduce this. However, once you write some code that looks like that, you've given a little semantic hint to the compiler, which it can then propagate and check statically. If you try to send a variable that doesn't implement send or sync to a different thread, the compiler will reject your code. So traits and type checking, they don't just give us the ability to do generics and polymorphism, but this also helps us prevent thread synchronization bugs.
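As a sketch of how the compiler uses these marker traits: `std::thread::spawn` requires its closure to be `Send`, so sharing a counter through `Arc<Mutex<_>>` is accepted while the same code with `Rc` is rejected. The helper names `parallel_count` and `assert_send_sync` are hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Compiles only for types that are both Send and Sync; a compile-time check.
fn assert_send_sync<T: Send + Sync>() {}

// Spawns `workers` threads that each increment a shared counter once.
fn parallel_count(workers: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            // `spawn` demands a Send closure; Arc<Mutex<usize>> qualifies.
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    assert_send_sync::<Arc<Mutex<usize>>>();
    // assert_send_sync::<std::rc::Rc<usize>>(); // rejected: Rc is neither Send nor Sync

    assert_eq!(parallel_count(8), 8);
    println!("all increments observed");
}
```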

True story. I recently got frustrated with some code I was writing and I kind of just, like, tossed send and sync implementations in, and the reviewer saw this and was like, "Why are you doing this? You know that we have a thread safe wrapper for this type and you should be using that." I had no idea that we had that, made a huge mistake. I did not verify that the structure was actually thread safe before telling the compiler it was. So bad, Diane. The compiler works really hard to keep us from making mistakes. But there are some classes of bugs that Rust doesn't solve. The send and sync example is one: I, the programmer, did not appropriately verify that the data I had marked as thread safe actually was. (Sorry, we do not have time for that, I don't think. I have pretty pictures.)
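To show what "telling the compiler it was thread safe" looks like in code, here is a hedged sketch. `Handle` and `bump_on_other_thread` are hypothetical and unrelated to the code in the anecdote. The `unsafe impl Send` line is a promise that the compiler accepts without verifying; in this sketch the promise is justified by a stated invariant, which is exactly the step the anecdote skipped.

```rust
use std::thread;

// Raw pointers are not Send, so this type is not Send by default.
struct Handle(*mut u32);

// SAFETY (an assumption this sketch relies on): the pointer always comes from
// `Box::into_raw` in `Handle::new`, and a Handle is never copied, so moving it
// to another thread transfers sole ownership of the allocation.
unsafe impl Send for Handle {}

impl Handle {
    fn new(value: u32) -> Handle {
        Handle(Box::into_raw(Box::new(value)))
    }

    // Consumes the handle and reclaims the allocation exactly once.
    // (A Handle that is never consumed leaks; acceptable for a sketch.)
    fn into_box(self) -> Box<u32> {
        // SAFETY: see the invariant documented on the Send impl above.
        unsafe { Box::from_raw(self.0) }
    }
}

fn bump_on_other_thread(start: u32) -> u32 {
    let handle = Handle::new(start);
    thread::spawn(move || {
        let mut boxed = handle.into_box();
        *boxed += 1;
        *boxed
    })
    .join()
    .unwrap()
}

fn main() {
    assert_eq!(bump_on_other_thread(41), 42);
    println!("moved across threads");
}
```

If the invariant in the comment were false, the compiler would still accept this code, which is why such impls deserve the scrutiny the reviewer applied.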

The Problem with Correctness

So far I've discussed what type safety, memory safety, and thread safety mean for Rust and how we achieve it. That's all mostly settled. Now, I'm going to talk about my goals for the future of Rust. And this is all things that I personally hope for the language and I'm working to make happen. And if anyone's interested, please talk to me afterwards. While memory vulnerabilities account for the majority of the most severe software bugs, there are two other problems that I think we can begin to address. Timing side channels and also implementation errors. What is the problem with correctness? Rust is great. It prevents the worst types of vulnerabilities. I love types. It has types. It makes writing concurrent code painless and easier to understand, but unfortunately, this is only half the battle when we're trying to create a secure system.

There are classes of bugs that Rust explicitly does not and cannot address at this moment, particularly correctness bugs. This is a code sample from the quantum style code, the quantum CSS code rather. So while the CSS code was written in C++, we had a bug up here that was a trivial history stealing bug. And it did not look anything like this because they have two different architectures, but it was the same type of thing. And the fix was that you needed to see if this SVG document was being used as an image. And, well, when they first wrote the code, they forgot this check. It's a trivial history stealing bug, so it's security high, very bad and no way to catch this.

When the rewrite happened, this check was unfortunately forgotten again. There were automated tests to catch visited rule violations like this, but in practice, they did not suffice to detect this. For some reason, we temporarily turned off the tests that tested this feature. Tests aren't very helpful if you don't use them. So yes, the risk of reimplementing logic errors can be reduced if we actually use good testing practices, if we have good test coverage and we actually run the tests. However, if this hadn't happened previously, this would have taken possibly much longer to discover. If we don't reintroduce but rather just introduce new correctness bugs, the only way to figure them out is, you know, maybe it causes a crash, but this could take years, which means that we will have shipped faulty code for far too long. And I'm going to return to this because first I want to talk about side channels.

Side Channel Mitigation

Side channels are hard. Honestly, side channel attacks are just hard to deal with from a defensive point of view. They're pretty easy to deal with from an attacker's point of view because there are tons and they always seem to work. You build this great vulnerability free application and then all of a sudden, some princess yells out the password through a window that you weren't expecting and everything's broken. Side channels exploit information that's either accidentally leaked or indirectly inferred from data that's available through non-nefarious means. Some examples are using the time that it takes to execute a cryptographic algorithm to recover key material. That's the one I'll be focusing on. Using the readings from an accelerometer to attack your phone PIN, cache timing attacks, Rowhammer. There are a lot of them and they're all terrifying.

Like I said, I'm only going to talk about timing attacks. A common mitigation for timing attacks is that we write what is sometimes referred to as constant time algorithms. I prefer secret independent because constant time makes me think that it's going to be really, really fast and it has nothing to do with that. It has no dependencies on the secret, but it has other dependencies on the input size. (Sorry, that's a pet peeve of mine.) Anyways, there's a problem here. How do we actually write secret independent code? Usually, I mean, I like to write my code in a higher level language that I can read and the generally accepted … So what happens then, is we write our code, it's run through a compiler, a JIT, an interpreter, which does certain optimizations. Even if we think that we're writing code that isn't dependent on a secret, well, the optimizations might kind of ruin that for us. So the accepted mitigation for this is hand-optimized assembly. Who loves writing hand-optimized assembly? Well, that's more hands than I was expecting. That's fine.
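A minimal sketch of a comparison written in the secret-independent style: it inspects every byte regardless of where the first mismatch occurs, instead of returning early. The name `eq_secret_independent` is hypothetical. As the talk points out, the optimizer is free to transform source like this, so the sketch shows the coding pattern and is not a guarantee about the generated machine code.

```rust
/// Compares two byte strings without branching on their contents.
/// The lengths are treated as public information.
fn eq_secret_independent(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }

    // Accumulate differences with XOR and OR; there is no early exit, so the
    // loop runs the same number of iterations wherever a mismatch occurs.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

fn main() {
    assert!(eq_secret_independent(b"s3cret", b"s3cret"));
    assert!(!eq_secret_independent(b"s3cret", b"s3creT"));
    assert!(!eq_secret_independent(b"s3cret", b"s3c"));
    println!("comparisons done");
}
```

A naive `a == b` may stop at the first differing byte, which is what lets an attacker learn a secret one byte at a time from response times.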

Possible future work in this space, and in particular what I think is a really good opportunity for Rust, is: can we create types or traits that are intended to encapsulate sensitive data, and can we then maintain certain invariants using these types or traits? These could be things like ensuring that our memory is appropriately zeroed out, making sure that we have data-independent operations. What's special about Rust? How would it enable this when other languages can't? Well, the way that we check ownership and lifetimes is we statically analyze the code at compile time. If we can build these concepts and these invariants into the type system, then we might have an opportunity to take some of the burden off of ourselves and our brains, for those of us who don't want to write assembly by hand, and then the compiler will trace the sensitive values through the code and make sure everything's handled appropriately.
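The talk describes this as future work at the language level. As a rough library-level approximation of the idea, and not the compiler-enforced mechanism being proposed, a wrapper type can already redact itself in debug output and overwrite its buffer when dropped. `Secret` and `expose` are hypothetical names for this sketch.

```rust
use std::fmt;

/// Wraps sensitive bytes. Illustrative only: it does not protect copies the
/// caller makes through `expose`, or earlier buffers left behind by reallocation.
pub struct Secret {
    bytes: Vec<u8>,
}

impl Secret {
    pub fn new(bytes: Vec<u8>) -> Secret {
        Secret { bytes }
    }

    /// Access is explicit, so uses of the raw bytes are easy to audit.
    pub fn expose(&self) -> &[u8] {
        &self.bytes
    }
}

// Debug output never includes the contents.
impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret(<redacted>)")
    }
}

impl Drop for Secret {
    fn drop(&mut self) {
        for byte in self.bytes.iter_mut() {
            // A volatile write, so the compiler may not elide the zeroing as
            // a dead store just before the buffer is freed.
            unsafe { std::ptr::write_volatile(byte, 0) };
        }
    }
}

fn main() {
    let key = Secret::new(vec![0xde, 0xad, 0xbe, 0xef]);
    assert_eq!(format!("{:?}", key), "Secret(<redacted>)");
    assert_eq!(key.expose().len(), 4);
    println!("{:?}", key);
} // `key` is dropped here and its buffer is overwritten with zeros
```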

Why do I care about this? I can't read assembly. I'm not even joking. I once had to implement a big num library in x86. This was the warmup project for one of my classes in college. Well, I turned it in and I got a 0 out of 20, and my professor gives it back and he's like, "You can do better." So I redo it, get it back, 5 out of 20. All right, I'm taking it. I'm done. If we can handle things like this in what I consider to be a Diane readable language level, that makes my life better. So that's why I want it. I like using the compiler to reduce the burden on me.

Formal Methods

Now, I'm going to talk briefly about formal methods. Who here knows what formal methods are? Yes, they're great. They're fun. Formal methods are a way to mathematically reason about a program. We often talk about proving correctness, but it's really hard to know what correct means. We then introduce the idea of a specification. Can we make some set of statements and then mathematically prove that for all possible executions, this program conforms to our specification. It's the exact opposite of fuzzing. In fuzzing, you just try out massive amounts of completely random data until you've tried enough, you've exhausted whatever time you have for fuzzing and you see if there's been any undefined behavior.

Now, the first proof that I remember writing was the Pythagorean Theorem. And when I was first presented with this problem in my geometry class, I thought to myself, "Okay, all right. Well, this is obvious." Like, just did some examples. Much later on, one of my professors told me that when a mathematician says something is obvious, it means that it's obvious it can be proved, but not that the proof itself is obvious. And so the Pythagorean Theorem was the first time that I was confronted with this very powerful concept that I can prove that something is true for numbers that I don't even know exist.

Now, why don't we do this with our programs? Let's go back to that concept of aliasing that I talked about before. Aliasing happens when you have multiple ways to access the same data location in memory. Well, Rust prohibits mutable aliasing. So that gives us the ability to do things like functional purification and all these fancy things that could make formal verification more feasible in Rust. The ownership system, which is how we enforce this no mutable aliasing, is the gift that keeps on giving. First, we get memory safety, then we get a little bit of thread safety, and now it provides a path towards enabling advanced code analysis. There are a few potential approaches. You have equivalence proving. So say you have an existing C++ code base. If you want to replace part of the functionality with Rust, you can prove that your new Rust code performs equivalently, it does equivalent operations to your previous code, which means that you can then go to your bosses and be like, "Hey, look, it does the same and it does it better and we're probably going to have fewer data breaches."

Longer term, there are also options like transpiling into a verifiable language like F* or something. Longer term, our goal is to create a formal semantics for the language. Can we formalize the memory model and allow direct verification of Rust programs? This is kind of like my pie in the sky. I would love that; that is a world I would like to live in. Now, unfortunately, verification is not a trivial task and automatic verification is not always possible. In order for the proof of correctness to be useful, we need the specification first. And while we may eliminate some implementation problems, we also have to rely on a trusted computing base. We have to rely that these guarantees that I've stated actually do hold. Without a formal semantics, we don't actually have that proven yet completely. If we transpile, we have to trust that transformation. So we still have this chain of trust in our Rust code bases, but it is much further along, in my opinion, than other languages.

Unsafe Code

Those of you in the audience who are familiar with Rust, you're probably wondering how I have spoken for almost 50 minutes and not mentioned unsafe code. In fact, I very purposely ignored it on my concurrency slide. And that was on purpose. Sometimes, Rust's static guarantees are a bit too restrictive, so we have a built-in escape hatch called unsafe. And what this allows you to do is violate some of the rules. Not all of them, it's not a wild lawless land. What it is, is it's the stamp that indicates that you, the programmer, have verified that Rust's guarantees will hold and you then tell the compiler this and it trusts you because you are its friend.
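A small sketch of what that stamp looks like in practice. `first_or_zero` is a hypothetical function: the `unsafe` block skips the usual bounds check, and the comment records why the programmer believes the guarantee still holds.

```rust
/// Returns the first element, or 0 for an empty slice.
fn first_or_zero(values: &[u32]) -> u32 {
    if values.is_empty() {
        return 0;
    }

    // SAFETY: the slice was just checked to be non-empty, so index 0 is in
    // bounds. The compiler does not verify this claim; the programmer does.
    unsafe { *values.get_unchecked(0) }
}

fn main() {
    assert_eq!(first_or_zero(&[7, 8, 9]), 7);
    assert_eq!(first_or_zero(&[]), 0);
    println!("safe wrapper around an unsafe access");
}
```

Callers only ever see the safe function, which keeps the code that needs manual review confined to the few lines inside the `unsafe` block.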

Conclusions

And we do not have time for others. So at the beginning of this hour, I said that you were going to walk out of here with an idea of when to consider integrating Rust code into your work. This is what you can do in unsafe. Rust excels at creating performant parallel code that's free of memory vulnerabilities. It integrates easily into existing code bases. Like other languages, side channel attacks and implementation mistakes can still lead to security issues. There's ongoing work to try to mitigate these in the formal verification and secure code working groups. And if you're interested, please do participate.

When I was first learning how to program, we started out doing the pair programming thing with one person driving and writing the code and the other person reviewing it as it's written. I hated that. I absolutely hated pair programming. But with the borrow checker, I now have a partner, like a pair programming partner. I mean, sometimes I hate the borrow checker, but it's a conversation. Programming is now a conversation between me and my compiler. Unfortunately, the compiler always seems to be right, but the end result here for me is I am a better programmer because of Rust and that's Batman. The other cat earlier was Watson and that's it. Any questions?

Questions and Answers

Participant 1: Regarding the formal methods. You have the C++ equivalence checker. Does it mean every bug in the C++ has to be implemented in Rust for the equivalence to hold up?

Hosfelt: What I mean by, like, an equivalence checker is there's a company, Galois, it's in Eugene, Oregon (Somebody is smiling. Is somebody here from them or knows them?) and basically what they can do is, at the LLVM level, they can perform formal methods magic. My interest, personally, I think it's fascinating, but I do not have the time to become a formal methods expert. My goal is, how can we allow existing tools and verification things to integrate with Rust? So I don't think I'm the right person to answer your question, but I can try to find out who is.

Participant 2: Would you recommend any particular reading materials to get started with Rust or tutorials?

Hosfelt: Yes, definitely. "Rust by Example" and "The Rust Book." Highly recommend. They are incredibly helpful. And the Rust site has a bunch of them, but that's what I use when I forget things. It's just "The Rust Book" and "Rust by Example."

Moderator: The one thing I will share, since we are all in the room: the author of "The Rust Book" is in the back over there. So if you want to go bother him ...

Hosfelt: So you can just ask him everything.

Recorded at:

Mar 06, 2019

Diane Hosfelt
