Systems Programmers Can Have Nice Things


Summary

Ryan Levick explores some features that Rust brings to the systems programming world that make life as a system programmer easier.

Bio

Ryan Levick is a principal developer advocate at Microsoft where he champions the adoption of the Rust programming language as a safe alternative to C and C++. Ryan is an active member of the Rust community as a member of many Rust project teams. He is also a Rust educator.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.

Transcript

Levick: We're going to be talking about the fact that systems programmers can have nice things. I'm Ryan Levick. I'm a cloud developer advocate at Microsoft. My day in, day out job is all about bringing Microsoft and the Rust community together to make our lives as systems programmers at Microsoft nicer.

What Is Systems Programming?

We have to come to terms with the term systems programming. What do I mean by that? I'm going to try and take a relatively conservative approach to the term systems programming, and try to define it in a way that is probably most at home with C or C++ programmers. For me, systems programming necessarily means programming without garbage collection, or at least without garbage collection by default. Really, at the end of the day, what this means is that you're working in systems that are resource constrained, or at least you're trying to minimize the amount of resources that are used. This is because you need some high level of control. You're working in a system where you just can't afford to have a lot of resources being used by your software, probably because you're running the software underneath other software. The more resources your system takes up, the less there is for others to use.

What Systems Programming Isn't

Now we've come to some terms with what systems programming might be all about. Let's talk a little bit about what systems programming isn't. For me, systems programming doesn't necessarily mean that you're programming assembly with some syntactic sugar on top. This is a tongue in cheek way of poking at C a little bit. I think this is one of the really great things about the C programming language: a lot of people like to say that C is assembly with a high level syntax on top, and you can really see the assembly when you're writing in C. This isn't inherent to systems programming. It also doesn't necessarily mean that when you're doing systems programming, you're doing manual memory management, that you have to manage the memory yourself by writing a bunch of mallocs and frees. This isn't completely necessary. We'll talk a little bit about a language that allows you to not have to do this without going and using garbage collection.

Of course, also, this doesn't mean that you're tying yourself into the C compilation model, where you have one file means one translation unit, and you translate those into some object file, and then you link them together. This isn't necessary or inherent to systems programming. This is just a way, and not necessarily a bad way, but a way that C and C++ have chosen to model compilation. It's not something that we absolutely necessarily need to have when doing systems programming. There are advantages and disadvantages, just like a lot of things with systems programming.

Systems Programming Isn't Hard

We've seen what systems programming isn't, along with what systems programming is. Let's dive a little bit closer into one thing that I think is sometimes portrayed about systems programming but isn't necessarily true: that systems programming is hard. You might be thinking, I'm a systems programmer, and what I do is work on something that is extremely hard. That may very well be true. Some things that are involved in systems programming are indeed difficult. Those things aren't necessarily inherent to systems programming itself. Just because your domain happens to be particularly difficult doesn't mean that systems programming all up has to be difficult. I'd like to posit that systems programming isn't hard. It can, in fact, be quite easy. After all, sometimes we're just fiddling bits, and there's not too much to that. Sometimes there are some things in systems programming, when you don't have the ability to use a garbage collector or some nice runtime along with it, that do make it a little bit hard.

Let me back that up a little bit and say that systems programming isn't as hard as many think it is. This naturally leads to another question. That question is, can we make our lives as systems programmers easier than they are right now? Can life be better for us as systems programmers? I'm going to wager that the Rust programming language is a peek into one possibility of how our lives as systems programmers can be easier. It's not to say it is the end all be all, the answer to all of our problems. It's certainly a view into a world where our lives as systems programmers have been made easier by a programming language.

How Rust Eliminates Complexity of Systems Programming

How does Rust do this? Rust eliminates a lot of the incidental complexity of systems programming, meaning the things that we talked about that aren't inherent to systems programming, Rust doesn't necessarily require us to focus on those things. A lot of things that we've become conditioned to think because they are part of programming with C or C++, we are conditioned to think that they are inherent to systems programming, Rust as a language can show us that that's not necessarily true. Rust eliminates a lot of the incidental complexity of systems programming, allowing us to focus more on the inherent complexity of systems programming, of which there definitely is some.

The Foundation of Rust - Memory Safety

How does Rust do this? What about Rust allows it to eliminate some of this incidental complexity of systems programming? The first thing that we're going to be talking about is the foundation of Rust, the real core of it, and that is memory safety. Rust is a memory safe language. It's extremely important to know that the foundation we're dealing with, with Rust is the foundation of memory safety. That as systems programmers, we don't have to worry anymore when using the safe subset of Rust, about making sure that our usage of memory is correct from the viewpoint of memory safety properties. Along with this, we get some really great properties with it as well, including data race safety, because Rust has shown us that in order to achieve memory safety, you need to be data race free. These are hand in hand and inherent to one another. We don't have to worry about this as well.
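To make the data race point concrete, here is a minimal sketch (not code from the talk) using only the standard library. The function name `parallel_count` is an illustrative assumption. Several threads update one counter, and the program only compiles because the counter is wrapped in `Arc<Mutex<_>>`; handing the threads a plain `&mut u64` or an `Rc` would be rejected at compile time.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each add 1 to a shared counter
/// `per_worker` times, then returns the final count.
/// (`parallel_count` is a hypothetical name used for illustration.)
fn parallel_count(workers: usize, per_worker: u64) -> u64 {
    // Arc gives shared ownership across threads; Mutex guards the mutation.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    // The lock is the only way to reach the inner value.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

Calling `parallel_count(4, 1000)` always yields 4000, because every increment happens under the lock.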

Zero Cost Abstractions

Now that we have the foundation down, let's take a look at some features on top of that, some aspects of Rust that are built on top of that. The first one is an old friend, if you're a C++ developer, the idea of zero cost abstractions. What are zero cost abstractions? As Bjarne Stroustrup, the creator of C++ put it, what you don't use you don't pay for, and further, what you do use, you couldn't hand code any better. This, just like in C++ is true in Rust. Rust allows you to use certain abstractions, where you don't really pay any additional cost to using those abstractions over what you would pay if you had written the underlying abstraction out manually instead. What are some of Rust's zero cost abstractions? We'll take a look at one in detail, but I wanted to list some of them out. The first one that I love talking about is iterators. The ability to iterate over collections of items in a very rich and functional programming like style that allows you to map, and transform your collections in a very fluent way.
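As a small illustration of the iterator style described above (a sketch, not code from the talk), the two hypothetical functions below compute the same result. One uses an iterator chain and the other a hand-written loop. The zero cost claim is that an optimizing build is expected to compile the chain to code comparable to the loop, though the exact output depends on the compiler version and flags.

```rust
/// Sum of the squares of the even numbers, written as an iterator chain.
fn sum_even_squares_iter(values: &[u32]) -> u32 {
    values
        .iter()
        .copied()                 // turn &u32 items into u32
        .filter(|v| v % 2 == 0)   // keep the even numbers
        .map(|v| v * v)           // square each one
        .sum()
}

/// The same computation written out manually.
fn sum_even_squares_loop(values: &[u32]) -> u32 {
    let mut total = 0;
    for &v in values {
        if v % 2 == 0 {
            total += v * v;
        }
    }
    total
}
```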

The second one is zero-sized types. It's a fantastic feature of Rust that allows you to encode types that have zero runtime size, meaning that when we actually run our program, those types no longer exist. They fall away at compilation time. The really great thing about this is that you can encode a whole myriad of things in these zero-sized types that allow you to provide rich abstractions, that once you go to compile the program, these abstractions completely fall away, and you're left with the underlying thing. You still have some compile time safety built on top of that. A really fun one, along with this is type niches, which is a way of providing higher level types that wrap other types. At the end of the day, the underlying type is still what you see at runtime. We'll see an example of this. The last one that I love mentioning is new types. These are often called wrapper types, which are types that wrap other types and provide additional guarantees around it. Again, just in the same way that type niches fall away at runtime, new types oftentimes also fall away. The theme here is these compile time constructs that allow us to work in a rich, abstract world when we're writing programs that all just completely disappear and have zero runtime trace, when we're actually running our programs.
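The sketch below shows zero-sized types and a wrapper type in a few lines. All names (`Meters`, `Feet`, `Length`, `UserId`) are hypothetical and chosen for illustration. The unit markers carry no data, so they exist only for the type checker, and in practice the wrappers occupy the same space as the value they wrap.

```rust
use std::marker::PhantomData;

/// Zero-sized marker types: they carry no data at runtime.
struct Meters;
struct Feet;

/// A length tagged with its unit at compile time only.
struct Length<Unit> {
    value: f64,
    _unit: PhantomData<Unit>, // zero-sized, exists only for the type checker
}

impl<Unit> Length<Unit> {
    fn new(value: f64) -> Self {
        Length { value, _unit: PhantomData }
    }

    /// Only lengths with the same unit can be added; mixing
    /// `Length<Meters>` and `Length<Feet>` does not compile.
    fn add(self, other: Length<Unit>) -> Length<Unit> {
        Length::new(self.value + other.value)
    }
}

/// A wrapper type: a distinct type around a plain `u64`.
struct UserId(u64);
```

With these definitions, `std::mem::size_of::<Meters>()` is 0 and `std::mem::size_of::<Length<Meters>>()` equals the size of an `f64`.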

Example

Let's take a look at an example. Here's some code that I've written. It is two functions. I'm running this inside of Compiler Explorer from Matt Godbolt, who provides a great tool here for looking at the compiled assembly for code. These two functions are exactly equivalent to each other in terms of what they do. If you look at the assembly on the right side of the screen, you'll be able to see that indeed, the assembly is exactly the same. What differs between these two is the compile time guarantees that they provide. The first function, at the top, called raw, takes in a raw pointer to a byte and returns back a byte. Of course, it needs to first check, is that raw pointer null or not? If it is, then it will return zero. Otherwise, we can dereference that pointer. We don't get any of Rust's usual safety guarantees here. We don't know if that raw pointer is actually valid or anything. At least we're doing the one thing, and we're remembering to check whether the pointer is null, and if it is, returning zero.

The second function here is called refr. It's exactly the same thing, except instead of taking a raw pointer, it takes an optional reference to a byte. What's great about this is that the optional reference to a byte is at runtime exactly the same thing as a raw pointer. Option is a great example of type niches: when we wrap our reference in an Option, the compiler knows that the None variant of the Option can be represented as the null pointer. References cannot be null, so it folds that down, and you don't pay the price of the tagged union that you might have in a naïve version of Option. The great thing about Option is that we're required to deal with it. When we're looking at our code here, we can never forget to check whether the reference is there or not. We are required at compile time to do that, unlike with our raw pointer. At the end of the day, we end up with the same exact assembly code.
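The slide code is not part of the transcript, so the following is a reconstruction under assumptions rather than the speaker's exact code. The names `raw` and `refr` come from the talk; the signatures and bodies are a best guess, and `raw` is marked `unsafe` here to make the caller's obligation explicit.

```rust
/// Reads a byte through a raw pointer, returning 0 for a null pointer.
///
/// # Safety
/// A non-null `ptr` must point to a valid, readable `u8`.
pub unsafe fn raw(ptr: *const u8) -> u8 {
    if ptr.is_null() {
        0
    } else {
        // Nothing checks that the pointer is valid; that is on the caller.
        unsafe { *ptr }
    }
}

/// Same behavior, but the type forces the caller and this function to
/// handle the "no value" case, and the reference is always valid.
pub fn refr(byte: Option<&u8>) -> u8 {
    match byte {
        Some(value) => *value,
        None => 0,
    }
}
```

Because `None` uses the null niche of the reference, `Option<&u8>` has the same size as `*const u8`, which is why the two functions can compile to the same instructions.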

The Cost of Zero Cost Abstractions

We should mention that there are costs to zero cost abstractions, and in particular, compile times are the big one that often gets brought up. This is true. It's not really a barrier to smaller projects, but if you start writing lots of Rust code, you will notice compile times becoming an issue. There's a great talk by Chandler Carruth, here at CppCon 2019, talking about the fact that there are no zero cost abstractions. This is a really great talk, especially from the C++ perspective, to see what costs he talks about for zero cost abstractions that you do pay in C++, and how some of them you pay in Rust as well, while others maybe you don't. Really, it's interesting, because Rust lacks the baggage of long term features that had to be bolted on to C++ in order to provide some of these things. Because Rust was able to build from first principles, it's able to avoid some of the costs of these zero cost abstractions. Really, the question going forward is, can Rust keep this up in the future? We're six years in as of this talk, and the answer so far has been yes. We'll see if it continues to do that.

Minimal Cost Abstractions

We've talked about zero cost abstractions. These are not at all unique to Rust, but I want to talk about some things that I believe are a little bit more unique to Rust. The first one is minimal cost abstractions. This is an idea in Rust where we try to make sure that if an abstraction does cost something, we make that cost somewhat obvious. This awareness that we bring to the user of the cost of an abstraction is not a huge amount of boilerplate or something like that. It's simply a tip of the hat that says, ok, I understand that this abstraction is not the low cost default, but rather, I'm opting into the slightly higher cost abstraction here. One example is explicit cloning. Unlike C++, which has copy semantics by default, where values are usually copied, Rust moves things by default. Interestingly, C++ has a whole bunch of optimizations that it does to elide certain copies, because if it copied all the time, then things would get very expensive. It's interesting that this seemingly benign choice of having copy semantics has really far-reaching implications for how easy it is to learn the language and how easy it is to control the performance of the language. Rust, I think, does a really great job of just starting with the low cost default, and then giving a tip of the hat when we need to opt into a cost.
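A minimal sketch of move-by-default with an explicit, visible copy. The function names `total_len` and `measure_twice` are hypothetical and not from the talk.

```rust
/// Takes ownership of the vector and returns the combined length of its strings.
fn total_len(words: Vec<String>) -> usize {
    words.iter().map(|w| w.len()).sum()
}

/// Measures the same data twice to contrast a copy with a move.
fn measure_twice(words: Vec<String>) -> (usize, usize) {
    // Explicit deep copy: the `.clone()` call is the visible "tip of the hat".
    let first = total_len(words.clone());

    // Plain move: ownership is transferred and nothing is copied.
    let second = total_len(words);

    // Using `words` after this point would be a compile error (value moved).
    (first, second)
}
```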

Of course, the other example is the dyn keyword, which is the keyword that indicates that something is dynamic dispatch versus static dispatch. We'll take a look at what this is with this example here. Here, I have an example running in the Rust playground, which is a simple function called perform. It takes some HTTP client wrapped in a reference counted smart pointer. Really, the first thing that you need to know is when you look at it, you can tell from the dyn keyword, that this client will be using dynamic dispatch when we call methods on it. That is a little tip of the hat letting us know when you call a method on here, it's going to be dynamic dispatch, as opposed to the lower cost default of static dispatch.

The next thing that you see here is the explicit call to clone. This is because on every iteration that we do here, we have to copy our client and we have to copy our request. In Rust, we need to explicitly call clone, so that again we're giving a little bit of a tip of the hat and saying that we know that we will pay the cost of actually doing the copy here. Because presumably copying the request and copying the client are non-trivial costs that we'll have to pay. This is where Rust really does a great job of giving a tip of the hat and saying, "This doesn't cost. This isn't free. This is nonzero," so we will at least acknowledge that fact.
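The playground code is not included in the transcript, so this is a sketch under assumptions. Only the function name `perform`, the reference counted pointer, the `dyn` keyword and the explicit clones come from the talk. The `HttpClient` trait, `Request` struct, `FakeClient` type and the `attempts` parameter are invented here for illustration.

```rust
use std::rc::Rc;

/// Hypothetical request type; cloning it copies the URL string.
#[derive(Clone)]
struct Request {
    url: String,
}

/// Hypothetical client interface, returning an HTTP-like status code.
trait HttpClient {
    fn send(&self, request: Request) -> u16;
}

/// Stand-in implementation so the sketch does not need a network.
struct FakeClient;

impl HttpClient for FakeClient {
    fn send(&self, request: Request) -> u16 {
        if request.url.is_empty() { 400 } else { 200 }
    }
}

/// `dyn` marks that calls on `client` go through dynamic dispatch.
fn perform(client: Rc<dyn HttpClient>, request: Request, attempts: usize) -> Vec<u16> {
    let mut statuses = Vec::new();
    for _ in 0..attempts {
        // Both copies are spelled out, so their cost is visible in the source.
        let client = Rc::clone(&client); // bumps the reference count
        let request = request.clone();   // copies the request data
        statuses.push(client.send(request));
    }
    statuses
}
```

In this sketch, cloning the `Rc` only increments a reference count, while cloning the `Request` copies its string. Both are written explicitly either way.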

Negative Cost Abstractions

The next thing I want to talk about beyond these minimal cost abstractions is actually negative cost abstractions. These are abstractions where you provide enough information to the optimizer to make code even faster than you could possibly write yourself. This is not something where, ok, if I wrote the code myself, I would be able to make code that's just as fast. This is where the abstraction allows you to write code that's possibly even faster than code you could write yourself. A great example of this is borrowing. We've talked about borrowing before. The one thing that makes it a real negative cost abstraction here is the fact that without the Rust borrow checker built in, some things that are trivial to write in Rust would be very hard to write in a correct way in C++, without having something checking you at compile time. Because we have these very strict semantics of borrowing, we're able to write this in a way where the optimizer can peek right through and say, "I know a whole bunch of things are not possible here, and I'm going to just optimize everything out of this." In another language, because the optimizer doesn't have that high level information, it's not able to optimize quite as aggressively. This is something that Rust provides that other systems languages maybe don't get quite as close to providing.
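One way to picture this is the following sketch, with a hypothetical function `add_twice` that is not from the talk. The borrow rules guarantee that `target` and `amount` cannot refer to the same integer, so the compiler is allowed to read `amount` once and reuse it. Whether a given build actually does so depends on the compiler version and optimization level.

```rust
/// Adds `amount` to `target` twice.
///
/// A `&mut i32` is an exclusive borrow, so `amount` can never alias
/// `target`. The optimizer may therefore assume the first write does
/// not change `*amount`.
fn add_twice(target: &mut i32, amount: &i32) {
    *target += *amount;
    *target += *amount;
}
```

With two plain C pointers and no extra annotations, the equivalent function has to allow for both pointers referring to the same integer.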

Rich Abstraction

Beyond this is rich abstraction, where we provide simple yet powerful tools for abstraction that the user can hook into. Examples of this are enums (algebraic data types) and derive macros, which provide a way for users to hook into even more powerful abstractions. We'll take a look at an example of this. Here's an example of serde, the serialize/deserialize library. Just with this one line right here, we are telling serde that we want to derive how we will serialize and deserialize this Point struct. This is straight out of the example from serde. Down below you can see that with that, we've generated enough code to be able to serialize and deserialize this Point struct to and from JSON. All we had to do was write one simple line and all the defaults fall into place. This is a really great example of something that is very simple, very easy to understand, and allows you to have rich abstractions provided to you.
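The serde example needs an external crate, so the sketch below sticks to the standard library to show the same two ideas: an enum whose variants carry data, and a one-line `derive` that generates trait implementations. The `Shape` type and `area` function are hypothetical. serde's `Serialize` and `Deserialize` derives plug into the same `#[derive(...)]` mechanism.

```rust
/// One line of `derive` generates Debug, Clone and PartialEq implementations.
#[derive(Debug, Clone, PartialEq)]
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
}

/// `match` must cover every variant, or the code does not compile.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
    }
}
```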

Good Error Messages

That wraps up the abstractions that we have. Really now we're getting into the things that tie everything together and make the whole thing work. The first one here is good error messages. Here's a great tweet from Julia Evans, saying this is another great Rust compiler error message. Julia here is trying to compare two numbers, but she has to convert one number to another type. Because of quirks of the parser, or whatever, it's interpreting this as potentially a generic argument. The Rust compiler does a really great job of saying this is exactly what I'm doing, I'm interpreting this as a generic argument. I'm not interpreting it as a comparison. If you want to interpret it as a comparison, then go ahead and write your code this way instead. Of course, IDEs can then easily take advantage of that and provide automatic fixes for this as well. This is crucial when you have a lot of compile time checks: you need to make sure that those compile time checks come with really great error messages, so that they are obvious and easy to understand, and even machine applicable, so that users don't even have to type code to fix their errors.
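The tweet itself is not reproduced in the transcript, so the following is only a sketch of the kind of code that triggers this diagnostic. The function name `is_below` is hypothetical. The comment shows the form the compiler rejects, and the body shows the parenthesized form the compiler suggests.

```rust
/// Returns true when `index`, widened to `usize`, is below `limit`.
fn is_below(index: u32, limit: usize) -> bool {
    // Writing `index as usize < limit` does not compile: after `as usize`,
    // the parser reads `<` as the start of generic arguments for `usize`,
    // and the compiler suggests parenthesizing the cast instead.
    (index as usize) < limit
}
```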

Tooling

Next, coming with this is the great tooling that you have in Rust. This is really tooling that just works. This of course all comes down to the cargo build tool, which comes with Rust. You want tooling that really just works out of the box. What's really great about this is not that cargo is wonderful and that's it. It really allows for whole new workloads to come from the fact that cargo is built in and it just works. The first thing is that it has this emerging complexity property, so if you're doing something simple, the incantation that you need is also simple. You don't have to spend 20 years mastering a system to be able to compile some simple code. By default, everything is nice and easy. As you need to have more complexity in your workloads, cargo allows for that iteratively on top. The second thing, and this is the extremely important one, is that cargo provides a convention on which all Rust projects run. There are certain things that you can just assume about a Rust project because you have this convention that comes from the use of cargo.

Let's take a look at an example. Here I am in a project that I want to build called person. I know even if this is not my project, how to build it, cargo build --release, good to go. It will just work. If I want to add a dependency into this, I can go ahead and add nom here, which is a great parser combinator library. It is one line, and I know exactly that I'm up to date here. How do I go ahead and build my dependencies? I just run cargo build --release again, and it downloads the dependencies. Because all of my dependencies follow this convention of using cargo, I can go ahead and easily download them and compile them. It all just works out from there. This is really great even in systems where you're not using cargo as your main build tool, because you're still following these conventions in your subdirectories, which use Rust, and you can still benefit from a lot of the built-in tooling that comes with cargo. Of course, this extends to tests as well. Running tests is always cargo test, and you're good to go.
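As a sketch of what `cargo test` picks up by convention, here is a library function with a unit test beside it. The `greeting` function and the module name `greeting_tests` are hypothetical; many projects simply name this module `tests`.

```rust
/// Builds a greeting for `name`.
pub fn greeting(name: &str) -> String {
    format!("Hello, {name}!")
}

// Compiled only when running `cargo test`, so it adds nothing to release builds.
#[cfg(test)]
mod greeting_tests {
    use super::*;

    #[test]
    fn greets_by_name() {
        assert_eq!(greeting("Ferris"), "Hello, Ferris!");
    }
}
```

No test runner or registration step is needed: functions marked `#[test]` are discovered automatically.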

Documentation

Extending from that is documentation, where you have a built-in tool, cargo doc, and it just works. Here's the standard library, documented. It's all in a form that's exactly like other projects. Here's serde, the project we used for serialization and deserialization before. It looks just like the standard library documentation, because it all follows the same form. There's no need to get used to different styles of documentation. It's all going to be in the same form as the documentation for other projects.
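A minimal sketch of the source side of this: `///` comments are what `cargo doc` renders into those uniform pages. The `midpoint` function is a hypothetical example, and `# Panics` is one of the conventional section headings.

```rust
/// Returns the midpoint between `a` and `b`.
///
/// The first line becomes the summary shown in module listings, and the
/// remaining paragraphs appear on the item's own page.
///
/// # Panics
///
/// This function does not panic.
pub fn midpoint(a: f64, b: f64) -> f64 {
    (a + b) / 2.0
}
```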

Community

The last thing that I want to talk about is community. I wanted to end on this because this is the most important, in my opinion. A common barrier to systems programmers and especially newcomers to systems programming is this idea of RTFM. That if you want to learn how a system works, get into a code base, better your skills as a systems programmer, all you need to do is read the manual, and however long that takes, you'll crash through it. Eventually, you'll get and understand how to do things, or not. You'll leave and who cares, good riddance. This is, I think, a terrible way to treat a programming language community.

Really, this comes from a place of, I've suffered, because learning systems programming traditionally has been very hard to do, so you must now suffer too. This is something that I just think is unacceptable. In many ways, the Rust community has just stricken this out and said, no, we do not want to act in this way. We want to act in a way where we're welcoming. We allow people to come and ask their questions, and learn to be systems programmers in a safe environment in the community, with a safe language that allows you to know when you've made a mistake, and provides great error messages so you know how to fix it. What this really means is you can have a productive and highly technical community that is also welcoming and fun. This is extremely important.

Conclusion

Rust is not the end all be all, but it does show we can do better than we have done in the past. I want this to be a new minimum of what systems programmers should expect. If you're not interested, that's fine, but at least allow us to take Rust as a new minimum of what we as systems programmers expect from our programming language, tools, and community.

Questions and Answers

Shamrell-Harrington: I remember when I was coming up through university and other things, there was very much that attitude of I suffered horribly when I learned this, so you must too. One of my favorite things about the Rust community is that that's very much not the case.

Levick: There's often this feeling I get when reading documentation geared toward systems programmers: a desire to make it technically correct but as difficult to understand as possible. You can write things with the same technically correct information, but in a way that is more easily digestible and more friendly to people, so that they don't have to work so hard to understand it. Why do we make people suffer more than they need to in order to be systems programmers? I just don't understand.

Shamrell-Harrington: You showed us a few examples of tools, including cargo and rustdoc. What other tools do you recommend for someone who's both new to Rust programming and new to systems programming?

Levick: There's a whole myriad of them. There are too many to really enumerate fully. The great thing about the Rust tooling is that knowing you can rely on cargo as your tool means that you can extend cargo through tools that you create, using a simple convention: if there's a binary on your system named cargo-something, then cargo knows that you can run cargo something, and it will automatically call that binary on your system. This opens up a whole myriad of different things. The canonical example of that is Clippy, which is a linting tool, which is really great and can provide, for a learner, a really great way of knowing, am I doing things in an idiomatic way, or am I doing things that are technically correct but that Rust programmers wouldn't do? That's a really great one.
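A sketch of the plugin side of that convention, assuming a hypothetical binary named `cargo-hello`. When it is run as `cargo hello --loud`, cargo starts the binary and passes the subcommand name (`hello`) as the first argument, so a plugin usually skips it. The helper name `plugin_args` is invented for illustration.

```rust
/// Returns the arguments meant for the plugin itself.
///
/// `argv` is the full argument list including the binary path, and
/// `name` is the subcommand name (for `cargo-hello`, that is "hello").
fn plugin_args<I>(argv: I, name: &str) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    // Skip the binary path, then drop the subcommand name if cargo passed it.
    let mut args = argv.into_iter().skip(1).peekable();
    if args.peek().map(String::as_str) == Some(name) {
        args.next();
    }
    args.collect()
}
```

In the plugin's `main`, this would be called as `plugin_args(std::env::args(), "hello")`, which also keeps the binary usable when it is invoked directly as `cargo-hello --loud`.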

There's a whole myriad of ones that you might want to have, depending on what background you are coming from. A great example of that is cargo-edit, which allows you to easily edit your Cargo.toml manifest file. This is something that JavaScript programmers, for instance, just take for granted, because npm works that way. Cargo doesn't work that way by default, but it's as simple as cargo install cargo-edit, and then you have that functionality. That's just one example of a million, it seems like, where if the tool doesn't work exactly how you'd like it to, because the conventions are so strong, it's very easy to plug into that and get the functionality that you want.

Shamrell-Harrington: In what areas do you see Rust replacing C++?

Levick: I think the most obvious one is in security critical software. Anything where if there is a security bug due to memory safety, it would be a horrible thing. This is why at Microsoft, Rust is quite interesting because we run a public cloud. If you allow for things like the guest getting access to a host through basically breaking down the VM isolation that you have running on one of Azure's computers, that's quite dangerous. That can lead to a lot of damage. On the other hand, for game programming, there's a little bit less of a risk there.

If there's some bug inside of your game, you're only at one time potentially affecting one customer. The stakes are a little bit smaller. When it comes to adoption of programming languages, it's often not good enough to have incremental tools or incremental improvements in your tools, you need something that's fundamentally better. I think the safety properties of Rust are the thing that's fundamentally better about Rust than the status quo.

My talk was a lot about the nice things, and how much fun and how good it makes you feel to write Rust code. That's all well and good for us as individual programmers. For the industry, the important thing is fundamentally changing the way that we do systems programming and addressing the idea of unsafe systems programming. If we can do that, then we're actually going to have real fundamental change in the industry. That's a long-winded way of saying, I think that that's where Rust will be dominant going forward. At the end of the day, my hope is that it will reach into many other places as well.

Shamrell-Harrington: Do you use Clippy?

Levick: I have used Clippy before. I can't say that I'm a hardcore Clippy user or anything like that. Definitely Clippy is something that I run in CI often just to make sure that I'm not doing something that others would think is not the right thing to do. I know that there are people that swear by Clippy. I don't know if I'm one of those. Definitely Clippy is a good tool to have in your toolbox, for sure.

Shamrell-Harrington: Where do you see systems programming going next? Rust is the start of making it more humane in some way, where do you see it going?

Levick: I see a little bit of a flattening out of systems programming and application programming into not being such separate domains anymore. I think traditionally, the fact that systems programming was this guarded tower that only special people could participate in meant that systems programmers were systems programmers and application programmers were application programmers, and they shall never mix. That if you program apps, you should never be a systems programmer, and vice versa. My hope is that that just completely breaks down in the future. If you have a fundamentally safe programming language, then that means that application programmers can very easily jump down into the systems realm when they need it, when they need performance, or control, or whatever they need.

Way back in the day, back in my previous life, I was a Rails developer. It was always this mystical thing when you heard about this team going down into C to write some piece of their code base in C because they weren't getting the performance they needed out of Ruby. That shouldn't be mystical. That should be trivial. That should be something very easy for things to do. Maybe you write Ruby, because you feel like it's faster to get things done. Your team feels very agile in the language. You shouldn't fear going down into your systems programming language, you should be able to do it when the time is right and not have to worry that you've opened up your entire company to security vulnerabilities, or a whole new class of bugs in terms of segfaults and things like that. We just don't need, as application developers, to worry about that. I really see the realm of systems programming breaking down and opening up to a whole other type of programmer, namely the application developer.

Shamrell-Harrington: There's a quote from Ashley Williams, who's a Rust core team member and the Rust Foundation Interim Executive Director. It was that Rust's product is not a programming language or a compiler; Rust's product is the experience of being a Rust developer.

Levick: Yes, absolutely. Really, the language is one tool. There are many aspects to that, some technical and some social. Even the lone programmers in their basements still have to present that code to the world. Coding is a social thing, and our tools need to be able to function in that way, and function for whoever wants to participate.

Recorded at:

Apr 15, 2022

Ryan Levick
