System Level Programming Languages Panel

Summary

The panelists discuss the operating system they are building on Rust, and where they'd like to see both Oxide and the Rust language go in the next five years.

Bio

Bryan Cantrill is Co-Creator DTrace, Co-Founder Fishworks Sun Microsystems & Co-founder and CTO @oxidecomputer. Laura Abbott is Engineer @oxidecomputer. Cliff Biffle is Engineer @oxidecomputer.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.

Transcript

Shamrell-Harrington: Welcome to a panel on Oxide Computing, which is a company that is building on Rust. I'm a principal engineer at Microsoft. I'm also a director on the board of the Rust Foundation, and I lead editing on This Week in Rust.

Cantrill: I'm Bryan Cantrill, co-founder and CTO of Oxide Computer Company.

Abbott: I'm Laura. I am an engineer at Oxide Computer. A lot of my background has been with the Linux kernel. I've been at Oxide since January 2020. I'm very excited to be able to talk about my experience with Rust, and everything we're doing at Oxide.

Biffle: I'm Cliff Biffle. I'm an engineer at Oxide. I've traded 20 years of doing silly things with C for now doing slightly less silly things with Rust.

Why Oxide Computing Was Founded, and Why Rust

Shamrell-Harrington: Can you tell us, what inspired the founding of Oxide Computing? Then, what inspired you to build so much on Rust?

Cantrill: Yes, like many companies, the foundation of Oxide was born out of frustration that the thing that we wanted in the world wasn't being built. It felt very frustrating to Steve, Jessie, to me that the state of the art of the actual computer had not advanced that much from the perspective of those that needed to run their own compute in the data center. Indeed, those folks were being told that they didn't exist, that everything was either going to the public cloud, or was going away. What we knew from our own experiences running a private cloud is that the folks were very much running on premises. Those folks felt pretty ignored by the market and the products really weren't very good.

This is going to sound ridiculously trite. I knew we wanted to go do something together. We were beginning to think about that old question, I don't know if it is misattributed to Eleanor Roosevelt, I've always attributed it to Eleanor Roosevelt: what would you do if you weren't afraid of failing? The thing that we would do if we weren't afraid of failing was start a computer company actually. That was the germ of the idea. As we talked to other technologists, we learned that a lot of other people saw the same problem that we saw. At the same time, I was beginning to fall in love with Rust. The three of us come from slightly different backgrounds. Laura and I actually are most similar in that we are historical C programmers. Cliff is coming from the perspective of C++ historically. I was discovering Rust out of something of desperation. It was like Rust or bust for me. I didn't really know what was going to happen.

I had experimented in other languages and those weren't going to be a fit for the software that I wanted to write. If it was not going to be Rust, I was going to have to find a way to componentize C better. It was like question mark, if Rust didn't work out. As I got into Rust, it was really exciting actually. I had a blog entry about falling in love with Rust. To the point that when we started the company we wanted to have a tip of the hat in the name. Oxide is very much named because of what we saw happening with Rust.

That said, it was a little bit assertive in that I didn't know Rust all that well. We'd been on a couple of dates and we'd gotten along, and things seemed to be going well. Now we're really talking about building a company around Rust. Going in with a bit of a question like, would Rust be a fit? That's when Laura and Cliff both joined Oxide very early, and discovered pretty early on that Rust is going to be an even better fit than we thought it was going to be in even more places than we thought it was going to be. That was an exciting realization. I think it was really helpful to have Cliff in particular. Laura, I don't know from your perspective, my perspective is we were newer to Rust and Cliff was coming in with a lot of Rust experience, and had been already in certain places, and was really able to show us that this is actually very viable.

Experience with Rust

Shamrell-Harrington: Can you tell us about your experience with Rust as you were going into Oxide and how your understanding may have evolved as a result of using it?

Biffle: I used to do firmware for a balloon internet project about 10 years ago. At the time we started in C and we had moved into C++ because we needed the ability to define some of our own abstractions, which is something C is a little weak at. Our move to C++ was always grudging. We actually were aware of Rust from 2013 or so. It never quite seemed right until 2015, 2016, around the time of 1.0, when it started to look like an actual contender, with a lot of really thoughtful changes that were incredibly valuable for those of us in a bare-metal, hard real-time, heapless programming environment. I had snuck Rust into some previous projects in areas where we needed high performance, high determinism computing. That had been very successful. The interesting thing was teaching Rust to new people and watching how different people's language backgrounds affected it. Honestly, a lot of this got easier at the 2018 edition, which made a lot of improvements to how learnable and understandable the language was. Teaching the Rust language today is so much easier than in 2015 because they've rounded off all of these horrible corners and improved the ergonomics just so much. When I got to Oxide, it was exciting not to have to start with the hard sell.

Using Rust in Prod at Oxide

Shamrell-Harrington: Can you tell us what it was like to come to Oxide, and start using Rust to write production stuff?

Abbott: Oxide was the first time I'd ever written any serious Rust. I'd played around with it, and I'd been aware of Rust for a very long time. I'd also seen it evolve. I think for the most part that the biggest thing for me is just to get comfortable really, to learn how to work with Rust and be able to see exactly what it can do for me to solve the right kinds of problems. If you're starting out, it does look fairly similar to C. A lot of the constructs will be similar. If you've come from any programming language outside of C as well, you also see some very familiar things. I had somewhat of a background in programming languages, with a lot of respect there. I felt like I was coming in with a C background and finding some of my favorite PL features.

It was really nice to be able to pick up and just be able to get more comfortable. Also, it gave me the confidence to be able to write things and not be able to make various mistakes I would be making in C, especially with respect to memory safety, for example. I spent so much of my career in C then tracking down these bugs in C just to know that Rust can eliminate a lot of these types of errors, but with minimal performance overhead. It's been fantastic to be able to work with. There's definitely been a learning curve but I think that comes with any language. I think it's been great to have everyone at Oxide be able to answer questions that's helped me also be able to write more idiomatic Rust as well, to be able to get to know some of the nitty-gritty stuff. I'm really becoming a better Rust programmer at the same time.

Shamrell-Harrington: You are definitely a Rustacean. Once you write your first line of code, you become a Rustacean in my book.

Oxide's Hubris Platform

I understand you're writing an operating system in Rust for the hardware you all are building. Can you tell me more about that?

Biffle: Part of what attracted me to Oxide initially was that their firmware effort at the time I joined was based on Tock. We worked with Tock for several months, and increasingly it became apparent that what we needed was weird, and our application didn't line up super well with what Tock provided, which is a bunch of nice stuff, but not the stuff we wanted. We made the difficult decision last summer to roll our own, and in recognition of how that sounds we called the platform Hubris. Have been building our firmware on top of it since then. It's a fairly simple all Rust, microcontroller protected memory operating system.

Abbott: I think also, it's sometimes pretty rare to be able to write stuff from first principles. We've gotten really a chance to be able to build up and also be able to, I think, build the abstractions, we as lower level software developers really want to see in the world in terms of being able to have stuff that's nice to work with. Being able to build everything up in terms of also being able to figure out, how exactly do we write safe drivers? For example, that take advantage of Rust features to be able to have access to the hardware that let us do what we want to do, but also, again, be able to take advantage of all the great features that Rust provides and everything.

Part of this has also been a lot of work on trying to make sure this is also debuggable, because I think we all have stories about getting stuck debugging one of these things. Because as much as we all talk about Rust and all the joy it provides, it will still not take care of some classes of bugs, of course. We still have to be able to make sure we debug. We've spent a long time. Especially, Bryan has been really pushing to be able to make sure we have a good way to be able to debug Hubris with Humility, as Bryan called his debugging tool, to be able to help to get things going. I think it's been a good experience to hopefully be able to really get things going, and be able to, again, see the change you want to be in the world.

Shamrell-Harrington: I understand there are some principles of operation for Oxide, which are very important. Can you tell us more about those?

Cantrill: Yes, I do think. Just to touch on Hubris, because I just can't resist. I've had so much fun building with these two. It has been such a homecoming for me. I worked for a company called QNX back in the day, a microkernel-based operating system. Cliff won't let me say Hubris is a microkernel. He's calling it an email kernel. It feels like it's more appropriate for Ozymandias and Hubris. One of the things that I particularly love about Hubris, certainly for me, one of the pieces of wisdom that Cliff brought was what things in Rust not to use. For example, Tock is an async heavy system. I think for us as relative newcomers to Rust, it's like, interesting. With Cliff coming in with more wisdom, and more miles on the tires from a Rust perspective, it's like, async, there are times to use it. There are reasons to use it. It also complicates a lot of the system. Hubris has multiple tasks that are broadly synchronous, and we can add asynchrony to that system when and where we need it. It's much more opt-in asynchrony as opposed to forced asynchrony. That's been great.

Cliff was very adamant about using the memory protection unit. Obviously, we're pro memory protection. Cliff, I don't feel you get to have an argument with anybody about the memory protection unit. You might think to yourself, why would you need a memory protection? Why would you need to enable the memory protection unit in a memory safe language? I think one of the things that is an important eye opener about any system is your stack usage is always fundamentally unsafe at some level. Even in an all safe system, if your stack grows without bounds, if you don't hit a memory protection boundary, you will hit somebody else. You will plow into somebody else and you will corrupt your system. Indeed, where you overflow your stack and corrupt someone else is some of the most pernicious corruption to find. Even in Hubris, some of the gnarliest bugs we've had have been from that exactly. Again, not that we weren't converts to begin with, but anyone developing their own Rust based system who thinks that they can turn off the hardware memory protection unit is doing so at their peril.

Oxide's Principles of Operation

On the principles and values, very important to us. Looking to start a company, one of the things, and Laura talks about this too, in terms of being the difference that we want to see in the world, we wanted to be the difference that we wanted to see in the world in terms of the way a company engages its employees. The way a company engages its customers. The way a company engages its community. The way a company engages its broader community. We feel that we can be a model in that regard. I think that part of that is being very upfront about what our principles are. To be clear, principles are unequivocal. Honesty, integrity, decency, those are our principles. Principles can't be taken to a fault. You can't have someone who's like, just too much integrity in this person. They're absolute. Whereas values are things that are tradeoffs. We have got 15 values at Oxide, which feels like a lot, but it feels like it's the coverage that we needed: courage, candor, curiosity, diversity, empathy, humor, optimism, resilience, rigor, responsibility, teamwork, thriftiness, transparency, urgency, and versatility.

What we have found is, one, it has served to attract people to the company. When Laura and Cliff both came into the company, those values were important to them. The three of us share values. I think, almost just as importantly, when we diverge a bit, when we've got a different perspective, we can go back to those values, and help reason about some of those differences. From our perspective, this has been essential for the company. I can't imagine anyone starting any endeavor without being upfront about those founding principles and values.

Oxide's Values

Shamrell-Harrington: I do want to get Laura and Cliff's perspective on the values.

Abbott: I was admittedly a little bit skeptical when I first heard about the values. A lot of companies out there say they have these types of values. I think more than anything, Oxide, I think has really been learning along the way about how exactly to hold itself accountable to those values and be able to go along the way. I really say this is a learning process, just because I think these are things we strive to be for. I think part of along the way is figuring out how to make sure we are going to go up, especially as we have grown as a company. When I joined Oxide, I think there were maybe not even 10 people and now we're up to 30 or so. We certainly have a lot of people. I think part of figuring out what we do along the way is exactly how we have to go. I think more than anything, I think that transparency has really spoken to me about as we've made changes, as we've tried to do everything, I think that has really helped. One of the values that's always struck me is in terms of being able to keep things going, so when we inevitably do make mistakes, we have a way to be able to find a path forward.

Biffle: I feel like one of the things that I've experienced, and I hear this a lot from applicants is that we've all got a corporate PTSD of various kinds, depending on where we've worked previously, or the stories we've heard. A lot of the process in evaluating a company is trying to figure out how they're fronting, how they're lying to you about their public image, and which of these things they espouse on their website that they just made up because they thought they sounded cool. They don't reflect day to day operations. We try to put it really upfront with people that like, no, really, we take this list of things seriously. There's a lot of things on the list, but they all stand alone. It really does produce a different environment both internally, but also when we're trying to bring people in, and we can say "No, really, here's the list. You can read the things we've said about it. We're curious what you have to say about it, and let's talk. Let's actually engage on this." It does produce a different kind of environment. It attracts very thoughtful people, and they're all a joy to work with.

Shamrell-Harrington: That's very Rust, thoughtful, technical excellence.

Features of Rust That Make Writing Code Difficult, at the Lowest Level

What are some of the features of Rust that you've found that make writing code at the lowest level still difficult at times? Examples I've seen, the standard alloc lib's panic on out of memory exceptions, having control whether certain code is pageable or not.

Abbott: I'd like to start with learning how to write unsafe Rust, because I think that's probably the big thing, especially when you're writing low-level code. I think programmers, especially when the focus is on safety, are like, "I can never use unsafe," which is not actually true. The big thing about unsafe is understanding what it actually means: you are doing things that are memory unsafe with respect to the guarantees that Rust can provide. That's probably been the trickiest thing for me, learning exactly how to use unsafe, because things like accessing hardware are somewhat inherently unsafe in some respects. You have to learn how to do that with respect. I think the biggest thing for me has been learning how to write unsafe Rust the Rust way, as opposed to the way a C programmer would write Rust, which is not exactly the safest thing. Also learning exactly how to build these abstractions in such a way that they are narrow.

One thing about writing C is that it doesn't have great boundaries in terms of modules, in terms of saying, this code goes here and this code goes here, whereas Rust I think is a lot tighter in terms of being able to contain things. I think, really being able to say, this code in particular is the stuff that's unsafe, and what else can you derive from there. That's been one of the biggest challenges I found. I think things have certainly gotten better. As far as where Rust needs to go, I think making it somewhat easier and better specified how to actually do some of this hardware access is definitely an area of growth, and one that I think is getting better.
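As an illustration of the kind of narrow unsafe abstraction described here, the following is a minimal sketch and not Oxide's code. The `ControlReg` type, its `ENABLE` bit, and the use of an ordinary variable as a stand-in for hardware are all assumptions made so the example runs on a desktop machine. The only unsafe entry point is the constructor; everything a caller does afterwards is safe.

```rust
use core::ptr::{read_volatile, write_volatile};

/// A hypothetical 32-bit control register. The name and bit layout are
/// invented for illustration and do not describe real hardware.
pub struct ControlReg {
    addr: *mut u32,
}

impl ControlReg {
    /// Hypothetical "enable" bit.
    const ENABLE: u32 = 1 << 0;

    /// # Safety
    /// `addr` must be aligned and valid for volatile reads and writes of a
    /// `u32` for as long as the returned value is used, and nothing else
    /// may access that location in the meantime.
    pub unsafe fn new(addr: *mut u32) -> Self {
        ControlReg { addr }
    }

    /// Safe to call: the obligations were discharged once, in `new`.
    pub fn enable(&mut self) {
        // SAFETY: `new` requires `addr` to be valid for volatile access.
        unsafe {
            let value = read_volatile(self.addr);
            write_volatile(self.addr, value | Self::ENABLE);
        }
    }

    pub fn is_enabled(&self) -> bool {
        // SAFETY: same argument as in `enable`.
        let value = unsafe { read_volatile(self.addr) };
        (value & Self::ENABLE) != 0
    }
}

fn main() {
    // On a host machine an ordinary variable stands in for the register.
    let mut fake_hw: u32 = 0;
    // SAFETY: `fake_hw` outlives `reg` and is not touched directly below.
    let mut reg = unsafe { ControlReg::new(&mut fake_hw) };
    reg.enable();
    println!("enabled: {}", reg.is_enabled());
}
```

The design choice is that a reviewer only has to audit `new` and the two small unsafe blocks, rather than every call site that touches the register.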

Biffle: I think to start with the specific one that the thoughtful question poser raised. The standard library panicking on memory exhaustion isn't really a problem for us, because we target libcore, which is a thoughtfully separated subset of the standard library that doesn't assume the existence of a heap or threads or other platform dependent things. Our bare metal code, and I think most bare metal code targets either libcore, which has no heap and therefore cannot panic on memory exhaustion, or a combination of libcore and liballoc. The facilities in liballoc are a little more flexible in terms of allowing you to specify, particularly if you're using unstable features, memory allocators that can fail.
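For readers who do use a heap, here is a small sketch of fallible allocation using only stable APIs. It uses `Vec::try_reserve_exact` from the standard library (in a `no_std` build the same type lives in `alloc`); the helper name `copy_fallibly` is hypothetical. This is one way to avoid aborting on allocation failure, not a description of what Hubris does, since the panelists say they avoid dynamic allocation entirely.

```rust
use std::collections::TryReserveError;

/// Copy `data` into a freshly allocated buffer, reporting allocation
/// failure as an error instead of aborting the process.
fn copy_fallibly(data: &[u8]) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // Ask for the memory up front; this returns Err rather than aborting.
    buf.try_reserve_exact(data.len())?;
    // Capacity is already reserved, so this does not allocate again.
    buf.extend_from_slice(data);
    Ok(buf)
}

fn main() {
    match copy_fallibly(b"hubris") {
        Ok(buf) => println!("copied {} bytes", buf.len()),
        Err(e) => println!("allocation failed: {e}"),
    }

    // A request that cannot possibly be satisfied is reported as an error.
    let mut huge: Vec<u8> = Vec::new();
    if let Err(e) = huge.try_reserve(usize::MAX) {
        println!("refused: {e}");
    }
}
```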

In our case, we don't dynamically allocate memory because that way lies unreliable software. The features that make Rust hardest at a very low level are the areas where Rust is bleeding edge in terms of programming language research. Even unsafe Rust is way more specifically defined than C, for example. Even in unsafe Rust, if you have two mutable references, they are not allowed to alias the same data structure. If you violate that, weird things are going to happen, and you can't violate that in safe Rust. We write a lot of unsafe Rust so we can totally violate that. Then there are the issues around uninitialized memory. It's pretty common in bare metal things to be able to say like, "I created this Ethernet DMA buffer over here and I'm going to create a pointer to it because that's how you get access to it." Except that you forgot to initialize the memory, and Rust requires that things be initialized before you reference them.

The compiler will make assumptions, like if you load a byte from that and say, is this zero? The compiler will say a constant yes, because it's initialized to zero, you said right there. If you forgot to initialize it, your software is going to misbehave. There are a small set of patterns you learn to work around these, and these are also classes of behaviors that in C lead our software to misbehave. When you're writing unsafe Rust, you have to do the job of keeping track of all of these things quite explicitly. I think that's what's posed a lot of overhead for particularly new systems programmers in Rust.
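One of the common patterns for the uninitialized-memory problem is `core::mem::MaybeUninit`, sketched below. The buffer size `BUF_LEN` and the helper `init_buffer` are hypothetical, and a real DMA buffer would also need placement and alignment handling that this example leaves out. The point is only that the memory is written before any ordinary reference to it exists.

```rust
use core::mem::MaybeUninit;

/// Hypothetical buffer size, chosen only for the example.
const BUF_LEN: usize = 64;

/// Initialize the buffer first, and only then hand out a normal reference.
fn init_buffer(slot: &mut MaybeUninit<[u8; BUF_LEN]>) -> &mut [u8; BUF_LEN] {
    // `write` stores a value without reading the old contents and returns
    // a reference to the now-initialized array.
    slot.write([0u8; BUF_LEN])
}

fn main() {
    // Nothing may read this memory until it has been written.
    let mut raw = MaybeUninit::<[u8; BUF_LEN]>::uninit();
    let buf = init_buffer(&mut raw);
    buf[0] = 0xAA;
    println!("first byte: {:#04x}, last byte: {}", buf[0], buf[BUF_LEN - 1]);
}
```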

Cantrill: The presence of unsafe, though, also allows you to write. Between unsafe and asm!, which I feel is another killer Rust feature, the asm! macro is really nice. Really, in any operating system, and in much low-level software, you're going to have components that are actually in assembly. That has always been rocky in every language, in part because the machines themselves are rocky. You're trying to put an abstraction on something that really is not amenable to cross-platform abstraction, because it's not cross platform, it's the actual machine. The asm! macro has been really delightful. Between asm! and unsafe, you can write effectively any code in Rust. If you can write it in C, you can write it in Rust, and that gives you a starting point where you can then begin to look at, now what am I doing? How can I pull that into more idiomatic Rust? How can I pull that into safer Rust?
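For readers who have not seen the asm! macro, here is a small sketch. The function `add_with_asm` is a hypothetical example; it uses inline assembly on x86_64 and AArch64 and falls back to plain Rust on any other target so that it still compiles there. It is meant to show the shape of the macro (operands, register classes, options), not anything specific to Oxide's firmware.

```rust
/// Add two integers with a single inline `add` instruction (x86_64).
#[cfg(target_arch = "x86_64")]
fn add_with_asm(a: u64, b: u64) -> u64 {
    use core::arch::asm;
    let mut sum = a;
    // SAFETY: the instruction only touches the two named registers.
    unsafe {
        asm!(
            "add {0}, {1}",
            inout(reg) sum,
            in(reg) b,
            options(pure, nomem, nostack),
        );
    }
    sum
}

/// The same operation using the three-operand AArch64 form.
#[cfg(target_arch = "aarch64")]
fn add_with_asm(a: u64, b: u64) -> u64 {
    use core::arch::asm;
    let mut sum = a;
    // SAFETY: the instruction only touches the two named registers.
    unsafe {
        asm!(
            "add {0}, {0}, {1}",
            inout(reg) sum,
            in(reg) b,
            options(pure, nomem, nostack),
        );
    }
    sum
}

/// Fallback for targets this sketch has no assembly for.
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
fn add_with_asm(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

fn main() {
    println!("2 + 40 = {}", add_with_asm(2, 40));
}
```

Wrapping the assembly in a safe function with one small unsafe block follows the same iterative path described above: start with something that works, then shrink the unsafe surface.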

This is where I think Laura and I have both benefited from Cliff's wisdom where we would start off with something, it's like, this is great. This works. I think actually I've noticed that you've got some soundness issues here, where this code as you've written it is fine but it could be extended in a way that's unsound because of your use of unsafe here. Let's actually brainstorm a way to make this safer, to prune down the unsafety. That gives you this really nice iterative path, because I think that one of the challenges of Rust that I feel that all Rust programmers have is that Rust challenges you to do it a better way, which is great. The downside of that is as a Rust programmer, you're like, is there a better way to be doing this right now? Am I doing this the best way? There's the sense of, there is a better, cleaner way to do this that I am not finding. That can slow you down a little bit.

Again, it's been hugely helpful, to have Cliff tell us, "I have blessed you. This is a clean way to do it," which feels liberating. I think you need that with Rust a little bit. It's a strength of Rust, but it's also a peril. Part of the reason you constantly search for that is because there often is a better way to do it. When you discover the better way to do it, you're like, that's so clean. That's so nice. It's so tight. It looks just like, why would you think of any other way to write it? It's so satisfying. It's such an endorphin rush as a programmer to be able to have that beautiful, tight, robust, safe code. At the same time, you have to hit this balancing act of knowing, no, actually, there's going to have to be some unsafety here. This is a good starting point, and we can evolve it into something that's better over time.

Biffle: One of the key decisions in programming language ergonomics, which is a thing that happens whether people think about it or not, is languages encourage people to do certain things and not to do other things. This tension between doing it the obvious way and doing it the tight, elegant way exists in C and in assembly language. This is a universal tension. The unique thing about Rust is that there is a bunch of decisions in the language design that nudge you to be better, that nudge you up the rigor curve. For example, the fact that writing unsafe code simply takes more letters to type than safe code, because you have to write the word unsafe at the very least, that's nudging you because nobody likes to type more. That's nudging you toward maybe not doing that. It's an interesting thing to experience.

Stack Overflow Checking

Shamrell-Harrington: Am I correct in inferring from Bryan's comments earlier that you all don't have stack overflow checking turned on?

Biffle: No, quite the contrary. Stack overflows are quickly caught by the memory protection unit. We're using ARMv7-M primarily, which is an architecture for microcontrollers that doesn't have built-in stack protection. We do also have some ARMv8-M processors, which have native stack overflow protection. In both cases, there's a memory protection unit that you can configure to catch these sorts of things. If I can make a brief shout out to the other thing you need to do with the memory protection unit to make Rust correct: you need to intercept null pointer dereferences. This is a problem C has had in embedded contexts too, because frequently compilers, in their never-ending quest to do what we say but better, will see a null pointer dereference and say, it's a null pointer dereference. I don't need to check that because it's going to fault, because that's what happens on Unix. It doesn't happen in most firmware. Check for stack overflows and restrict access to address zero, and you're pretty much home free.

The Limitations of Rust

Shamrell-Harrington: What things do you not like about Rust? What frustrations do you experience?

Cantrill: Honestly, very few. I've always felt that you want to talk to someone not on their first day of learning a programming language, but on their 1000th day. How do they feel on the 1000th day? For me, Rust on the 1000th day remains incredibly uplifting. I'm still discovering great things and the ability to make things better. Broadly, it is very good. I will say, one frustration I have, and this may just be an ideological difference, I definitely believe in consistent formatting.

I admire the approach that rustfmt has taken, which is, we are going to reduce your program to its atomic particles. Then we're going to reassemble those atomic particles in a way that passes the formatter, which is great, and it works almost all the time. It's amazing that it works so frequently. The fact that this is my biggest issue shows how few big issues there are because this is a pretty small issue, where we want to structure code that fits the format rules, but is not exactly what it wants to see. It's very hard to tell the thing, this is actually fitting the rules that you've outlined. You just don't like it. You want to do it differently.

The specific example is where you've got functions that are returning things that other functions are going to operate on, and you have these long chains of functions. This gets into the details of the actual machine model on Cortex-M. We want to actually group these things by functional unit. This thing is like, actually, I want each of those on a separate line, and be like, yes, exactly. It's like, big piece of vertical space. That's the level of issue that you're dealing with. That is actually quite literally my biggest frustration with Rust. That's a very small frustration. We can definitely live with it.

Abbott: I'm going to simultaneously say Rust has fantastic compiler error messages. There has been a lot of work put in to be able to help you decode some of these. At the same time, especially as you start to write larger projects and write larger stuff, I think sometimes the error messages can be cryptic to decode, especially when it comes to building larger libraries. This is something you run into especially with C. I think in particular, some of the Rust type errors remind me of other obtuse errors from things like that. I think it takes learning how to debug some of the Rust type errors.

I think also the thorniest edges for me about Rust tend to be around some of the tooling. It's not that it's not great, but I think that's where we tend to run into errors, especially in terms of trying to build the thing. It's another place to be able to learn how to do things, especially how cargo interacts with Rust. Cargo is a good build system, but at the same time it has its own quirks about how exactly to do things. I think especially for what we've been building, cross compiling, for example, has been a bit of an area we've had to explore and learn about in order to build things. I think in particular we've had to find some rough edges about being able to split between host tools and tools that are supposed to go on target, things like that.

Rust, if you're just being able to run it on your x86 desktop machine, I think there's been a lot of work, and I think that's pretty optimized for that. I think for a lot of what we've been doing with the embedded and cross compiling, I think there have been some rough edges. There's certainly a lot of areas to be able to grow and fix things up there.

Biffle: Debug symbol generation in async code could be dramatically improved. If I've got a future that's not currently being polled, I can't print its stack trace. That's annoying. The technical pieces are there, but we haven't put them together yet. There are some things missing from libcore that I need to write useful software, like floating point transcendental operations are missing from core. They're in std, because something-something, how LLVM models intrinsics, something-something. That can be a little bit style crampy. We don't use a lot of floating point at Oxide, but my personal projects definitely do. There's a bunch of language features that I think are important that have been stuck in unstable for longer than I would like. Granted, I'm not getting in and doing the work to stabilize them. I don't have a lot of room to talk. Like the never type. I would like to have the never type on stable. I would like the compiler to be faster. At the same time, I get that that's a really hard problem.
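As background on the never type: `!` as a general-purpose type is still unstable, but stable Rust offers `core::convert::Infallible` as a stand-in for "this error cannot happen". The sketch below shows that workaround; the functions `widen` and `unwrap_infallible` are hypothetical names chosen for the example, and this is an illustration of the gap being described rather than code from the panel.

```rust
use core::convert::Infallible;

/// A conversion that cannot fail, expressed with the stable stand-in for
/// the never type.
fn widen(x: u8) -> Result<u32, Infallible> {
    Ok(u32::from(x))
}

/// Unwrap a result whose error type has no values. There is no panic
/// path: the empty `match` proves to the compiler that `Err` is
/// unreachable.
fn unwrap_infallible<T>(r: Result<T, Infallible>) -> T {
    match r {
        Ok(value) => value,
        Err(never) => match never {},
    }
}

fn main() {
    let wide = unwrap_infallible(widen(7));
    println!("widened value: {wide}");
}
```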

There are growth seams in the language that are sometimes a little more obvious than I'd like. For example, our notion of how to do errors has evolved as the language has evolved, not to the same degree as some other languages, but like the error trait. Is it right? Is it not? Do you use a crate to talk about errors, or do you use the standard library? You get different advice from different people. Cargo is limited in a lot of ways. It's nice that the language comes with a build system. I've got a blog entry on why that's an important thing. It's also limited. Finally, the embedded ecosystem is almost entirely written in a way that assumes that you're running in privileged mode with memory protection off. This can cause the code to be straight up incorrect if you violate either of those assumptions, which is not a Rust language thing so much, but you can't really use a language without its libraries. I think that's fair game.
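To make the "crate or standard library" question concrete, here is a sketch of the standard-library-only route: a hand-written error type that implements `Display` and `std::error::Error`. The `FrameError` type and the `parse_frame` function, with its invented one-byte checksum format, are hypothetical. Error-handling crates exist largely to generate this boilerplate, and which route is better is exactly the open question raised above.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type for a made-up frame format.
#[derive(Debug, PartialEq)]
enum FrameError {
    Empty,
    BadChecksum { expected: u8, got: u8 },
}

impl fmt::Display for FrameError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FrameError::Empty => write!(f, "frame is empty"),
            FrameError::BadChecksum { expected, got } => {
                write!(f, "bad checksum: expected {expected:#04x}, got {got:#04x}")
            }
        }
    }
}

// The trait's methods all have defaults, so an empty impl is enough.
impl Error for FrameError {}

/// Split off the trailing checksum byte and verify it against the
/// wrapping sum of the payload bytes.
fn parse_frame(raw: &[u8]) -> Result<&[u8], FrameError> {
    let (&got, payload) = raw.split_last().ok_or(FrameError::Empty)?;
    let expected = payload.iter().fold(0u8, |acc, &b| acc.wrapping_add(b));
    if expected == got {
        Ok(payload)
    } else {
        Err(FrameError::BadChecksum { expected, got })
    }
}

fn main() -> Result<(), Box<dyn Error>> {
    let good = [1u8, 2, 3];
    // `?` converts FrameError into Box<dyn Error> automatically.
    let payload = parse_frame(&good)?;
    println!("payload: {payload:?}");

    let bad = [1u8, 2, 4];
    if let Err(e) = parse_frame(&bad) {
        println!("error: {e}");
    }
    Ok(())
}
```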

Cantrill: It's a good question to ask because I think asking people about the imperfections in the technology really assesses whether someone has gone deep enough into something to really use it. I would also caution that it's a huge mistake to weaponize these imperfections, in that these are relatively small imperfections in a system that is broadly a fit for many different kinds of use cases.

Biffle: Clearly, we've looked at these imperfections and said, yes, we're going to use this tool. Like my list of imperfections for any programming language is at least this long. These are imperfections that I can deal with personally. You may disagree. Maybe one of these really rubs you the wrong way.

Abbott: More than anything, we also know that there's an abundant Rust community that is out there and really working to try and get things going. Some of these will also be problems we see now but we also realize that as things go down the line, we expect things to be able to continue to grow and build for everything.

Conclusion

Cantrill: If you are discovering Rust and you're excited like, "I would love to contribute," view these as opportunities to go to the coalface on all of these. I love Cliff's example of getting the DWARF information on futures and being able to actually properly generate a stack trace. That's a gritty problem, but the Rust community has tackled a lot of gritty problems. It's a huge tribute to the Rust community. All of the problems and imperfections that we have mentioned are a good opportunity to make a real difference, and certainly to ingratiate yourselves with Oxide. Swag to whomever solves these problems, for sure.

 


Recorded at:

Apr 28, 2022
