
No Moore Left to Give: Enterprise Computing after Moore's Law


[Note: please be advised that this transcript contains strong language]


Cantrill: I am going to be talking about Moore's law, namely the end of Moore's law and what it means for those of us who are in the mainstream enterprise. I'm going to assume that all of you have heard of Moore's law, but what exactly is Moore's law anyway? You probably know Moore's law is like, "Hey, I think it's about CPUs getting faster. I'm pretty sure, but which of these is it?" We're going to read each of these. I'm going to want you all to vote for one of them, and I'm going to warn you that my vote would have itself been incorrect. I'm just curious about which of these you think it is.

Is it that transistor speed increases exponentially over time? Is it that transistors per dollar grow exponentially over time? That is, over time, your dollar buys more and more transistors? Is it that transistor density grows exponentially over time, or is it that the number of transistors in a package grows exponentially over time? How many of you think it's one, that transistor speed increases exponentially over time? I'm almost certain it's not one. One is Dennard scaling, so you're right, but also wrong, and we'll talk about that in a second.

Transistors per dollar grow exponentially over time. How many think it's that? A couple of hands go up, maybe some of you've read the original paper. How many of you think that it is three, transistor density grows exponentially over time? A bunch of you. How many of you think it's four, the number of transistors in a package grows exponentially over time? The Intel employees are all here. The only people that really seem to think this are Intel employees, but maybe you're not.

Actually, it's a total trick question, because the answer is at once all of them and none of them; Moore's law and history always have ambiguity. Oh my goodness, the amount of ambiguity in Moore's law. In fact, who even coined Moore's law? Hint: it wasn't Moore. We'll talk about that in a bit. It was coined by a physicist named Carver Mead, a very important physicist in his own right and a collaborator with Gordon Moore, but it was not coined by Gordon Moore.

Primary Document

When was Moore's law coined? 1965. No, it wasn't; the original paper from which it came is from 1965. Let's go to that original paper, because it has so much depth to it. First of all, I love the title, "Cramming More Components onto Integrated Circuits." Cramming, what a great word, great verb there; it just implies that this is somewhat against our will. This is written by Gordon Moore, in "Electronics" magazine, published in 1965. Intel was actually looking for a physical copy of this; they were offering $10,000 for one. It's very hard to find a physical copy; what Intel has is a reprint. This is what appears to be a scan of an original copy, with the original font and so on. To the degree that it's lossy, it's because it's a scan of an original copy, which I don't have; it's on the internet. Gordon Moore, of course, was later a founder of Intel, but this was when he was at Fairchild Semiconductor. Fairchild Semiconductor was founded, of course, by the traitorous eight, who left William Shockley, the world's biggest and most famous asshole in Silicon Valley, to form Fairchild. The origin story of Silicon Valley is so great: a bunch of people followed this galactic asshole out to Silicon Valley, and then they all quit and formed their own company. Very important origin story for Silicon Valley.

Gordon Moore

This is before Gordon Moore is the founder of Intel. I would really encourage you to read this paper. In fact, I would so strongly encourage you to read this paper that I'm going to damn near read the whole thing to you right now. This is basically going to turn into Papers We Love for this particular paper, because I think it's so amazing. This is the opening paragraph, by the way, and I have to emphasize this is 1965: the future of integrated electronics is the future of electronics itself. The advantages of integration will bring about a proliferation of electronics, pushing this science into many new areas.

This next paragraph is just amazing: "Integrated circuits will lead to such wonders as home computers." Ok, just stop for a second: home computers. This is 1965. I am old, just for clarity, I've got kids in high school, and my father was an undergraduate in 1965. This is a long time ago; the stored program computer is not that old, and here's someone talking about home computers. This is madness in 1965. The first home computer doesn't show up for another 12 years. This is amazingly ahead of its time. Then Gordon Moore is like, "No, I'm sorry, I'm not even done with that sentence, pal." You've got to read the rest of that sentence: automatic controls for automobiles. Ok, wow. Then personal portable communications equipment. Gordon Moore, are you done predicting all of the future in 1965? This to me is almost more stunning than the rest of the paper. How can you have such a clear vision 40 years into the future and be accurate? This is the era when we thought there'd be interplanetary travel, a moon base, and hover cars. All wrong. This guy nails it, and he's not even going to talk about this for the rest of the paper. The electronic wristwatch needs only a display to be feasible today. Amazing.

As it turns out, this obviously blew the minds of the publishers of "Electronics," or they just burst out laughing when they thought, "Home computers. That is absolutely ludicrous." Then they made an editorial decision that I would love to understand, a very strange one, which is, "You know what we need to do? We need to have a cartoon that accompanies this piece that illustrates how ludicrous this is." I get the sense Gordon Moore has not really been interviewed about this cartoon. I would love to ask him about it, because this cartoon had to have been done without his consent. This is the cartoon that is in the paper, and there's a lot here. You're going to want to take this one in; there's a lot going on here.

You've got the kind of Alfred E. Neuman figure in the middle selling handy home computers. This is what they envisioned the future to be. First of all, I've got no idea what's going on with the cosmetics display; that person's back looks like it's broken over there. There's just something very wrong about the way that she's standing. And then what's going on with the handy home computer? I don't even know what that thing is; it doesn't even look like a cash register. Of course, it's on sale, so that's not necessarily a good sign. Then you have notions.

What the hell is a notion? It's a chiefly North American term: an item used in sewing, such as buttons, pins, and hooks. This shows you how old this is; this is something you sell to a frontiersman or frontiersperson. This is for repairing your clothes in your sod house, next to the handy home computer. This pretty much knocked me on my butt for a while, just taking all this in. I even got to the point where I googled Grant Compton, the artist over there. I want his whole oeuvre, I want to understand all of his work. Anyway, I would love to know the backstory there, but moving right along.

Gordon Moore goes on to say that we're going to make electronic devices. Integrated circuits are going to make electronic techniques available throughout all of society, performing many functions that presently are done inadequately, or by other techniques, or not done at all. The principal advantages, and this is important, will be lower costs. This is a theme that's going to come up again and again in this paper, and something that we've lost in the years since. This is very important: payoffs from a ready supply of low-cost functional packages. There's so much prediction of the future that ends up being spot on in this paper.

One of the things he talks about is silicon: silicon is likely to remain the base material, and it has remained the base material for damn near 60 years since this thing was published. Gallium arsenide existed in 1965 as a semiconductor. It's going to be important, he thinks, in specialized functions, but ultimately silicon is going to dominate; it's a relatively inexpensive starting material. Silicon still dominates, and it's still a great material. Silicon and its oxide are still the perfect materials for semiconductors. This is another great prediction that's just an off-handed remark he makes.

The Base of Moore’s Law

Then we get to the first thing that begins to feel like Moore's law. What he's talking about here is that as we add components to an integrated circuit, we get decreased yields. As we put too many components in an integrated circuit, we get decreased yields, but there's a sweet spot in the curve where we get the maximum yield and the lowest costs. At present, it is reached when 50 components are used per circuit, but the minimum is rising rapidly while the entire cost curve is falling; see the graph below. We'll look at the graph in a second, but he says if we look ahead five years, a plot of costs suggests that the minimum cost per component might be expected in circuits with about 1,000 components per circuit. In 1970, the manufacturing cost per component can be expected to be only a tenth of the present cost.

This is where you get what looks like Moore's law. This is actually the first graph in the paper, and I think it's really interesting because it's showing two axes. We've got the number of components per integrated circuit down there on the X-axis, but on the Y-axis we have the actual cost. We can see for each one of these generations that there's this bathtub curve, and as you increase the number of components on that IC, you get to a sweet spot. The sweet spot itself is dropping; the cost is dropping. I think that is critical to Moore's law. What we should think of as Moore's law is the fact that the cost is dropping, not just that the density is rising.

This feels like this must be Moore's law. Then he goes on and says, and this is what becomes much more famous of course, that "the complexity for minimum component costs has increased at a rate of roughly a factor of two per year." Then he says, "See this graph on page 116." This is the graph we think of as the Moore's law graph, but note in particular what is absent. Now we have time on the X-axis and density on the Y-axis, the number of transistors on the Y-axis; cost is gone. We are no longer talking about cost in this graph. I think it's an important detail that cost was actually the motivator here, and we should not lose track of the cost. In this graph, and in the one that is reproduced when people talk about Moore's law, cost is absent, and I think that is a very important distinction.

One of the things he also talks about is how we're actually going to make these things. We were using photolith[ography] in 1965, and he says, "Eh, it seems like photolith is going to continue." The density of components can be achieved using the present optical techniques; it does not require more exotic techniques such as electron beam operations. Electron beam lith[ography] has been a couple of years away since 1965. We were using photolith in 1965 and we're still using photolith today, and actually, that's going to be an important issue for the end of Moore's law.

Moore’s Law?!

This honestly blew my mind as I was reading the paper. You're reading the paper and you get to Moore's random prognostications, and he's talking about power and what it means to power these increasingly dense transistors. He points out that actually, as they get denser, they take the same amount of power, and then you get that last sentence: in fact, shrinking the dimensions on integrated structures makes it possible to operate the structure at higher speed for the same power per unit area. That's Dennard scaling. He's actually pointing out, "By the way, I think these things are faster as they get smaller."

Dennard formalized that in 1974, so it was formalized nine years after the fact. I don't know to what degree it was just common knowledge at the time that these things would get faster as they got smaller. That's why you would not necessarily have been wrong to say that Moore's law affects the actual speed of transistors over time, because he's actually pointing out that these things are going to get faster, they're going to get denser, and importantly, they're going to get cheaper. To me, when we think of Moore's law, we should think of all three of those things. I think it is a disservice to the original paper to chuck any one of those three.

I don't get to decide what Moore's law is, although I'm not sure Gordon Moore has decided what Moore's law is either. I go back to the original text. Moore's law was actually not coined by Gordon Moore; it was coined by Carver Mead, a collaborator with Gordon Moore, some six years after this, when he was talking to a journalist. "Moore's law" occurred to him, and the journalist grabbed onto it: "That's a catchy thing. We should talk about Moore's law." Interestingly, they talked about Moore's law specifically as a way to motivate engineers to build the next thing, to keep people looking to the future and trying to do this thing of doubling transistor density every year.

Moore actually modified the law in 1975 to be a doubling of transistor density every two years. I would have sworn, in fact did swear, that Moore's law was a doubling of transistors every 18 months. As it turns out, that's wrong; that was an Intel exec doing the classic executive thing: you said two years, I figure 18 months is close to two years, it's just a little more aggressive. You promised it in 18 months, it's basically the same thing, come on, you're not going to be late.
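The difference between those two cadences is not a rounding error; it compounds. As a back-of-the-envelope sketch (the function name and the ten-year horizon here are just for illustration), compare a two-year doubling with the misremembered 18-month one:

```python
def transistor_growth(years, doubling_period_years):
    """Multiplier on transistor count after `years`, given a doubling cadence."""
    return 2 ** (years / doubling_period_years)

# Moore's 1975 revision: doubling every two years.
print(transistor_growth(10, 2.0))   # 32x over a decade

# The exec's "18 months" cadence is far more aggressive.
print(transistor_growth(10, 1.5))   # roughly 100x over a decade
```

Compounded over a decade, the 18-month version promises about three times as many transistors as Moore's own two-year formulation, which is exactly the kind of gap an aggressive exec can hand-wave away.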

According to Moore, it is a doubling of transistor density every two years, and Dennard outlined his scaling in great detail in 1974. For many years it was entirely reasonable, and this is why I think you couldn't go wrong or right in that original vote about what Moore's law is, to believe that Moore's law was actually a doubling of all of these things: density, speed, and economics. It's true that the 1980s and '90s were great for Moore's law. If you're of my vintage, you've got your own Moore's law, you've got your own story, and if you're younger, then sorry, you just don't have this, but it's true.

Good Old Days

My first machine, like many of my generation, was a 4.77 megahertz IBM PC XT. When I was in high school, I had a 16 megahertz 386SX. When I got to college, it was a 25 megahertz SuperSPARC. When I was a senior in college, it was a 40 megahertz SuperSPARC. When I went to my first job at Sun, it was a 167 megahertz, actually, a 143 megahertz UltraSPARC, because it was a broken part they had to clock down; this is what you give the new person, I guess. It was a 143 megahertz UltraSPARC, and then it was a 300 megahertz UltraSPARC and so on. Then it was an x86, an even faster x86, and so on.

It was true that those were great years; it was amazing that with every generation, things were getting faster. But it is a myth that we software engineers were simply relying upon that. I never once heard someone say, "Well, let's not make this faster. Let's just wait for Moore's law to make it faster." Nobody ever actually said that, because honestly, even in those days, and I think this is the important point, even in those halcyon days before the limits of Dennard scaling, it wasn't true that your software always got twice as fast.

In fact, most software didn't. Why? Because even at that time we were hitting the memory wall. The CPU was getting faster, memory was not getting faster at all and increasingly our applications were actually blocked on memory. We knew that Moore's law would not help you with that memory wall. Moore's law could add more caching, but caching can only do so much. It was clear to us at the time, those of us in the industry, but especially those of us at Sun, that symmetric multiprocessing was the only way forward. You had to get multiple CPUs in the same enclosure such that they could be executing multiple contexts in parallel. That was the only way to deliver throughput, that's what we collectively did. That's what we did during the '90s and 2000s, we made SMPs, we made multiprocessors.
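The arithmetic behind the memory wall is easy to sketch. Assuming some illustrative, made-up latency numbers (these are not measurements from any particular machine), average memory access time is the cache hit time plus the miss rate times the trip to DRAM, and a faster CPU-side cache barely moves it when DRAM latency stays flat:

```python
def avg_access_ns(hit_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hits are cheap, misses pay the
    full DRAM latency, which did not improve with Moore's law."""
    return hit_ns + miss_rate * miss_penalty_ns

# Hypothetical numbers: 5% of accesses miss to a 100 ns DRAM.
slow_cache = avg_access_ns(1.0, 0.05, 100.0)   # 6.0 ns
fast_cache = avg_access_ns(0.5, 0.05, 100.0)   # 5.5 ns
print(slow_cache, fast_cache)
```

Halving the hit time shaves under ten percent off the average, because the DRAM term dominates; that is why a faster CPU alone could not double memory-bound software.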

SMP didn't help single-threaded performance. Having multiple processors in the box does not make a single thread faster, but there were some crafty ideas and dangerous ideas to do that. In particular, there was this thing that would really come back to haunt humanity called speculative execution, which would allow us to work past elements of that memory wall. Speculative execution is the microprocessor guessing what you're going to do next and performing loads that it thinks you might need in the future. As it turns out, speculative execution became a side channel and became my annus horribilis in 2018, as we were dealing with all the side channel attacks on x86.

These things would come back to haunt us. Both speculative execution and all the tricks around single-threaded performance and the ILP wall and SMP were fraught with peril. Don't look back on these times as, "Wow, those were just awesome times when software engineers could just sit back and everything went faster by itself." That is definitely not what happened; these were tough times. It was a challenge even then; even the good old days were a challenge. Then, of course, the good old days began to end.

In about 2006, earlier if you were Sun, Dennard scaling ended and we were no longer able to make smaller transistors faster. Why? Because you have current leakage across the gate. As these things become smaller and smaller, you're leaking more and more, because that leakage is effectively constant, and now all of this current is being leaked and you've got this thing with ridiculous power density.

Beginning of The End

Dennard scaling was over, but that was not the end of Moore's law. It was at this point in time that I, and we, and everybody was busily redefining what Moore's law is. No, that's Dennard scaling is over, Moore's law is still fine. Moore's law is transistor density, which is true or has truth to it. We certainly had other ways of using that transistor budget. In particular, we started using chip multiprocessing. Chip multiprocessing, putting multiple cores on a die, putting multiple threads within a core with simultaneous multithreading. There were people at the time that were, “Oh my God, it's the end of days. It's the end of software as we know it because we're all gonna have to be multithreaded programmers.” A bunch of us were pointing out that software, as you think it is, has already ended because we actually hit the memory wall a decade prior and many of us were writing multithreaded code by that point in time, one.

Two, I think one of the traps we fall into frequently is that when we think about the machine, we think our program is duty-bound to use every unit at all times across the entire machine. There's this thing called multitenancy, as it turns out. You're not the only person in the building; you don't need to be in every room at once. We can actually have multiple people in the building, it's amazing. We can actually have multiple programs running at the same time. That allows you to be single-threaded, and yet we can still utilize the box through multitenancy. This was not the end of days for software that some people thought, and it was much less of an apocalypse than some people feared, in part because we had all this experience with SMP, and in part because we had other ways of using the box. But there was a threat of dark silicon.

Dark silicon: the inability to utilize all the silicon with chip multiprocessing, because you'd hit something called Amdahl's law. Amdahl's law says that as you increasingly parallelize your work, you're more likely to get blocked on the single-threaded aspects of your system. Those bottlenecks begin to dominate, and the concern was that we'd all get Amdahled with CMP and it wouldn't actually deliver on its promise. CMP was not going to be a panacea.
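Amdahl's law can be written as speedup = 1 / ((1 - p) + p/n), for a parallelizable fraction p of the work spread across n cores. A quick sketch (the 95% figure is purely illustrative) shows why dark silicon was a worry:

```python
def amdahl_speedup(p, n):
    """Amdahl's law: overall speedup when fraction p of a workload
    runs on n cores in parallel and the remaining (1 - p) stays serial."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% of the work parallelized, the serial 5% dominates:
for cores in (4, 16, 64, 1024):
    print(cores, round(amdahl_speedup(0.95, cores), 1))

# With p = 0.95 the speedup can never exceed 1/0.05 = 20x,
# no matter how many cores CMP puts on the die.
```

Past a few dozen cores, additional silicon buys almost nothing for this workload; that silicon goes dark.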

Crossing the Rubicon

Then Moore's law crossed the Rubicon, and it crossed the Rubicon of economics sometime around 2014, 2015, when we went from 28 nanometers to 22 nanometers; we were continuing to shrink and shrink. Notably, when Carver Mead did his first analysis of the smallest we could possibly make a transistor physically, his answer was 150 nanometers. We're well beyond what Carver Mead thought we could do, and we crossed this critical Rubicon of economics.

This is data generated by this guy, Handel Jones, an independent consultant in Silicon Valley who gathers a lot of really interesting data. This is not data that's easy to get. This is looking at the cost per gate trend as we go to different processes, from 90 nanometers down to 65 nanometers to 40 nanometers. We can see Moore's law holding nicely; then we arrive at 28 nanometers and it begins to plateau. Then, importantly, at 20 nanometers the cost per gate rises, and it rises again at 14 nanometers. This is because the problems there are really nasty.

As we moved to 20 nanometers and beyond, the metal layers got so thin that we had to go do something called FinFETs, fin field-effect transistors, a totally different process that needed to be pioneered. It got very expensive, and that's where I think Moore's law is more aptly called Rashomon's law, because everyone has a different definition of it; everyone thinks they're still adhering to Moore's law, but they've lost the economic definition. Moore's law, in my mind, was not and should not have been highest density at any cost. That's not what Moore's law is; that does not make sense. You've got to have that cost factor in there.

End of The End

We've now crossed the Rubicon with respect to Moore's law, and then we hit the end of the end. When history is fully writ, it may be in August of last summer that we actually hit the end of Moore's law, when one of the four foundries, GlobalFoundries, stopped seven nanometer development. It didn't stop seven nanometer development because it couldn't do it. It did it for economic reasons; it was too expensive. What GlobalFoundries also realized is that it was going to be very expensive for their customers; for their customers, a seven nanometer part was going to be very expensive. They pulled out of seven nanometers, and that left only two foundries at seven nanometers, TSMC and Samsung, plus Intel. Those were humanity's four foundries, now three at the leading edge. Intel was at 14 nanometers, trying desperately to get to 10.

Cannon Lake was their 10 nanometer project. Cannon Lake was supposed to ship in 2016, and it is only really sampling this year. That's not a failure of Intel; I think it shows how very difficult it is at this level. You're effectively having to deal with new physics every time you drop a node, and 10 nanometers has been really difficult for Intel.

Intel, and I don't blame them for this, I understand it, it's just very painful for the rest of us, is going to intermix 14 nanometer and 10 nanometer parts in the Ice Lake and Cascade Lake timeframe, which is basically the next couple of years. From a cloud perspective, it's, "Oh please, what am I supposed to do with this? What do you want me to do with this?" I'm going to have different processes in different parts, and some have AVX-512 and some don't. I don't blame you for shipping this thing three years late, but it's making my life very complicated. It's making all of our lives very complicated, and it makes it very complicated to use some of these software structures.

It's a big mess, because this is how Moore's law ends. Moore's law ends the way life frequently ends: in confusion and dementia. It's not good, it's bad, and it's going to get even crazier. Joyent is owned by Samsung, but I emphatically do not speak for Samsung when I say this: Samsung has said that they are moving to three nanometers. You can no longer use FinFETs at three nanometers; you need to move to what are called gate-all-around field-effect transistors, GAAFETs. I don't know if anyone says "GAAFETs" out loud; maybe everyone's just used to writing that one out. It does not sound very sophisticated, but it is very sophisticated. In particular, we've got to move to EUV, extreme ultraviolet photolithography. We've got to use a new lithography, we've got to chuck FinFETs and move to GAAFETs; there are a lot of unsolved problems here.

Really, The End

By the way, this is to get to three nanometers. You may ask, wait a minute, I'm a software person, how big is a silicon atom anyway? Glad you asked. A silicon atom is 0.2 nanometers, so three nanometers is 15 silicon atoms wide. You need more than a handful of silicon atoms; no joke, this is the end. This is actually the end of the end, even if you only narrowly interpret Moore's law as transistor density. And if you interpret Moore's law as density at any cost, well, I don't think it is density at any cost. I think three nanometers or five nanometers is going to be outrageously expensive, to the point that I really question whether we will ever get there. I think we will be at seven nanometers for a long time. In fact, we may be parked at the seven nanometer node forever. That may be the end of making transistors smaller, and economically, Moore's law is indisputably over.
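That atom arithmetic is worth doing yourself. Using the rough 0.2 nanometer figure from the talk:

```python
SILICON_ATOM_NM = 0.2  # rough diameter of a silicon atom, per the talk

for node_nm in (28, 14, 7, 3):
    atoms = node_nm / SILICON_ATOM_NM
    print(f"{node_nm} nm is about {atoms:.0f} silicon atoms across")

# At 3 nm a feature is only about 15 atoms wide; there is simply
# no room left to keep halving.
```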

It's the apocalypse; Moore's law is the reason we've had all of these great developments, the home computer and the phone and everything else are all because of Moore's law. So should we all give up and go home? No, that is not what it means. Let's talk about what the end of Moore's law means, because I think it is as exciting as anything, and there are a lot of new things to consider.

Quantum Computing

Let's get through some things we can consider quickly and then move on. I love quantum computing, but it's not going to be meaningful for those of us who write commercial code anytime really soon. It's surprisingly real; IBM's Q System One is amazing, an amazing feat of engineering. But the problem domain is really limited, and we don't know the economics yet at all. We do know that it requires a ridiculous refrigerator; the world's coldest refrigerator is in this quantum computer, because you need to get these things as close to absolute zero as possible.

I'm a software guy, but that seems like an expensive refrigerator. I haven't seen the bill on that one, but I know how these things go: I can't seem to do any kind of plumbing project at home without incurring a ridiculously expensive bill, and you're talking plumbing here, so I think that one's going to be expensive. I don't think that we're going to see this in broad use. In particular, we don't have a Moore's law yet for qubits. We don't have the ability to scale qubits yet. IBM, Rigetti, and others are really trying to motivate themselves to be able to double qubits every year, because they see the power of that kind of Moore's law, but it is a really hard problem. I don't think it's going to be irrelevant forever; it may very well become relevant. It's just not going to be relevant anytime soon.

Specialized Compute

What about specialized computing? Specialized computing is already here in the form of the GPGPU, the general-purpose graphics processing unit. I would assume many of you use the GPGPU for some aspect of your computing, maybe for deep learning or machine learning. We already had the GPGPU; it's already important in its problem domains. We are using ASICs for video encoding and decoding. We use ASICs for Bitcoin mining; sadly, Bitcoin mining is a huge amount of this special purpose compute. It turns out Bitcoin mining requires a lot of GPGPUs as well. I think Nvidia was also surprised about how much of their business was Bitcoin mining when all of that went away.

We already do use these, and we will continue to use these, but we should not think that it's all going to be specialized hardware and all of my Ruby is going to be turned into an ASIC. No, emphatically not, that is not going to happen, because design cycles are still long for ASICs and FPGAs. It's hard to do, and it doesn't make sense for many problems. Even where it does make sense, you can get a nice big pop when you do an FPGA, and especially an ASIC. I don't know how much we're going to see FPGAs, because the performance pop is not nearly as much as an ASIC. When you're using an ASIC, you do get a nice big pop, but then you are going to be on the same process that we're limited to in terms of Moore's law; you're going to be on that 7 nanometer process, or maybe a 14 nanometer process, and you're not going to see another Moore's law advantage for those parts. Those things are going to hit what has been named the accelerator wall.

This is not a panacea. For some applications, we will see an increased use of specialized compute, heterogeneous compute.


3D

How about 3D? 3D is just what it sounds like: it's building up. Historically, our ICs have been entirely planar; what about going up? Going up makes a lot of sense for NAND, and we have actually done it. 3D NAND is on an older CMOS node, on 14 nanometer and 20 nanometer processes. But when you're talking about CPUs in 3D, there are a lot of heat problems, yield problems, and cost problems. 3D is not going to save us.

Intel is very actively pushing this forward with something called Foveros. 3D is important, but we should not view 3D as somehow getting us back on track for Moore's law. This is why I asked if you think that Moore's law is actually the number of transistors in a package. This is why Intel claims this: they're like, "No, we'll make it 3D, and we're still abiding by Moore's law." It's ok, Moore's law is over, it's fine. We can mourn Moore's law and move on. This is interesting, but it's not going to save us.

2.5D Chiplets

The thing I think is actually much more interesting is what is called 2.5D. This is where we fab little chiplets that are smaller than the historic dies, smaller than that 300 or 400 square millimeter die. We make smaller chiplets, and then we put them on a silicon interposer, a large die. This is interesting because it allows for different functions in these different chiplets. You could have some number of CPUs, GPUs, and other specialized functions, and importantly, they don't need to be on the same process.

AMD, for Epyc and for Rome, has some chiplets that are using seven nanometer, and then the I/O units, for example, are 14 nanometer, because they don't need to be 7 nanometer. That is really interesting, because it allows us to get the economic promise of 7 nanometer without having to move the entire die to it. Intel is investing in this as well. I think this is actually a very, very promising avenue, and you're going to see some great microprocessors coming out in the next couple of years based on these 2.5D chiplets.

Alternative Physics

What about alternative physics? I guess I should have put this in earlier, right after quantum computing. I love alternative physics, and so should you; we should all love alternative physics. I'm talking silicon photonics, carbon nanotubes, phase change memory; there are a ton of these. We should explore them all, they are all beautiful, fall in love with all of them. Well, actually, watch yourself; be careful about who you fall in love with, because these things will break your heart.

I totally fell in love with a company called Nantero. Nantero is making carbon nanotube-based memory, which is amazingly promising: 100X the speed, 1,000X the density, and nonvolatile, relative to DRAM. You're like, "Whoa, that's a game changer." They've been working on this since 2001, and they're still around, still plugging away. That is a long time for a venture-funded startup. Love Nantero; please, Nantero, we're all rooting for you. I get a sense that carbon in the clean room is becoming a problem when we talk about productization.

These things are great, but it is really hard to break through. In my career, and probably in your career, we've seen one thing break through, and that's flash memory. The rest is basically the substrate we were building on in 1965, more or less. It is really hard to break through. Watch them, cheer for them, but definitely don't assume them.

Wright’s Law

Then there's this thing called Wright's law, which is interesting. In 1936, this guy named Theodore Wright studied the cost of aircraft and noticed that the cost of aircraft would drop the more aircraft he made. Every time production doubled, the cost of making an aircraft dropped by 10% to 15%. Jessika Trancik and her team at MIT, and then the Santa Fe Institute, found that Wright's observation was more accurate than Moore's law for predicting what semiconductors have done over the last 40 or 50 years, which is pretty interesting.
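Wright's observation is a simple learning curve: unit cost falls by a fixed fraction with every doubling of cumulative production. A minimal sketch of that arithmetic, where the 15% learning rate and the production counts are illustrative numbers, not figures from the talk:

```python
import math

def wright_cost(initial_cost, cumulative_units, learning_rate=0.15):
    """Unit cost under Wright's law after `cumulative_units` produced.

    Each doubling of cumulative production cuts the unit cost by
    `learning_rate` (e.g. 0.15 means cost falls 15% per doubling).
    """
    doublings = math.log2(cumulative_units)
    return initial_cost * (1 - learning_rate) ** doublings

# After 1,024 units (10 doublings) at a 15% learning rate,
# the unit cost is 0.85**10, roughly 20% of the original.
print(wright_cost(100.0, 1024))
```

The compounding is the point: the per-doubling drop sounds modest, but over many doublings of cumulative volume the cost collapses, which is why consolidating production onto a single long-lived process node could keep making it cheaper.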

I think that this could hold true; I see a lot of promise here. As we begin to consolidate on that 7 nanometer node, our costs of production will probably go down. If you look at older nodes, they are way cheaper: 28 nanometer is dirt cheap at this point, and that was state of the art not that long ago. The RISC-V parts are all on 28 nanometer. It is entirely reasonable that this stuff is going to start getting cheaper again.

That is going to be really important, because I think the greatest single consequence of the end of Moore's law for those of us in this room, for software, is that you're going to see compute in a lot more places.

Compute Everywhere

We're going to have a lot more computers, more CPUs sitting in stranger places. We're already seeing this on the NIC with so-called SmartNICs, and we're seeing CPUs next to flash. Yes, there has always been a CPU on the NIC, but it was an ASIC designed for that activity, or it was a ridiculously underpowered CPU. I'm talking real CPUs sitting next to flash, real CPUs on the spindle.

Western Digital is doing some really interesting work behind RISC-V. They've got their SweRV part, a 28 nanometer part. You are going to have potentially real compute sitting on your spindle, real compute sitting next to your flash, real compute sitting next to your NIC, in addition to your host CPUs. These are going to allow for some really interesting abstractions. If you're young, that may just excite you; if you're old like me, you have equal parts excitement and terror as you think, we're going to have code sitting on the spindle, and what happens when that fails?

As you're pushing code to these parts, it just feels, oh my God: the failure modes, the malware, the inadvertent effects, the deliberate effects. This is fun and interesting and terrifying, because if we leave this unchecked, we really have terrifying security ramifications. There's a lot to think about here, but a lot of this stuff is going to happen, or is happening, actually. We need to understand what's happening so we can prevent it from happening incorrectly.

Durable Computing

When we think about Moore's law ending at 7 nanometer, well, maybe we should have a CPU that lasts longer than three years. How about experimenting with that? How about we just stop throwing them out? Wouldn't that make it cheaper if we just stopped throwing them out and we let them run longer? How long can a CPU last? We don't actually know the answer to this question. We at Joyent, and I'm embarrassed to say this, did not find this out the right way: we had machines that were in production long after they should've been taken out of production, and we did not see any increase in CPU failure, even after a decade of running 24/7 in production.

CPUs are semiconductors; they can last a long time. How about we make some more durable computers and look at some other axes of improvement? Let's not just look at speed; let's look at density, let's look at power. Let's look at how we cram more CPUs into our data center. Maybe that's the Gordon Moore paper that should be written in 2019: cramming CPUs into our data centers, as opposed to cramming ICs onto a chip. Look at the Open Compute Project, which has some really interesting things going on; in particular, I would guide you to the OCP Yosemite V2 and its Xeon D. The Xeon D is actually a step down from the top-of-the-line Xeon, but it is lower power and still high performance. Really interesting stuff going on there.

Beyond Moore’s Law

Yes, this is a change. The end of Moore's law is a change, but the truth is that we have been dealing with these changes through our entire careers. It is not an apocalypse; it's going to be fine, I swear. We're all going to hold hands and we're going to figure out a way to solve these interesting problems together. The future is exciting, and the complexion of the problems is going to change. We in this room are going to care more about efficiency, we are going to care more about density, and we're going to care more about how much power we consume. Yes, that is all true, but I think the future remains alive and promising. It's really interesting.

For us in software, there are lots of interesting opportunities to understand the lower-level consequences of the system, to understand what lives beneath us, the hardware beneath us, and to optimize for it. Beyond Moore's law, as we think of the computer of the future, that computer is much more complicated. It's very interesting, and we need to think of it systemically; we need to think of the hardware and the software operating together.


Recorded at:

Jul 16, 2019