Resilience in Supply Chain Security

Summary

Dan Lorenc goes over real-world threats facing open source supply-chains today, and what can be done to architect resilient build and delivery pipelines.

Bio

Dan Lorenc is a Staff Software Engineer and the lead for Google’s Open Source Security Team (GOSST). He’s been working in the Cloud space for 8 years and has mostly focused on open source tools related to building containers easily and securely. He founded projects like Minikube, Skaffold, TektonCD and Sigstore. Dan regularly blogs about supply chain security and serves on the TAC for the OpenSSF.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.

Transcript

Lorenc: My name is Dan Lorenc. I'm a software engineer at Google, on our open source security team. I'm going to be talking about resilience and the role it plays in supply chain security. If you've been following the software news lately, you've seen that there have been a ton of supply chain attacks. It hit everything from governments, to large companies and small companies alike. Open source has played a crucial role in preventing and causing a lot of this. This is a topic near and dear to my heart. It's a space I've been in for a while. I'm excited to talk about this, particularly how resilience plays in.

I've been at Google for about nine years now. I've been on Google Cloud Platform pretty much the entire time since before it was called cloud. I've worked in pretty much every level of the stack here, from high scale backend systems, all the way up to what I currently focus on, developer UX, so tools to make developers' lives easier. I've been lucky enough to be working in open source for about the last six years. That's what got me started worrying about supply chain security, particularly in open source. I've been worried about this and trying to build tools to make supply chain security easier for the last few years now. It's awesome to see it in the news now and to have people start paying attention to this topic.

Outline

I'm going to start out by talking about what the problems are that are facing open source security, and open source supply chain security. If you're a software company today, open source is probably in your supply chain somewhere, whether you know it or not. I'm going to be talking about the problems you're facing, and the problems that the open source community is facing as well, and how we can deal with some of these together. I'm going to talk about resilience. I'm going to cover why resilience is important here. What role it plays in supply chains, and how you, as the user of open source software can architect a more resilient software supply chain.

Part 1: The Problems with Supply Chain Security

I'll talk about what the problems are with open source supply chain security. I like to break it down into two problems facing the security of open source software. The first problem is right here in the name. The first problem with securing open source is that it's open source. Open source is great. It's open. It's free. Anyone can contribute to it. The communities are powerful. It does pose some unique problems when it comes to supply chain security. Second, open source software is just like the rest of software, it's software. All software has bugs, whether it's written by me, whether it's written by you, whether it's written by a large community of maintainers. All software has bugs, whether we like to think about it or not. These are the easiest ways to break down the two problems that I see. There are different ways of addressing each, so that's why I like to break them down this way. Next come some examples of each of the problems, and then some of the things we can do about them.

Problem 1: It's Open Source

First, it's open source. What does that actually mean, and what do USB thumb drives have to do with that at all? I like to use this metaphor a lot. I think it starts to really convey the problem and worry people a little bit in the right direction. I hope most people, at companies or not, have been trained and scared by the news enough to know that if you're walking down the street and you find a thumb drive on the ground, you should not pick up this thumb drive and plug it into your computer to see what's on it. This is a classic supply chain attack. I hope most people know not to do that. If you don't, then public service announcement, do not pick up random USB thumb drives on the ground and plug them into your computer. This is a great way to get compromised. This does happen. There have been a lot of supply chain attacks carried out this way, including some that even got into air-gapped environments, and took out nuclear reactors in certain countries. Do not do this. Do not plug thumb drives into your computer. Definitely do not take one of these into your data center and plug it into your servers. That would be even worse than plugging it into your laptop.

What does this have to do with open source software? I want to step back a bit and ask, what is the difference between plugging one of these into your computer and running npm install? There's not really a lot of difference. In both of these situations, you're taking arbitrary code that you just found lying around on the sidewalk, you have no idea who wrote it, you have no idea where it came from. You haven't looked at it. You're executing it on your own computer. In fact, the second half of what I just said, about plugging one of these into your data center in production, is basically what's happening with npm. It's even worse than plugging a thumb drive in. If you're plugging the thumb drive in, it's at least limited to your laptop, usually. When you're installing code from arbitrary package managers and running it in production, you're giving it more privileges in a lot of ways than these thumb drives. It should be terrifying that this is how we've started building software: taking code that we've never looked at, and running it with access to our most confidential information. We really don't have many protections in place for this today. Open source software is great. It lets us build applications much faster. It lets people collaborate together. It is pretty scary when you have no idea where the code is coming from.

Opportunities of Open Source

What else can this cause? Here's some other ways I like to think about it. Open source is awesome, but a lot of these benefits can cause problems. Let's look at some of these benefits, break it down. Then after this, we're going to do some actual examples of each of these compromises. Some of the problems here. First off, open source is free. That's awesome. It's one of the main reasons people use it. If you look at this comic here on the right, sometimes free isn't worth it. We've heard of free as in beer, free as in speech. I say sometimes open source is free as in lunch. You're paying for it somehow, whether or not you're writing a check upfront. A lot of maintainers of open source software aren't supported. A lot of open source software doesn't even have maintainers. In some cases, this can cause problems, this whole digital infrastructure thing on the right, standing on the back of one person who's been maintaining something thanklessly. You don't know who they are. You might not even know how to find out who's maintaining some of these things. If that person moves on, and stops maintaining software, doesn't merge security fixes, that's a problem. We don't really ever get to complain to that person. We've never rewarded them or paid them or compensated them for their work. It works out great in a lot of cases. Sometimes that free lunch comes back to get you later on.

The other benefit here, anyone can contribute to open source. That's why we get so much done together by collaborating. Unfortunately, and I hope this isn't news to people here, anybody that's spent time on the internet should realize this: not everyone on the internet is nice. A lot of people have bad intentions. There are a lot of attackers and hackers out there that submit bad software. They do typosquatting, where they publish packages under names that look like popular ones and try to trick people into installing them. People send malicious code on purpose, trying to get it into popular packages to exploit it later. Anyone can contribute. That also works out great, just like open source being free, but sometimes people can take advantage of this. We need to figure out how to prevent people from taking advantage of the welcoming communities in open source.
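
A minimal sketch of one way to catch look-alike names before installing, assuming a team keeps a list of packages it has already reviewed. The `APPROVED` list, the `check_package_name` helper, and the 0.8 similarity cutoff are all hypothetical choices for illustration, not part of any package manager.

```python
import difflib

# Hypothetical allowlist of package names a team has reviewed and approved.
APPROVED = ["requests", "numpy", "urllib3", "cryptography"]


def check_package_name(name, approved=APPROVED, cutoff=0.8):
    """Classify a requested package name against the approved list."""
    if name in approved:
        return ("approved", None)
    # get_close_matches returns approved names whose similarity ratio to
    # `name` is at least `cutoff`, best match first.
    near = difflib.get_close_matches(name, approved, n=1, cutoff=cutoff)
    if near:
        # Not approved, but suspiciously close to something that is.
        return ("possible-typosquat", near[0])
    return ("unreviewed", None)


if __name__ == "__main__":
    for candidate in ["requests", "reqeusts", "leftpad-utils"]:
        print(candidate, check_package_name(candidate))
```

A check like this only flags names for a human to look at; a name that is "unreviewed" still needs the conversation about whether to add it at all.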

Open source is easy to use. The great developers in the great open source communities made it super easy to install and find and discover useful packages. That's how communities in open source get built. Sometimes you can pull in code without realizing it. It's too easy to pull down thousands of dependencies when you're only trying to install a couple, and it's hard to keep track of what's happening. This has led to numerous supply chain attacks lately, where developers added code they didn't know they were adding, and that code was compromised.

Finally, this last one is a misnomer. A lot of people tend to think that open source is transparent. It is. You can see the source code usually. Most people aren't looking at the source code, definitely not in the way that we're consuming open source. If you're installing binaries, you're grabbing pre-built packages, you're installing Android applications on your phone that are compiled versions of open source software, it's actually opaque. You can't reverse engineer those binaries. In a lot of cases, you don't even know if those binaries came from the open source software you started out with. It gives the appearance of transparency, but unless you're taking advantage of it, and ensuring you're building everything from source code yourself, it's actually opaque in reality.

Real World Examples

Let's cover some real world examples of all those attacks I just talked about. First, cheap and free. The Great Suspender is a Chrome extension that was developed on GitHub. It's a very popular one, for all the people out there watching this that have thousands of Chrome tabs running at a time. I can't handle that myself. I start to garbage collect on my own once I get past five or six. I've heard that if you have thousands of them open, it decreases your battery life. The Great Suspender was designed to solve that. It was able to suspend Chrome tabs in the background that you weren't using to save battery life and let your computer run more efficiently. This was maintained for free on GitHub by one maintainer, who eventually moved on. This maintainer sold the rights to this extension, as I think people do all the time with popular code, to a person who wanted to buy it. Immediately after, that person stopped publishing builds from GitHub, and actually inserted some malware into it. We had this popular package installed in browsers all over the world, and it was now compromised. The person walked in and bought it for pretty cheap, because the maintainer wasn't doing this as a full time job. This can keep happening. When you look at larger, more sophisticated attackers, the cost to buy the rights to small packages like this is pennies. It's nothing. These attacks are happening and they're going to keep happening because these products aren't supported by large companies with large budgets.

Bad people on the internet. This one's a little bit tricky, a little bit different, but it did show the possibility that the code can come from anywhere. This was all over the news recently, again. A research group at the University of Minnesota tried to show the feasibility of sneaking bad commits into the Linux kernel. This resulted in a pretty controversial paper draft that was withdrawn. The Linux community didn't like being experimented on, but it does show that this is possible. The Linux community acknowledged it was possible. Their review process actually caught all of these, but less thorough communities might not catch some of these malicious commits being snuck in. Don't do this as an experiment, but do take away that this is happening and can happen to the open source you use.

Easy to use. Here's another great one, in February of 2021, the popular Dependency Confusion attack. You can just type a command and install something on your computer. A lot of companies run internal mirrors to try to create a choke point where all the open source coming in, is first funneled through these mirrors. These clients, the command line clients do such a good job of trying to install things that if it wasn't found in the mirror you tried, it would fall back and try the public mirror. An attacker was able to trick a bunch of these clients into downloading external packages rather than the internal ones they thought they were. Luckily, this was a researcher and not an attacker. They were able to get packages and arbitrary code execution in a lot of large companies, and were hopefully rewarded with a very large bounty.
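
The fallback behavior described above can be sketched as follows. This is a simplified model under stated assumptions, not how any particular client is implemented: the two dictionaries stand in for a private mirror and a public registry, and the `acme-` prefix, package names, and versions are hypothetical.

```python
# Hypothetical in-memory stand-ins for a private mirror and a public registry.
INTERNAL_INDEX = {"acme-billing": "1.4.0"}
# "acme-auth" here models a package an outsider published under an
# internal-looking name that the private mirror does not (yet) contain.
PUBLIC_INDEX = {"requests": "2.31.0", "acme-auth": "99.0.0"}

# Hypothetical naming convention: these prefixes are reserved for internal code.
INTERNAL_PREFIXES = ("acme-",)


def resolve_with_fallback(name):
    """Helpful-but-risky lookup: try the mirror, then fall back to public."""
    if name in INTERNAL_INDEX:
        return ("internal", INTERNAL_INDEX[name])
    if name in PUBLIC_INDEX:
        return ("public", PUBLIC_INDEX[name])
    raise LookupError(name)


def resolve_strict(name):
    """Stricter lookup: reserved names are never fetched from the public index."""
    if name.startswith(INTERNAL_PREFIXES):
        if name in INTERNAL_INDEX:
            return ("internal", INTERNAL_INDEX[name])
        raise LookupError(f"{name} is reserved for the internal index")
    if name in PUBLIC_INDEX:
        return ("public", PUBLIC_INDEX[name])
    raise LookupError(name)


if __name__ == "__main__":
    print(resolve_with_fallback("acme-auth"))  # silently served from public
    try:
        resolve_strict("acme-auth")
    except LookupError as err:
        print("refused:", err)
```

The design choice in the strict version is to fail loudly: a missing internal package becomes a build error someone has to look at, rather than a silent download from somewhere else.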

Transparency. This was another great example. People look at open source code, they think it's fine. Then it gets bundled up into a package manager like RubyGems, or PyPI, or distro package managers. What's in the package is not necessarily the same thing that was in the GitHub repository. This is an example of a RubyGem that added some malicious code as part of the install script, to set up cryptominers. There are a whole bunch of these packages that keep getting found. They keep getting discovered by automated tools and security researchers, and taken down. Back in December, there were a couple hundred packages that were mining crypto on people's laptops after they were installed. This keeps happening. If you look back through history, you see a ton of these crypto examples. In a lot of cases, crypto mining is a blessing in disguise, because it's much better than having a ransomware attack carried out on your computer or data exfiltrated. You just have to look for cryptominers.
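
One hedged illustration of closing the gap between "the source I looked at" and "the package I installed" is to record a digest of an artifact you trust (for example, one you built from source yourself) and compare every later download against it. The helper names below are hypothetical; only `hashlib` and `hmac` from the Python standard library are used.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """True only if the artifact bytes match the digest recorded earlier."""
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(sha256_hex(data), expected_sha256.lower())


if __name__ == "__main__":
    trusted_build = b"package contents built from reviewed source"
    recorded = sha256_hex(trusted_build)

    downloaded = b"package contents built from reviewed source"
    tampered = downloaded + b"\nrun_install_script()"

    print(verify_artifact(downloaded, recorded))
    print(verify_artifact(tampered, recorded))
```

This only proves the bytes are the ones you recorded; it says nothing about whether those bytes were safe in the first place.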

These were some big attacks. There have been some huge ones recently too. Another important thing to note is that these are just in the last six months. These aren't stretching back very far. Lots of examples here of attacks, everything ranging from language package managers to operating systems, because of some unique properties of open source.

Problem 2: It's Software

First, it was all unique stuff about open source. The second problem here is that even if you factor out all the bad things, even if you factor out all the malicious attackers inserting bugs at every step in the supply chain, the second problem is that it's still software. It's written by humans. All software has bugs, really. The software I write has bugs, whether it's open source or closed source. The software you write has bugs. We can find them together over time, but it does take that time and it takes people looking. We're never going to find them all. Some of these bugs can cause security issues. That's the problem. We haven't found them all. There's too much open source out there to do that. A lot of these bugs with security issues can get exploited. We can't treat open source completely differently, we have to treat it just like the rest of software.

Examples of Bugs

These ones are probably more well-known than the supply chain attacks I talked about. These ones get fancy logos. They get press releases. A lot of them get CVE disclosures, and they get headlines. We've got a bunch of examples here. Heartbleed was one of the most popular ones that woke people up to the whole issue of maintainers not being paid, not being supported, and not as many people working on these critical packages as we thought. Heartbleed was a bug in one of the most popular, most commonly used crypto libraries, OpenSSL, that was securing and ensuring privacy across much of the internet at the time. After this happened, it was fixed. People realized there just weren't many people working on OpenSSL even though everyone was using it. This ranges all the way up the stack too.

This other example here, is all of the JWT parsing libraries that are written in open source. These aren't audited. These aren't reviewed. This is really tricky code to get right, yet huge companies rely on them all the time. This website, how many days since the last alg=none JWT vulnerability, keeps track of how long it's been since the last time one of these bugs has caused a CVE. This has been happening for years now, but the counter keeps resetting. These bugs are everywhere.

Zero Days

Everybody knows about the big, flashy Zero Days. It's hard for humans to plan around those. You don't know when the next Heartbleed is going to be. We don't know when the next cryptographic advance is going to render all of our encryption obsolete. We really need to talk about them, but it's hard to come up with plans for them. These are what people are probably worried about when they think about CVEs in open source software, because they get so much news and attention. My advice for these: have a plan in place to deal with them. Think about them at your company. Think about what you would do if there was a vulnerability in all the big, important packages that you use, at all the different layers of your stack. You can have fun with this. You can wargame out these scenarios. You can write playbooks. There's not really much else you can do. Think about how you would recover from them. You're going to be caught by surprise though. The whole industry is going to be caught by surprise by this. We're all going to have to fix them together. You can get ahead by thinking about it ahead of time, but you can't really prevent or plan for every single one. These are black swan events.

The ones I like to tell people to worry about more though, especially when you're trying to make resilient supply chains, are known bugs. These are the boring mechanical fixes. This is updating, years later, and fixing all of your systems, for the flashy ones we just talked about and even some of the boring ones. You have to fix every single one of your services that's exposed to the internet, constantly, all the time. You can't lose your vigilance, otherwise an attacker can get in. Supply chains are huge, complicated, and vast things. These bugs are going to be present in your supply chain. You need to prevent as many of them as you can. You need to put processes in place to prevent these things. They're not as exciting, but someone at your company needs to do it. The easiest way to strengthen your supply chain here is rewarding and automating this work. Reduce the toil. Make it easy. You do not want to get caught by these. These are the embarrassing ones, when you forgot to update a piece of software that's had a known CVE in it for five years, and an attacker or a script finds it and uses that to get a foothold and pivot around inside of your company. You need to have a plan for the first news of Zero Days. You need to make sure you don't get caught by the old known bugs. These are what you should be worried about.
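
Automating the boring check can start very small. The sketch below assumes you have a list of pinned dependency versions and some feed of advisories that says which version fixed each issue; the advisory entries, package names, and the simple dotted-integer version parsing are all hypothetical simplifications (real version schemes and vulnerability databases are richer than this).

```python
def parse_version(version):
    """Parse a simple dotted version like '2.3.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical advisory feed: each entry names the first fixed version.
ADVISORIES = [
    {"id": "EXAMPLE-0001", "package": "examplelib", "fixed_in": "2.3.1"},
    {"id": "EXAMPLE-0002", "package": "otherlib", "fixed_in": "1.0.5"},
]


def find_known_vulnerable(pinned, advisories=ADVISORIES):
    """Return (package, installed_version, advisory_id) for every stale pin."""
    findings = []
    for advisory in advisories:
        installed = pinned.get(advisory["package"])
        if installed is None:
            continue  # we do not use this package at all
        if parse_version(installed) < parse_version(advisory["fixed_in"]):
            findings.append((advisory["package"], installed, advisory["id"]))
    return findings


if __name__ == "__main__":
    pinned = {"examplelib": "2.3.0", "otherlib": "1.2.0"}
    for finding in find_known_vulnerable(pinned):
        print("needs update:", finding)
```

Run on a schedule against every service, a check like this turns "we forgot for five years" into a ticket that shows up the day the advisory does.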

Part 2: Resiliency - What Do We Do?

That's a good topic to transition into the second part here of the talk, resiliency. The first part of the talk was designed to wake you up a little bit, show all the different possibilities, all the ways things can go wrong in your supply chain, because they're so complicated. What does resiliency mean here? What do we need to do? This is a perfect metaphor here. This is a rubber band chain. Supply chains are as strong as their weakest link. If you pull on a chain too much it'll snap. When you're trying to build a resilient supply chain, one that doesn't snap completely, then build one that can bend. You want to build one that's resilient, and can come back from failure, and can come back from being stretched and exercised. These bugs are going to happen. We're using so much open source software that vulnerabilities are going to happen. You're going to have CVEs in the code you're running, no matter how hard you try to keep them out. Short of rewriting every single line of software completely bug free, with 100% test coverage, audited every single day by professionals, with formal proofs for every algorithm, you're going to get bugs in your code in production that come in through your supply chain, no matter how hard you try. I'm not telling you not to try to prevent these things. I'm just telling you to plan for bugs, plan for security incidents. Have a plan to recover from them. That's what separates the companies that do well and the companies that don't handle these incidents correctly.

Overall Scope of the Problem

Open source is everywhere. Supply chain attacks are everywhere, really. You need to pay attention. You need to wake up to these. You need to plan and figure out how to recover from them when they do happen. Open source is growing. It's not going anywhere, and supply chain attacks are happening more and more. Supply chain attacks aren't a new concept, but they are rising. That's because we've done such a good job locking down all the other ways in. We're now using encryption everywhere. Developers are now turning on two-factor authentication, with strong passwords. We've locked down the rest of the doors. Supply chain attacks are now the easiest way in for people. They haven't really gotten easier. Open source has spread, so there are more ways attackers can come into your supply chain, and all the other doors are getting locked and tightened up enough that this is becoming a more attractive option. Open source is still increasing in spite of all of this. Even if you think you've locked everything down today, a year from now, it might not be the case when your open source usage has doubled, and your supply chain has gone up exponentially from there.

Takeaways and Tips

The first philosophy, when you're thinking about how to make a resilient supply chain, is to stop an attack from getting worse after it happens. You're going to have a bad day. Somebody is going to lose a password. Somebody is going to leave a laptop on an airplane. You want to prevent that bad day from becoming a bad year. What does that mean? You want to prevent something from spreading. You want to detect it early. You want to prevent an attacker from pivoting. You want to do some basic hygiene, to make it so that a single compromised instance can't be turned into a long term compromise and pivoted out to the rest of your company. These are basic things. Nothing here should be a surprise. A lot of people don't think about their build pipelines, their CI/CD systems, as production environments. You need to treat them that way. Just like you have audit logging in your databases with your billing information, put audit logging on your CI/CD systems. Use ephemeral environments, so if somebody does break in, or somebody does get an SSH connection into your machine, it's recycled. It's killed. You get a new one, starting fresh. Stop persistent threats and persistent attacks by not reusing anything, rebuilding everything from scratch constantly. The same thing with your credentials. They're going to get leaked. Scope your credentials. Time bound them. Don't use long-lived bearer tokens that are going to be a nightmare to clean up years later, when they leak. Scope everything as small as possible, principle of least privilege. Basic production hygiene, just think about it and apply it to your build pipeline. This prevents these incidents, which are going to happen and are happening constantly to you at a big enough company, from turning into a bad year.
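
To make "scope your credentials, time bound them" concrete, here is a minimal sketch of a token that carries a single scope and an expiry, signed with an HMAC key. It is an illustration under stated assumptions, not a recommendation to build your own token format: the function names, the `build:read` scope string, and the payload layout are hypothetical, and in practice you would lean on the short-lived credential mechanism your CI/CD or cloud platform already provides.

```python
import base64
import hashlib
import hmac
import json
import time


def issue_token(key: bytes, scope: str, ttl_seconds: int, now=None) -> str:
    """Issue a token valid for one scope and for ttl_seconds only."""
    now = time.time() if now is None else now
    payload = json.dumps({"scope": scope, "exp": now + ttl_seconds},
                         sort_keys=True).encode()
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "." +
            base64.urlsafe_b64encode(signature).decode())


def check_token(key: bytes, token: str, required_scope: str, now=None) -> bool:
    """Accept the token only if signature, scope, and expiry all check out."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False  # forged or signed with a different key
    claims = json.loads(payload)
    return claims["scope"] == required_scope and now < claims["exp"]


if __name__ == "__main__":
    key = b"demo-key-not-for-real-use"
    token = issue_token(key, "build:read", ttl_seconds=60, now=1000)
    print(check_token(key, token, "build:read", now=1030))   # within lifetime
    print(check_token(key, token, "build:read", now=1061))   # expired
    print(check_token(key, token, "deploy:write", now=1030)) # wrong scope
```

The point of the sketch is the blast radius: if this token leaks, it is useful for one scope and for at most a minute, which is a very different cleanup job from a bearer token that is valid forever.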

A secret is defined as something you tell only one person at a time. Secrets aren't secret. Secrets that you keep to yourself forever are boring. They're going to leak. They're going to get out. The takeaway I like from this: anybody that pretends they can keep long term secrets forever is lying to themselves. What does that mean? You will lose them. Every single YubiKey is going to get lost. Every single signing key that you use for your software is going to get leaked or compromised. If you plan for these events to never happen, when they do happen, you're in for a world of hurt. Figure out how you're going to revoke credentials. Figure out how you're going to rotate keys. Treat your keys like databases. Everybody knows the saying, backing up the database isn't what's important. It's the ability to restore when you have to. Secrets only count if you can revoke and rotate them. Figure out what you're going to do for every single secret you create. How long it's going to last. How you're going to get a new one. If you have a secret that's valid for a month, and it gets lost, and you don't detect it, that's a problem only for that month. If you have a secret that's good forever, and it gets lost, that's a much different situation.
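
One way to act on "figure out what you're going to do for every single secret you create" is to keep an inventory and audit it. The sketch below is hypothetical throughout: the `SecretRecord` fields, the example secret names, and the idea of storing a runbook reference next to each secret are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SecretRecord:
    name: str
    created: datetime
    max_age: timedelta
    # Where the rotation steps are written down; empty means nobody has
    # worked out how to replace this secret yet.
    rotation_runbook: str


def audit_secrets(records, now):
    """Return (name, problem) pairs for secrets with no plan or past max age."""
    problems = []
    for record in records:
        if not record.rotation_runbook:
            problems.append((record.name, "no rotation plan"))
        if now - record.created > record.max_age:
            problems.append((record.name, "overdue for rotation"))
    return problems


if __name__ == "__main__":
    now = datetime(2021, 6, 1)
    inventory = [
        SecretRecord("ci-deploy-token", datetime(2021, 5, 20),
                     timedelta(days=30), "runbooks/ci-deploy-token"),
        SecretRecord("release-signing-key", datetime(2019, 1, 1),
                     timedelta(days=365), ""),
    ]
    for name, problem in audit_secrets(inventory, now):
        print(name, "->", problem)
```

Like the database backup saying, the runbook field only counts if someone has actually exercised it; the audit just makes the gaps visible.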

When it comes to delivery pipelines, there's even a whole field of research and publication about this. The Update Framework (TUF) is designed entirely around the principle of resilience and recovering from key compromise, recovering from key loss, particularly for signing and delivering updates to packages, important package managers, and things like this. It's completely designed around this concept. Instead of only trying to prevent secrets from being leaked, which it still does, it's designed around how to recover from leaks gracefully.

Finally, the last takeaway, hope is not a strategy. It's a very common saying in SRE. If you pretend your database is never going to go down, that's fine until it does, when you're rushing, scrambling trying to figure out how to get it back up. The same applies in security. Just trying to prevent incidents without having a response plan is naive. It's going to happen no matter how good you are at preventing things.

All the stuff I've been talking about: playbooks, wargames, simulations, exercises, brainstorm what to do. Don't just brainstorm how to stop things, but brainstorm what to do after they do happen and how to recover. What's the worst case scenario for you right now? What's the most important secret at your company? What would happen if it gets lost or leaked or compromised? How would you recover from it? You can do this with your team. You can do this in one day. You can do this regularly, one day a month. We do it regularly at Google. We do it in open source projects I work on. Don't just try to prevent compromises, have a plan for what to do when they do happen.

Summary

Supply chain attacks are coming. Open source is particularly at risk. It's in everyone's supply chains. It's easy to get into because of the distributed nature of the work. Don't give up on hardening. The message here is not, stop trying to lock down your systems. The message is to expect compromises to happen and have a plan for them. That's the entire point of resiliency: coming back from failure, coming back from compromises, coming back from attacks. That's the important thing to take away.

Questions and Answers

Rettori: The first thing that comes to my mind is, I'm scared. Is there any hope? Should I move to a different industry?

Lorenc: Software was a mistake, I think that's what people keep saying.

Rettori: What hope is there for us? We're seeing malicious people who are really just exploiting open source for their own benefit, and to all of our detriment. What hope is there for us in this?

Lorenc: There's nothing really new here, it's just a different place people are attacking. Attackers will always move around and find the easiest way in to do their jobs. People are attacking open source now. They're doing supply chain attacks in general now, because it's the easiest way in. We've finally done such a good job as an industry hardening all the other things and fixing up all the other table stakes. Stuff like HTTPS Everywhere is only really new in the last couple years. Strong passwords, two-factor auth. Stealing somebody's password from one of those, Have I Been Pwned dumps, which was way easier before, now we've finally hardened all of that. Doing these open source supply chain attacks is getting easier, relatively, compared to all these other mechanisms. I don't think there's anything too much to panic about. We're always as an industry reacting to what threats are out there and coming up with solutions. Sometimes we're a little bit slower than others. I think now we've seen enough of these big ones that we'll come together and figure out how to do this quickly.

Rettori: Then, if I look at overall supply chain, which is like all of the pieces that participate in the process of building my software, I think I often see that we do not take the chain or the mechanics that ships software with the same level of seriousness, that we treat the app itself. What are your thoughts on this?

Lorenc: The easiest thing is just saying it, like you just said. I've been guilty of it before. How many people are running Jenkins instances on Mac minis or something under their desks at work? You'd never run an actual data center app or a server or a database that way, but for some reason, we thought it was ok to do that with our build systems before. It's not. It's just like all those movies and TV shows where somebody's trying to break in or break out of a bank vault, or a prison, or something like that, and they hide in the food delivery cart. You've got to treat your build system the same way. Otherwise it's like building a giant fence and then leaving the door wide open. Just acknowledging it, thinking about it, taking it seriously, I think is really the biggest step.

Rettori: I've seen companies where even the security classification, or the availability classification of the build system and of the supply chain is classified internally as lower than the app itself. That doesn't make much sense. How do you treat something that builds what you ship with a lower level of security?

Lorenc: It needs to be at least as secure as the environment you're deploying it into.

Rettori: Provide some tips on how to protect from these attacks, and what to do if they actually happen.

Lorenc: How to protect from these attacks? First one, we already know how to harden systems. We know how to monitor for intrusion. We know how to put things behind firewalls. Just do that with your build servers, like we were just talking about. Lock down your source repositories. If you're using GitHub, set up branch protection and all that stuff you know you should do but you might not be monitoring and checking.

What to do if they actually do happen. This is step one, and I see it as like the Maslow's hierarchy of needs in security, for stuff like this. Audit log everything. Log all of your builds. Log all the information. Log all the hashes of all of your tools. Log all of the versions. Log all of that stuff somewhere safe, so that if something bad does happen, you can actually search all of those logs. Figure out how affected you are. You don't want to be in a position of just finding out there is some Zero Day or something, and you have no idea how many builds, how many environments are affected by it. Just make sure you have the data somewhere you can search if you need to. That's the biggest way to recover from this.
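
A minimal sketch of that kind of searchable build log, assuming one JSON record per build written as a line of text. The record layout and the function names are hypothetical; in a real pipeline the lines would go to append-only storage outside the build machine rather than an in-memory list.

```python
import hashlib
import json


def build_record(build_id, tool_blobs, dependency_versions):
    """Describe one build: hashes of the tools used and dependency versions."""
    return {
        "build_id": build_id,
        "tools": {name: hashlib.sha256(blob).hexdigest()
                  for name, blob in tool_blobs.items()},
        "dependencies": dict(dependency_versions),
    }


def append_record(log_lines, record):
    """Append the record as one JSON line (stand-in for durable log storage)."""
    log_lines.append(json.dumps(record, sort_keys=True))


def builds_using(log_lines, package, version):
    """Answer 'which builds pulled in this exact dependency version?'"""
    hits = []
    for line in log_lines:
        record = json.loads(line)
        if record["dependencies"].get(package) == version:
            hits.append(record["build_id"])
    return hits


if __name__ == "__main__":
    log = []
    append_record(log, build_record(
        "build-101", {"compiler": b"compiler binary bytes"},
        {"examplelib": "2.3.0"}))
    append_record(log, build_record(
        "build-102", {"compiler": b"compiler binary bytes"},
        {"examplelib": "2.3.1"}))
    # A new advisory lands for examplelib 2.3.0: which builds are affected?
    print(builds_using(log, "examplelib", "2.3.0"))
```

The same records answer the tool question too: if a compiler or base image turns out to be compromised, its hash can be searched for in the `tools` field.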

Rettori: Open source software and the software itself, should they be treated as two different things?

Lorenc: Yes, two risk points. It's open source and it's software, is how I broke it down. I don't want the recommendation here to be, start using proprietary software. That doesn't really help unless you know and trust the people that are writing it. Just because you've had to pay somebody for it at a company doesn't mean it's better than the open source equivalent. Start paying attention to what you're using first. Don't just npm install anything and pull in 500 other dependencies. Be thoughtful about it, and figure out what you're using first. Then every time somebody wants to add something new, at least have a conversation about it. Make sure you take a quick look at it. That'll help you catch most of the obvious ones like typosquatting, or dependency confusion, a lot of these attacks where people just never even looked at the name of the thing they installed. Then for software itself, test your dependencies, just like you would test your actual application. Make sure that all those code paths are followed. Make sure you're using safe things. If you can use memory safe languages, that'll help out a lot, too. You don't have to worry about a whole class of attacks.
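The "look at the name of the thing you installed" check can be partly automated. The sketch below is one possible approach, not a tool mentioned in the talk: it compares a requested package name against a list of names the team already trusts, using the standard library's `difflib.SequenceMatcher`, and flags names that are close but not identical. The function name `find_lookalikes`, the trusted list, and the `0.85` threshold are all assumptions made for illustration.

```python
import difflib


def find_lookalikes(requested, trusted_names, threshold=0.85):
    """Return trusted names that `requested` closely resembles but does not equal.

    An empty result means either an exact match with a trusted name or no
    suspicious resemblance. The threshold is an arbitrary starting point.
    """
    requested = requested.lower()
    matches = []
    for name in trusted_names:
        known = name.lower()
        if known == requested:
            return []  # exact match with a trusted package: nothing to flag
        ratio = difflib.SequenceMatcher(None, requested, known).ratio()
        if ratio >= threshold:
            matches.append(name)
    return matches


# Hypothetical usage: "reqeusts" is a transposition of the trusted "requests".
trusted = ["requests", "numpy"]
suspicious = find_lookalikes("reqeusts", trusted)
```

A similarity check like this only catches near-miss names. It does nothing for dependency confusion, where the name is exactly right but comes from the wrong registry, so it complements rather than replaces a human review.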

Rettori: Tools and ideas for scanning vulnerabilities in the dependencies themselves.

Lorenc: There's a whole bunch of these. It really depends on what language and what framework, and all that stuff you're using. Some free ones for open source are Snyk, Trivy. There's a ton of them. It really depends on whether you're scanning Docker containers or scanning something like a package.json in a repo. GitHub even has a bunch built in if your code is there. Turn on Dependabot too, that's another great tip, if you're in a language or framework or ecosystem that supports it on GitHub. That monitors for not just vulnerabilities, but for new versions in general. It'll automatically send you a pull request that you can just merge to update. You want to make sure you're updating quickly and frequently, even if you're not updating to fix a vulnerability. Updating in small increments, like 1.1 to 1.2 to 1.3, is way easier than going from 1.1 straight up to 1.9 when you have to in a rush because there's a security vulnerability. Stay as close to updated as you can all the time.

Rettori: Any language ecosystem, which is particularly strong or weak, in terms of supply chain?

Lorenc: It really depends on the angle. Things that are in unsafe languages just have that as a huge disadvantage. It doesn't matter how great your package manager or scanner is. In C++, you have to worry about things that you don't have to worry about at all in something like Go or Rust. That's one way to compare. All the package managers today I think are relatively close in terms of the feature sets that they need: being able to actually understand what version of each dependency you have, being able to lock all of that, being able to make sure that it can't change. That wasn't the case maybe two years ago. I think I still see stuff in Python, where people tend to pin their dependencies a little bit less. You can do it. I wouldn't really hold it against Python, but there are always cases where people are using the less than or greater than operators in their requirements.txt, and getting surprised when things update and break them. I think most of the tooling at least supports all the basic table stakes stuff you need.
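To make the requirements.txt point concrete, here is a small Python sketch that flags lines not pinned to a single exact version. It is a deliberately simplified check written for this transcript, not a full parser of the requirements file format: the function name `unpinned_requirements` is hypothetical, and it only treats a plain `name==version` entry (with no wildcard and no extra operators) as pinned.

```python
def unpinned_requirements(lines):
    """Return requirement lines that are not pinned to one exact version.

    Simplified on purpose: comments, blank lines, and pip options such as
    "-r other.txt" are skipped, and environment markers after ";" are ignored.
    """
    flagged = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue
        requirement = line.split(";", 1)[0].strip()
        name, sep, version = requirement.partition("==")
        pinned = (
            sep == "=="
            and version != ""
            and "*" not in version  # "==4.*" still floats
            and not any(op in name for op in "<>!~,")  # no extra constraints
            and not any(op in version for op in "<>!~,=")
        )
        if not pinned:
            flagged.append(line)
    return flagged


# Hypothetical requirements.txt contents, used only for illustration.
sample = [
    "requests==2.31.0",
    "flask>=2.0",
    "numpy",
    "django==4.*",
    "# a comment",
    "urllib3<2  # capped, but not pinned",
]
loose = unpinned_requirements(sample)
```

Running a check like this in CI would surface the greater-than and less-than specifiers before an unexpected upgrade does.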

Rettori: There's potentially a risk in updating the dependencies. What are the ways to reduce that risk further down the dependency chain?

Lorenc: That's really what tests are for. I'm talking to engineers who get bored or sick of writing tests or feel like it's a chore. Don't think of it as writing tests to catch your own bugs. Sometimes you can shift and think of it as writing tests to prevent other people from breaking your code. Your dependencies are just like somebody else on your team who comes in and changes stuff. Tests are the biggest thing. Then do it quickly in small increments wherever you can. Update just one thing at a time, as often as possible, by automating it with Dependabot. If you do break something because one package bumped a minor version, it's usually a lot easier to narrow down and trace and fix all the conflicts, and update everything, than if you're updating 100 dependencies across 50 versions each. Just small increments, continuously.

Rettori: We talked a little bit about the dependency attack on the popular proxy. I think by now, maybe you can talk a little bit about that. The attack on Codecov.

Lorenc: Codecov is a popular testing tool to tell you your test coverage. They got hit with a supply chain attack. It wasn't really open source software people were using; this was part of a service. Their infrastructure got compromised, and it was used by probably thousands of open source projects on GitHub. All those projects now need to treat any secrets that they were using in their CI as compromised. If you were relying on one part of the supply chain to be secure, now you've got to trace this through the whole rest of the graph. Everybody using that service had to rotate secrets and do full audits. It's scary stuff.

Rettori: I've seen some companies that are starting to build everything from source, like anything open source that they use, they actually have it in-house first. They build everything from source, and then publish to their internal repos. It's not cheap to do this by any means, to build every single dependency from source. What are your thoughts on this? Could it be an alternative for this?

Lorenc: Yes, I don't think it has to be binary. I think there are a lot of benefits to it. Google's famous for doing a lot of that with our monorepo. We pull stuff in and check it into this special third-party directory and redo all the build systems, in most cases, to build using our internal stuff. It has a lot of benefits, but it also has a lot of problems where you may not be building things the same way the open source community does. You run into your own bugs and your own maintenance headaches, if you have to apply patches to get them to build the right way. I would encourage you to think about it, at least, if there's something critical to your application. You might want to be contributing back to it anyway. Then figuring out how to build that and distribute it yourself from source can have a bunch of benefits on top of just supply chain. It will make it easier for you to apply custom patches if you have to, or make it easier for you to maintain your own feature set, if you can't get some changes upstream. It also makes you familiar with the development process and makes it easier for you to send patches upstream to fix bugs you find on top of that. There are a bunch of benefits. I don't think you have to look at it as a big boil-the-ocean effort, where you either take everything from dependency managers or compile everything from scratch. You can look at it one dependency at a time and figure out where it would make sense to rebuild stuff on your own.

Rettori: Application architects love to look at the security of the app, and sometimes forget about the security of everything else that shapes our apps. I think the resiliency of a system is about that: what are all of the pieces in the system of your app?

 


 

Recorded at:

Jan 28, 2022
