
Keep Calm and Secure Your CI/CD Pipeline


Summary

Sonya Moisset describes how GitHub Marketplace helped Pride in London automate and improve their workflow with different tools for accessibility, code coverage, code review, code quality, security, and other functionality such as alerting with Slack. She talks about what OWASP is and how to improve the workflow for open source projects using GitHub Marketplace applications.

Bio

Sonya Moisset works as a Lead Security Engineer at Photobox Group. She is a Tech Advocate and a public speaker in the UK tech scene. She is also a mentor for women in tech, a cybersecurity writer for FreeCodeCamp publications and an active member of the tech community in London. Her motto is #GetSecure, #BeSecure & #StaySecure

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Moisset: Welcome to Keep Calm and Secure your CI/CD pipeline. I'm Sonya. I wear dark hoodies, so I'm a legit security engineer, obviously. I want to start by introducing what cybersecurity is and why it is important. Cybersecurity is the practice of protecting computers, networks, programs, and data from unauthorized access and from attacks aimed at exploitation. In 2016, there was a series of DDoS attacks that disrupted a lot of web services, including GitHub, PayPal, and Twitter. You could see something like that at GitHub, the pink unicorn.

Cybersecurity Attacks

Are some of you familiar with Shodan? Shodan is like the Google of IoT. In this case, I've simply put webcam in the search field. This is what I could get. It's open, directly connected to the internet. Obviously, I've redacted the IP of the webcam. What happens during a DDoS? Basically, the attacker will go to this website and harvest devices as bots, because they're using default credentials, and then launch an attack against servers and exhaust their resources. Another example of a cybersecurity attack is the WannaCry ransomware. Usually, an attacker will send a phishing email. The user will click on the email. In the background, the malware will be downloaded and will encrypt the data. Usually, the attacker wants you to pay in Bitcoin. This is the consequence of the WannaCry ransomware. 130 countries were hit.

Sextortion is another example of a cybersecurity attack. Basically, the little blur at the top is the password. The attacker will tell you I've put a malware on your machine and I can record everything you're doing, all the naughty things you're doing. If you don't want me to send it to your family, pay that Bitcoin at that Bitcoin address. Also, what happened, Ashley Madison was a victim of a data breach. Now there's a second piece of that one, it's actually the attacker taking advantage of that data breach to say, I've actually got your profile from Ashley Madison and you should actually pay some Bitcoin. Otherwise, I'll just tell your family.

Obviously, it's a scam. What I would recommend is to take this Bitcoin address and have a look at the wallet. You will see that it's a scam, because no one is sending money to those wallets. Also, a good way to see if one of your email addresses has been breached is to use a website like Have I Been Pwned. Obviously, instead of I love QCon London, you put your email address, and you will see where your account has been compromised. This is just an example of some of the companies that had a data breach. Almost all the companies that we know had one. Also, if you love analytics, this is another view of all those data breaches.

How Data Breaches Occur

Usually, an attacker will either do a little bit of social engineering to get information, or they will look for weaknesses in the infrastructure. Once they find a weakness, they will infiltrate the network or your application and exfiltrate the data. Then they dump it where they can sell it, because data is something that sells.

Example Web App Attacks

The first one is on the Donald Trump website. The Trump camp did something stupid. One of their developers embedded a piece of code on their website. This was the piece of code that the developer embedded. This tag was in the source code. If you looked at the source code, you could see it. It was pulling a script directly from the repository of a developer called Igor Escobar. That's Igor Escobar. Basically, with open source, you can submit a PR and, if it gets merged, inject arbitrary JavaScript code into Donald Trump's website. In that case, this was the website. If you were checking the source code, you could see the script. If the file on GitHub gets modified, the change is reflected on the website within 30 seconds. The repository was about enhancing the UX of submitting a form. This was the JS Mask plugin. If anyone gets a PR merged, the code will be injected into Trump's website. You could modify the DOM. You could challenge the visitor to install software. You could redirect them. What if you redirected the user to Clinton's website? If you open your dev tools, obviously it's not going to be reflected on the live site. You can do the test on Trump's website and put in this little piece of code. Obviously, it will redirect to Hillary Clinton's website. What if you click on edit and raise a PR? The researcher demoed this. Obviously, he didn't raise the PR. This is how easy it is to raise a PR on an open source project. The good news is the security hole was fixed in less than three hours. It was just bad publicity for the Trump campaign.

Another attack is the crypto miner incident. This one hit several websites, including the ICO, United States Courts, the General Medical Council, and Manchester City Council, and also sites in Australia. Basically, more than 4000 websites were hit by this attack. Actually, it wasn't the websites themselves. It was a script that they were embedding in their source code. The script that was tampered with is from texthelp. Texthelp are doing assistive technology, which is good. It's for accessibility. One of their products is called browsealoud. That was the one that was tampered with. It's a one, two, three step. You just copy and paste this script, and you put it within your codebase. Now we're back to the Trump scenario, except it's real in this case. Let's take this one, the ba.js. The obfuscated bit at the top is the crypto miner. What happened? Basically, the attacker managed to get access to where they were storing that file. The file gets distributed from their CDN, and now every single website that embeds it has the crypto miner in its pages. Obviously, there's an impact for the user, because the miner slows down their machine.

How to Social Engineer Developers

Another example is how to social engineer developers. This one is interesting. In 2018, there was a package called flatmap-stream within the event-stream package. Basically, that malicious package has been downloaded 8 million times. You can imagine a lot of web applications had it. The event-stream is basically a toolkit for JavaScript to create and manage streams. It was authored by Dominic Tarr. This is Dominic Tarr. Basically, when they started working on this package, one of the users, Devious, asked if a flatmap functionality would be welcome. Actually, at the end of the issue, Dominic Tarr is saying if you publish a flatmap module, and then make a PR, I will merge it. This is actually a good sign for the attacker.

One developer actually created this package in 2018. Then the attacker called RIGHT9CTRL, approached them, and he said, "Let me do it. I'll just merge the fixes, don't bother with this." Dominic trusted RIGHT9CTRL, and gave him full access to the npm registry. At first, RIGHT9CTRL just pushed cosmetic changes just to build up the trust. Then he pushed the malicious code. Obviously, after the malicious code was pushed, a lot of users raised issues saying there was some slowdown on Nodemon and other dependencies. People started seeing that something was going on. The target was a Bitcoin wallet. That's the target, Copay. Now the website is down. It has been attacked. The strategy from the attackers was to wait for the right opportunity to be built within the Copay app. They've actually succeeded within this range of version. Since that incident, the repository for event-stream has been archived, as you can see at the top. RIGHT9CTRL's profile on GitHub has been pulled down. This is usually the type of message that you could see, for people that have to upgrade their packages.

What Else Can Go Wrong?

What else can go wrong when you push credentials on GitHub? GitHub is housing thousands of publicly accessible keys, tokens, and passwords. You can just do a quick search for API keys, and you'll see there's loads appearing in the search. That's just an example of API keys. When developers realize that they've pushed the keys, they try to change it, but obviously it's too late because it's version control. They will replace the key with Xs, or they'll just delete it, or they will just put empty strings. Still, the issue is the same: the API keys are still in the history on GitHub. You could find keys for all of those services. That was great. Examples with Google Maps, with Stripe production. Yes, great findings.
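As a rough illustration of what secret-scanning tools look for, here is a minimal Python sketch. The three patterns are commonly cited key formats and are assumptions made for illustration, not an exhaustive or authoritative rule set; `SECRET_PATTERNS` and `find_secrets` are hypothetical names.

```python
import re

# Illustrative patterns only (assumed formats); real scanners ship far
# larger, carefully tuned rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "stripe_live_secret_key": re.compile(r"sk_live_[0-9A-Za-z]{24,}"),
}


def find_secrets(text):
    """Return a sorted list of (line_number, pattern_name) for every match."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return sorted(findings)
```

Because Git keeps history, a match in any past commit means the key should be revoked and rotated, not just removed from the latest version of the file.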

What Is Web Application Security?

Web app security is a branch of infosec that deals specifically with the security of websites, web apps, and web services. You can also implement web app security within the software development lifecycle. The software development lifecycle is just a framework that defines, from start to end, the process used by the organization to build an application. These are usually the steps that you follow, starting from planning. You have the requirements. You prototype and design the product. Then you implement the code. You test. Then you deploy, and you maintain it. You can inject security at each of those steps, whether it is penetration testing, code review, threat modeling sessions, or architecture analysis. If you do this, security is a continuous concern, and awareness of security considerations is shared by all the stakeholders. There's early detection of flaws. Obviously, there's a cost reduction, because you don't want to fix the issues at the end of the pipeline.

How to Get Started

How do I get started? OWASP is a good start. Who is familiar with OWASP? OWASP stands for Open Web Application Security Project. They have a new website that they've pushed recently. They also release a lot of documentation, not only on web applications but also on mobile, IoT, and APIs. Everything is free. Some of the most well-known documents are the OWASP Top 10 and the OWASP Proactive Controls. The OWASP Proactive Controls is a list of security techniques that should be included in every software development project. This is the list of the Top 10 Proactive Controls. The first one is to define security requirements. To help you, you can use another OWASP document called the Application Security Verification Standard, which also has a checklist that covers authentication, sessions, APIs, and so on. You might be familiar with user stories, but why not also use abuse cases? This could be an example of an abuse case. If you are working on an authentication feature: as an attacker, I have access to hundreds of millions of valid usernames. Obviously, you don't want that in your application. You can work around those abuse cases. Another control is to leverage security frameworks and libraries. Basically, you don't need to reinvent the wheel; use secure coding libraries and software frameworks. A good tip when you use those packages is to see how many contributors they have, how many commits, and when they were last pushed to. Also, for example, for the JavaScript ecosystem on the npm registry, look at how many weekly downloads they have, to see if it's actually a popular package.

Is Your Application Vulnerable?

The other well-known OWASP document is the Top 10. They usually release one every four or five years. That's the latest one, 2017. We're going to focus on A9, using components with known vulnerabilities. How do you know if your application is vulnerable? Who knows all the versions of all their components? I'm not talking about the main ones like React. Your application can be vulnerable if you don't know the versions of all your components, whether client-side or server-side, if you don't scan for vulnerabilities, or if you don't fix or upgrade vulnerable components.

How to Prevent It

How can you prevent it? I'll give a list of tools to actually help and support on that. One obvious thing can be to remove the unused dependencies, just do a little bit of inventory for both client-side and server-side. I'll give some tools that'll actually give this because obviously you don't want to manually do the inventory. Also, a good one is to obtain components from official sources. Be careful when you install a dependency to avoid typos. For example, if you do an npm install, React [inaudible 00:14:12] might be a malicious package. Who knows?
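One way to picture the typo risk is to compare a requested package name against a list of names you actually trust. The sketch below is a minimal Python illustration using plain Levenshtein edit distance; the trusted list, the threshold of 2, and the function names are assumptions for illustration, not how any particular registry or tool works.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def possible_typosquats(requested, trusted_names, max_distance=2):
    """Trusted names that the requested name nearly, but not exactly, matches."""
    return sorted(
        name for name in trusted_names
        if 0 < edit_distance(requested, name) <= max_distance
    )
```

For example, `possible_typosquats("raect", ["react", "redux", "lodash"])` flags `react`, while an exact match such as `"react"` returns an empty list.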

How to Secure Your Open Source Project

I'd like to present the Pride in London open source project. I'll show you the different tools that we're using within GitHub. The perfect CI/CD pipeline doesn't exist. Pride in London is an open source project with around 20 developers. We're doing the website and the app. We're using all of these tools and platforms. We're also using tools from the GitHub marketplace, because they're usually free for open source, which is good. This is just an example of the security tools you can have a play with. We've tried a lot and decommissioned a lot. I'll go through the challenges that we faced with some of them. This is another example of all the GitHub apps that we've been using. Usually, what I would recommend when you work on open source is to give these apps access to only selected repos, rather than giving them access to all your repos.

Gatsby, Contentful, and Netlify

We're using Gatsby to code in React. We're using TypeScript. That's the framework we're using for coding. We're fetching the data from Contentful, which is our CMS, through GraphQL queries. This is just an example of where we put all our content model. We build our schema on Contentful. It's where we fetch the data. We have a lot of webhooks to drive automation through CircleCI, Gatsby Cloud, Slack, Netlify, all of those. This is just an example of one of the webhooks. Usually, when you publish or unpublish, it will trigger those webhooks. You also have audit logs for those. We're using Gatsby Cloud for the builds. For every PR branch, it will build your website. It's quite detailed on the steps that it follows. If you're curious to know what's going on behind the scenes, Gatsby Cloud is quite thorough on that. We're hosting on Netlify. We have production deploys and preview deploys on it. There are also thorough logs around it.

CircleCI

For the continuous integration, we're using CircleCI. Basically, for each time you're pushing code for a PR or a commit, it will trigger a CircleCI build. We've reached almost 9000 builds for open source projects, which is quite amazing. We have a couple of steps for CircleCI scripts. This is just to give you an idea around the dependency, if it's safe or not. It has to fetch it, running the linter, and running the test.

Codecov

We're using Codecov for the code coverage. All of these tools are free for open source projects. You can use all of them. It will go through each of the PRs and commits, calculate the delta, and let you know if there's an increase or decrease in the test coverage of your code. They will inject it into the PR on GitHub, so you have this visualization, which is quite neat, on your PR page. Obviously, you can run all of your tests. They also have an HTML format. If you don't find it easy to navigate a grid, you can just use this HTML webpage.

Codacy

We're using Codacy for the code quality. At first, it did the job. I turned on all the rules on Codacy and tried to fine-tune them for our project. Then it became flat. As we were moving to TypeScript, the codebase became a hybrid application that Codacy couldn't cover properly. It was giving us a lot of false positives. We had to decommission the tool for the moment. We might come back to it after we have the whole codebase in TypeScript, but for now, it doesn't make sense.

CodeFactor

We're using another tool called CodeFactor, which is basically doing the same. Going through the PR and the commit, and let you know if there's things that you can improve around code quality. They will also give you some remediations and advice on how to write quality code.

DeepScan

Another good one, specifically for JavaScript and TypeScript, is DeepScan. Same principle: it goes through PRs and commits, and lets you know if there's any improvement that you can make. The good thing is that it's quite precise: it will tell you exactly where the issue is, not just the block of lines. The explanations are quite good on that one.

LGTM

LGTM is one tool that has been acquired by GitHub recently. Also, free for open source. Same principle, going through the PR and your commit. It will give you delta, with also explanations around that. They will also link to the official documentation. In that case, because we're doing a React project, at the bottom, you can see a link to actually the bit that you can improve. In that case, it was state and lifecycle for React. You can also compare against other open source projects to see how good you're doing or how bad you're doing. Another one is just to size your pull requests. It will actually put the little label when you raise the PR if it's a small, or a large, or XL.

Datree

For compliance, we're using Datree. What is good with this tool is that, for open source, these are areas that you might otherwise skip: having code owners, ensuring that you have a proper gitignore file, doing a little bit of cleanup with levers and [inaudible 00:20:44] within your project, and not pushing keys within your codebase. It will flag all of those. If it's not compliant, it'll be red.

GuardRails

Another security tool that we're using is GuardRails. This one will scan your codebase and let you know if there are any API tokens or credentials that you might have accidentally pushed to your codebase. It's the same: it will scan every PR and every commit, and let you know if there's something that you can fix.

Sonatype DepShield

For the packages, we are using several tools. Sonatype DepShield is one of them. It raises issues on the repository. It's quite thorough. For someone who is not used to security, it can be quite overwhelming sometimes, because they're checking against the CVSS score. If you're not familiar with it, you can have a look at how they calculate the score for a vulnerability.
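To make the score less intimidating: a CVSS base score is a number from 0.0 to 10.0, and the CVSS v3.x specification maps it to a qualitative rating. The Python sketch below encodes that published rating scale; the function name is hypothetical, and the talk does not say which CVSS version DepShield reports, so treating it as v3.x is an assumption.

```python
def cvss_v3_severity(score):
    """Map a CVSS v3.x base score (0.0 to 10.0) to its qualitative rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 to 3.9
    if score < 7.0:
        return "Medium"    # 4.0 to 6.9
    if score < 9.0:
        return "High"      # 7.0 to 8.9
    return "Critical"      # 9.0 to 10.0
```

A team could use a mapping like this to decide, for instance, that only High and Critical findings fail the build while lower ratings just raise an issue.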

Dependabot

I would recommend using Dependabot. Dependabot has been acquired by GitHub, so it's natively integrated. That's pretty good. You can either configure it through their online platform, or you can use a config YAML file if you want to target specific branches or specific languages. Dependabot will automatically raise a PR telling you which version you need to bump to. If you have an integration with Slack, it will feed through to your Slack. The good thing is the compatibility score. Basically, they compare against other open source projects that made the same update, and the score goes up or down depending on whether it broke their pipelines. It gives you a little bit more confidence. To have Dependabot on GitHub, you need to enable the dependency graph. The dependency graph allows GitHub to scan your dependency package files and let you know if there are any vulnerable packages. It's not only for JavaScript; they cover a lot more languages. It's enabled by default on public repos. If you want to enable it on private repos, you need to go to your repo's Insights, then Dependency graph, and allow access. Then you can use Dependabot after that. Also, GitHub will send you regular emails, depending on your settings. You'll get security alert digests when there's any vulnerability in your codebase.
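The core idea of scanning dependency files can be sketched in a few lines of Python. This is not how Dependabot or the dependency graph is implemented; it assumes a lockfile shaped like npm's package-lock.json with a nested `dependencies` map, matches exact versions only, and uses a tiny hard-coded advisory list (the two versions publicly reported in the event-stream incident) where a real tool would query a vulnerability database. All names are hypothetical.

```python
import json

# Illustrative advisory list (assumption): a real scanner pulls this from a
# vulnerability database and understands version ranges, not exact pins.
KNOWN_BAD = {
    ("event-stream", "3.3.6"),
    ("flatmap-stream", "0.1.1"),
}


def walk_dependencies(dependencies, path=()):
    """Yield (name, version, path) for every package, including nested ones."""
    for name, info in dependencies.items():
        full_path = path + (name,)
        yield name, info.get("version", ""), full_path
        yield from walk_dependencies(info.get("dependencies", {}), full_path)


def find_flagged(lockfile_text, advisories=KNOWN_BAD):
    """Return 'parent > child@version' strings for packages on the advisory list."""
    lock = json.loads(lockfile_text)
    return [
        " > ".join(path) + "@" + version
        for name, version, path in walk_dependencies(lock.get("dependencies", {}))
        if (name, version) in advisories
    ]
```

Reporting the full path matters because, as with event-stream, the flagged package is often a transitive dependency that never appears in your own package.json.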

Snyk

Obviously, there's also Snyk, which is free for open source. Also, a really good tool for scanning your third-party dependencies. They do have an online platform, where they will let you know when they have a vulnerability, what is wrong, and how we can remediate this vulnerability. There's a Slack integration as well. You can fix also through the CLI. They have a CLI wizard. They can help you create those Snyk policies to improve your packages. How does it work? It actually works through webhooks. With Slack, you can do integration through webhooks. Usually, this platform, they have one step integration with Slack, and you just have to put the URL so you link the tool with Slack. Then you can actually start feeding specific channels around that. Also, obviously, they will send you emails to let you know if there's any vulnerability.
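For a sense of what sits behind a webhook integration: a Slack incoming webhook is a URL that accepts an HTTP POST with a small JSON body. The Python sketch below builds such a request with the standard library. The function names are hypothetical and any URL passed in would be a placeholder of your own; this is a minimal sketch, not the integration code any of these tools ship.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

The webhook URL is itself a credential: anyone who has it can post to the channel, so it belongs in a secret store rather than in the repository.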

Rollbar

For monitoring, this is specific for JavaScript. I think there are other languages. Once your website is live, it's also good to see what errors we could get. Rollbar is taking top 10, top 5 errors, and lets you know with graphs, what are the most common errors that you could get on your website? You don't have to go to the dev tools to check against it.

How It Works on GitHub

How does it work on GitHub? I've shown you a lot of tools and you might wonder, how does it show on GitHub? Basically when you raise a PR, Codecov is actually integrated within the PR. If you scroll to the bottom, you will see all of those checks. It's like a pipeline with all your tools. You also have the other typical checks. This will depend if the tool is natively integrated through GitHub. You'll also have more information around that tool. Just to give you an idea for the compliance for LGTM. For CodeFactor, they'll give you more information. You don't need to go on their platform. Everything is centralized on GitHub.

What we've done at Pride in London is have a specific Slack channel where we could actually feed all of those results. When we have some deployment from CircleCI or a deployment on Netlify, we'll have results from RollBar, or GuardRails, or Snyk. We have specific channels for each of them. Obviously, we have the tech, GitHub which is the one that actually feeds all of the steps. Actually, in real-time, we let you know if there's one failing or in process. It's quite good. You can stay in Slack and have this monitoring and follow your process. Obviously, because it is open source, all of those platforms have these little tags and badges that you can actually put in your README file. It's also good when you're doing open source to just showcase that your code has some standards, and people can come and contribute to that.

We've also been using the GitHub board. We've been implementing all of those labels around epics and categories. We have different epics, the same as you could find on Trello or Jira. We've worked through epics and stories. We're using the Automated Kanban. It's linked with your PRs. Once you raise a PR, the card will automatically move to the right column. When it's in code review, and when you merge and close, it will automatically move through the columns. It's quite cool to see it. Also, it's a good way to have proper project management when you're doing open source: for the different maintainers that you have, you can assign tasks and have the project management that you'd find with Jira or Trello.

Also, in terms of security, there is basic hygiene on GitHub. By default for Pride in London, we require 2FA for all of the developers. It doesn't matter if it's open source and if they're maintainers, they have to have 2FA enabled. All of the base permissions are set to none. We've created separate teams. Each maintainer is assigned to one of the teams. Then they will have read or write access, depending on which project they are working on. The master branch is protected, obviously, because we don't want people to be able to delete the branch. We also require two or three reviewers for a PR. Because we've added all of these tools, we're making all of their checks required, so a failing check makes the build fail.

Features - Content Security Policy and Subresource Integrity

I wanted to introduce a feature that could prevent the cases that I introduced earlier with Trump and the crypto miner. Who is familiar with the Content Security Policy? The CSP is a header that acts as an added layer of security. It will help you mitigate injection attacks such as cross-site scripting. Basically, you can whitelist assets per page. If an asset is not on this list, the browser won't load it. You have very good documentation around the CSP from Mozilla. There's also contentsecuritypolicy.org. They have a specific website where they give you the list of directives and the values that you can assign to them. Obviously, there are different types of assets that you can use on your website, whether images, fonts, media, scripts, or frames. You have all of the values. They will also give you examples of CSPs. Usually, what I would recommend is not to enforce the CSP straight away. Turn on report-only mode at first, so you have time to fine-tune your CSP, because you will get a lot of errors. You can use Report URI to digest those errors and fine-tune your CSP. Then, when you're confident, you can push it live.
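A CSP is just a header whose value is a semicolon-separated list of directives, each followed by its allowed sources. The Python sketch below assembles one and defaults to the report-only header name, matching the advice to start in report-only mode. The function names and any example hosts are hypothetical; the header names and directive names are the standard ones.

```python
def build_csp(directives):
    """Join {directive: [sources]} into a single CSP header value."""
    return "; ".join(
        name + " " + " ".join(sources) for name, sources in directives.items()
    )


def csp_header(directives, report_only=True):
    """Return (header_name, header_value); report-only mode logs but does not block."""
    name = (
        "Content-Security-Policy-Report-Only"
        if report_only
        else "Content-Security-Policy"
    )
    return name, build_csp(directives)
```

With a policy such as `{"default-src": ["'self'"], "script-src": ["'self'", "https://cdn.example.com"]}`, a script pulled from any other origin would be reported (or blocked, once enforced), which is the situation in the Trump website example.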

Who is familiar with subresource integrity? Usually, when we talk about CSP, we also talk about subresource integrity. That's another security feature that enables the browser to verify that a resource hasn't been tampered with when it's fetched. Basically, when you're copying a script, whether it is jQuery, Bootstrap, or any other script that you're using, they offer this other option with the SRI. The SRI is the integrity checksum that you have at the end of the tag, and it proves that the file is the one the official source published and that it hasn't been tampered with. If your company is producing scripts, you can generate this SRI and append it to your own script tag. It's supported by most of the recent browsers. There is also good documentation on Mozilla on that.
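The integrity value is the name of a hash algorithm, a hyphen, and the base64-encoded digest of the exact file contents. A minimal Python sketch of generating it, assuming you have the script bytes at hand (the function names and any URL are hypothetical):

```python
import base64
import hashlib

ALLOWED_ALGORITHMS = ("sha256", "sha384", "sha512")


def sri_hash(content, algorithm="sha384"):
    """Return an SRI value such as 'sha384-<base64 digest>' for the given bytes."""
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError("SRI supports sha256, sha384, and sha512")
    digest = hashlib.new(algorithm, content).digest()
    return algorithm + "-" + base64.b64encode(digest).decode("ascii")


def script_tag(src, content):
    """Build a script tag that asks the browser to verify the fetched file."""
    return '<script src="{}" integrity="{}" crossorigin="anonymous"></script>'.format(
        src, sri_hash(content)
    )
```

If even one byte of the hosted file changes, as happened with the tampered browsealoud script, the digest no longer matches and a supporting browser refuses to run it.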

More Tools - Online Scan

Just to finish on more security tools that you could use, or just to improve the quality or the performance of your website that we're also using at Pride in London. Who is familiar with webhint? It was called Sonarwhal before. Basically, you would just put the URL of your website and it will scan your website and tell you areas where you can improve. Should it be accessibility, progressive web app. Not only security, but you can improve on different areas on your website. The good thing is the explanations are quite thorough. They will give you remediation on that. It's free. It's good.

I think you might also know Lighthouse from Google. Basically, how you can trigger a scan. You just open the dev console. You go to the audit tab. You can launch a scan, should it be on mobile or desktop, or one of those categories. It will also give you areas where you can improve. Another one, PageSpeed Insights. This is more on the performance side. Another one called Uptrends. Also, they can give you diagrams and visualization of where you can improve for your website. In terms of security, there is a good one like Qualys around certificates, if you're not sure for your open source project. Also, Security Headers, if you want to check the headers that you've actually implemented within your project.

Key Takeaways

Open source can be a vector for large-scale cyber-attacks, as I've shown with the Trump website and the crypto miner. You could see the impact on 4000 websites or 8 million downloads. The good thing is, if you're hosting on GitHub, you can leverage the applications that are available on the GitHub marketplace. You can start by creating a small pipeline. Obviously, you don't have to use all of those tools to have a good pipeline. You can also harden your GitHub security with 2FA, and by creating teams and allocating your maintainers and collaborators to those teams. Experiment: all of those tools are free for open source, so just experiment. For Pride in London, we've experimented with a lot of them. We've also decommissioned quite a few. For example, Codacy didn't do the job once we moved to TypeScript. We had to move to another tool, but we might come back to it later on. Also, don't push your keys on GitHub. That's an obvious one.

Resources

I would recommend The State of Open Source Security Report from Snyk, as a good read. Also, the blog posts from the Snyk website. They're doing really amazing pieces, also around the event-stream events. Another good one, the report from Sonatype also around the software supply chain. Troy Hunt, who's an Australian security researcher also writes a lot around CSP, SRI, and all of those tampered scripts. Good resources that you can have a look at. If you're interested in security, I wrote a piece on Medium where I gather podcasts, YouTube resources, also around OWASP. If you're interested in security but you still want to stay in development, you can have a look. We'll just finish with my motto, "Get secure, be secure, and stay secure."

Fuzzing Tests

Moderator: Do you use fuzzers, like fuzzing tests? Because, in our case, as we do a lot of cryptography we really rely on those automated fuzzer tests, just an automated software that tries different input types of data, like random data. Do you use them as well?

Moisset: Yes. I do some penetration testing on the website. I'm using tools like OWASP ZAP. It is a fuzzing tool. Yes, we do that exercise. It depends on the feature that we're delivering. Usually, just to cover that security side, we do Pen tests, internally.

Questions and Answers

Moderator: Do you have some nightly tests? For example, long integration tests that you run every night because they're too long.

Moisset: For now, we only have unit tests. We don't have the end-to-end integration tests. This is something that we want to explore. For the time being, we only have unit tests.

Participant 1: In terms of web application security, do you think the wide adoption of WebAssembly poses a security challenge? We have all these malicious JavaScript, how do I audit the malicious WebAssembly now?

Moisset: I don't know if at the moment on the market, we have tools that might scan, so probably rely on that at first. We could also do security code review on the codebase.

Participant 1: Do you think we need something more for WebAssembly, or the evolution of current tools and approaches should be enough?

Moisset: Probably. Yes. It's the same as for any web app. You would also do some piece of education for the developers so it's not only relying on the tool, but it's also having that piece of education. It's more like shifting left for them, so it's more around doing threat modeling sessions, or doing some code review, or architecture design and see how they will implement the code. You have both the tools and the education.

Participant 2: Is there a tool for scanning the artifact that you already deploy, when there is a new CVE that came up after the pipeline is finished?

Moisset: Yes. I think Snyk is doing it.

Participant 2: It's an asynchronous, continuous check of the dependency in your application?

Moisset: Yes, it will actually go through your package, dependency file, and will let you know if there's any new vulnerabilities. That's continuous.

Moderator: I believe that Snyk is one of these companies and there are more like that, like Vital Software. Sometimes you can rely on GitHub itself. It will send you all these alerts. There is an industry of scanning and open source tools. It's already large. We have a lot of things to choose from.


Recorded at:

Sep 23, 2020

Sonya Moisset
