
Shifting Left with Cloud Native CI/CD



Christie Wilson describes what to expect from CI/CD in 2019, and how Tekton is helping bring that to as many tools as possible, such as Jenkins X and Prow. Wilson talks about Tekton itself and performs a live demo that shows how cloud native CI/CD can help surface, debug, and fix mistakes faster.


Christie Wilson is a software engineer at Google, currently leading the Tekton project. Over the past decade, she has worked in the mobile, financial and video game industries. Prior to working at Google she led a team of software developers to build load testing tools for AAA video game titles, and founded the Vancouver chapter of PyLadies.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.


Wilson: I'm Christie Wilson, I am a software engineer and I lead the Tekton project at Google. Throughout my career, I've worked on a lot of different stuff. I've worked on AAA games for a really long time, I worked on mobile, I've worked on foreign currency exchange, but somehow no matter what I work on, I always end up gravitating towards working on CI/CD. I think it's because it's everywhere, everybody needs it, but there's also always so much opportunity to make it better and I think that's particularly true with cloud native technologies today.

What I want to talk to you about is how cloud native CI/CD can make it easier for all of us to make mistakes. Sometimes, as engineers, it feels like we're not supposed to make any mistakes. We even have special terms for the engineers who don't make mistakes: we call them rock stars, we call them ninjas, we call them heroes. I remember once discussing a hiring decision with my manager and he said, "We were looking for a golden unicorn." I still don't know how to interview for a golden unicorn. But who invented Flamin' Hot Cheetos? It was a janitor. You never know who's going to have the really good ideas, or who's going to make the biggest changes, or who has the potential to but can't because we've told them that they can't make any mistakes.

It turns out that you can't succeed without failing. Every success is built on all of the mistakes and failures that came before it. If you think about how science works, every successful scientific theory was built on the top of a whole bunch of disproven theories that came before it. If you want to succeed, you have to be able to fail.

What I want to convince you of today is that in 2019, we really need to raise the bar on how our CI/CD systems help make it cheap and easy for us to make mistakes and fail, and how they can make it easier for us to deal with all of the growing complexity that comes with cloud native.

This is an overview of where we're going today. We're going to start by laying the groundwork so we're all on the same page about cloud native. Then, we're going to talk about what cloud native CI/CD should be. From there, we'll talk a little bit about shifting left and what that means in this context. Then, I'm going to talk about the open source project that I work on, Tekton, and give you a little demo and finish up with what's next for Tekton.

What Is Cloud Native?

I'm going to start with some broad definitions. I'm not going to go into a lot of detail, but first, what is cloud native? If you ask five different people to define it, I think they're all going to say something slightly different. This is an attempt to break down the CNCF's, the Cloud Native Computing Foundation's, definition of cloud native: it's microservices running in containers, and the containers are dynamically orchestrated to optimize resource utilization. For a lot of us, what this ends up meaning is that we are running containers on Kubernetes. That is not the only way to be cloud native, it's just how a lot of us are doing it and it's the context that I'm going to be assuming for this talk.

You package your software as images and then you use Kubernetes to manage the execution of the images. Even if you're not familiar with those terms, just hang on, because I'm going to give you some very broad definitions that can help you at least maybe not use them tomorrow, but at least understand what I'm talking about in the rest of this.

What is an image? What's a container? An image is a binary with all of the dependencies that it needs to run. When you run the image, you call it a container. Containers all run on the same operating system kernel, but they run as isolated processes. That means you can do something like docker run your image and it just runs, where the equivalent without images and containers is that you have to install all the dependencies first and then maybe you have to use something like Python to actually invoke the thing you're trying to run.

Then, what do you do when you have a lot of images? That's where Kubernetes comes in. Kubernetes lets you orchestrate running a whole bunch of different containers and then even better than that, it also lets you not worry about the hardware that they're running on. You can say something like, "This image needs this much memory," and Kubernetes will just run it and you don't have to really know where it ends up. The non-cloud native equivalent of this is actually like a lot of different things. It could be an entire team of people who know how your data center works, they know where the configuration lives, they know how to configure your pod, there's maybe a bunch of Wikis that explain how everything works, all of that bundled together is what Kubernetes gives you.

Now, I'm going to go over a few Kubernetes things that you might hear me mention. Two of the most common Kubernetes abstractions you'll encounter are node and pod. A node is the machine that you actually run stuff on; it could be a physical machine or it could be a VM. Then, a pod is something that doesn't really exist outside of Kubernetes. It doesn't directly map to anything, it's just logically a bunch of stuff you want to run together or a bunch of containers. They all run on the same node, though, so they all have access to the same disk, so pods run on nodes.
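To make "pods run on nodes, and the containers in a pod share a disk" concrete, here is a minimal sketch of a pod spec. All names and image choices are invented for illustration; the fields are standard core Kubernetes API, but treat this as a sketch rather than anything from the talk.

```yaml
# Hypothetical pod: two containers scheduled together on one node,
# sharing the same disk through an emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: app
      image: nginx:1.17            # made-up image choice
      volumeMounts:
        - name: shared
          mountPath: /data
    - name: sidecar
      image: busybox:1.31
      command: ["sh", "-c", "tail -f /data/app.log"]  # reads what 'app' writes
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      emptyDir: {}                 # scratch space that lives as long as the pod
```

Both containers see the same files under /data, which is exactly the "same node, same disk" property the pod abstraction gives you.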

Lastly, I want to give you a heads up, but I'm going to mention YAML a lot. You may know what YAML is already, but if you haven't worked with it, you might wonder why I'm going to mention it so many times. If you have used cloud native, you'll know that YAML is the configuration language for cloud native software, I don't know why. Everybody loves to hate it but anyways, there's a lot of it.

Something you might get a sense of already is that there is a lot going on here and it's fairly complicated. It's more complicated than what we were doing before. To try to prove that to you, I want to tell you about the first software system that I ever worked on professionally. Everything was written in Perl, it used the model-view-controller framework, which I remember being intimidated by at first, and it connected to two servers. There was a user server that held data about users and there was a MySQL database that I could actually run on my own machine.

I don't even know how releasing that software worked, that was not part of my life at all. What was part of my life was connecting to live production servers to try to debug problems, which was scary. I remember once somebody sitting next to me had a cold and he dropped a Kleenex box on his keyboard and he dropped a table in production. That's what he said happened, I think he just made a mistake. It was a time when things were scary, they were risky, but they weren't really that complicated.

Now, what do cloud native applications look like? One application could be made of a whole bunch of different microservices, all those microservices might be written in different languages. To deploy them to a system like Kubernetes, you have to understand how things like pods work and how to configure them. Then, when you start growing all this configuration, you start applying templating to it to try to make it simpler, then you need the services to actually be able to find and discover and connect to each other and it just keeps going. This means that you could start off trying to make a web server and suddenly before you know it, you're deploying and managing Kubernetes, Knative, Istio, Helm, Spinnaker. It just keeps going and it is a lot to keep up with and it's constantly changing.

What Is Cloud Native CI/CD?

That might be a little bit of a downer but what I want to do is look at why that complexity is all worth it and I think one reason is because it can give us cloud native CI/CD. First, when we say CI/CD, what do we mean? I was thinking for a long time about what it actually means, Continuous Integration and Continuous Delivery. I think it's fairly easy to intuitively guess what continuous delivery is because you're delivering something, it's like how you publish your images or your binaries or you deploy to production.

What does it mean to actually integrate continuously? It's literally about taking code changes and putting them together frequently so you make sure that they work. It's the opposite of the time in university when I was working with another student on a project and we both had one half of the project and then we waited until 8:00 p.m. the night before it was due to put the two halves together. They did not work, that was a bad night.

These days, the term continuous integration has grown beyond that to also include the fact that we are applying automated testing along the way to make sure things actually work.

It's a critical part of your software supply chain. You could even view it as maybe the conveyor belt that moves all of your artifacts through the supply chain, building them, testing them, deploying them, and then ultimately getting them safely into production.

Is that what cloud native CI/CD looks like? Yes, but we just talked about how complicated cloud native is, so does that mean that it's more complicated? Maybe, but it is worth it because of what it can give us. This is what cloud native CI/CD should look like. It should be serverless, it should be built on open specifications and standards, the stuff that we build with those standards should be reusable, we should be able to run it wherever we want to, and we should treat it with the same software best practices that we treat the rest of our code.

Let's break that down a little bit, starting with what it means for it to be serverless. This is another term that has a lot of interpretations. When I'm saying serverless in this context, I'm talking about the fact that you can run your workloads and scale them up and down without having to be too involved in the process.

If you're running on prem, then when you want to run something, you have to know that there's an actual physical machine behind it that you're going to run it on. When you use the cloud, you don't have to worry about that and you can request resources when you need them. You shouldn't have to worry about what operating system is running on that machine, what version of the kernel is installed. Serverless here means that all of that is taken care of for you, which has some pretty dramatic implications for CI because if you have been using any of the CI systems that have existed up until this point, you know that a lot of them have two main ways that they're designed.

Either they have the extra complexity of having a master and a whole bunch of worker nodes that the master has to manage and distribute work to and communicate with. Or, more likely, there's just one master and that's where everything is executed, which means that if you have a bunch of teams all executing stuff on that master, they can actually interfere with each other, jobs can cause other jobs to fail, they can starve each other. With cloud native CI, if it's serverless, then we can avoid all of that and we can just scale things up and down as we need them and run them in isolation.

Why open specifications and standards? In my opinion, Kubernetes is pretty cool on its own, but what I really like about it is all of the standards that it defines. If I'm deploying to a Kubernetes-based system, then I know that a pod is a pod. I know that if a container is going to run, it's going to be inside a pod. I know what a pod looks like, I know what attributes it has, and all the Kubernetes configuration is declarative and extensible. This means that if you build a system on top of Kubernetes, then you don't actually have to give me an explicit plugin mechanism, because Kubernetes itself does. Kubernetes provides ways for me to hook into the lifecycle of a pod and actually make changes to it before it runs, so it's this platform of infinite extensibility without you as the platform developer having to worry about it. This is how systems like Istio work and even better, if you're building on Kubernetes, then I can use all of my favorite Kubernetes tools to interact with it.

That brings us to the next thing that cloud native CI/CD should give us, which is reusable components. If we do the same thing as Kubernetes and we build on standards, then we can start sharing and reusing what we're building. This means that we shouldn't have to keep writing the same Slack notification plugin over and over again, we should be able to just write it once and then everybody can use it. Then, everybody can focus on this stuff that actually gives their company business value. If we build our CI systems like this, this means we should be able to mix and match pieces of them and we should never have to get locked into any particular CI vendor.

Now, let's finally settle the question of how to have parity between our development and our test systems and production. If you're using Kubernetes, a pod is a pod is a pod. If you can deploy to production Kubernetes, then there should be some way that you can get a hold of the configuration that was used for that and with a few tweaks, you can run it yourself on your own cluster. For the first couple years of one of my jobs, everybody on my team developed against the same instance of one of our key services. It ran on my manager's VM because he was the only person who ever managed to get it running a couple of years before that, because no matter how much time the rest of us spent going through the Wiki page that was supposed to describe how to set it up, it just never worked.

This is a different problem now. It's a matter of, "How do I get a copy of the YAML configuration and which values can I change safely without losing anything?" Compared to what it used to be, which was, "What version of what operating system should be installed? Did I install the right version of this package?" All of that. This is one of the things that makes all the extra complexity of Kubernetes really worth it. Sure, it is really painful to actually have to write all this configuration but once you do, it will work and then you can use it again and again. Suddenly, in the images and the config, you have everything you need to make the same production cluster every time.

Speaking of writing everything down, let's treat our CI/CD configuration the same way that we treat the rest of our code. Maybe we don't like YAML, but when I work with systems that are configured using it and I want to know how they work, I can actually look at the configuration. I don't like it, but it's there. I don't have to attach myself to a running process and inspect the system calls to see what's actually going on, I can just look at the configuration.

As our systems grow more complicated, it's really important that everybody who's interacting with them be able to understand what they're doing and look at how they're configured. To bring this back to the idea that we started with, that we should make failure as easy as possible, when things do go wrong, it's really important that all the people involved be able to actually look at what's happening and debug it.

It turns out that debugging is all about learning. If you already knew how something worked, you wouldn't have to debug it, you would just know what the problem was. Debugging is an act of gaining new knowledge, it's a kind of learning. The better you are at learning, the more effective you can be and the faster you can deliver value as an engineer.

I really liked this tweet, "The most important skill to have as a programmer is the ability to teach yourself new things effectively and efficiently. You're going to be constantly growing and picking up new technologies. The ability to do that is more important than any individual tool or technology." I think what Ali Spittel is saying there is that the ability to teach yourself new stuff is maybe the best skill that you can have, especially as movements like cloud native make engineering more and more complicated. Chances are it's only going to get worse from here but if you can learn, if you can debug, then you can keep up.

How do we debug? We debug by looking at something, by reading it, by trying to understand it, by making little changes and tweaking it and seeing what happens. I once went to a lightning talk where the speaker advocated for being a trash panda, which if you're not familiar, is another name for a raccoon. The idea was that you can learn a lot by just digging through all of the data that's available to you. This is the main way that I learned when I started on a new project. Once I need to go beyond the documentation that's there, I start looking for the CI configuration, for the Docker files, for the scripts that actually run the thing because I want to see how it works.

What Is Shifting Left?

Those were the attributes of cloud native CI/CD. Why I'm so excited about this is because it makes it so that even though we have all this extra complexity, we can shift left. How many people feel familiar with what shift left means? Ok, not too many people. Who feels like they're actually doing it? A couple. If you're already sold on this, which is actually maybe not too many people, you can just feel really good during this next part. If this is new for you, then I am so excited to introduce you to it because this will save you actual money.

This is what software development used to look like and still looks like. You start with some requirements. Hopefully, a lot of the time, you don't even have the requirements. You design something, you implement it, then you test it. This could be the person who wrote the software testing it or a QA person or a QA department or some mix and then you deploy it to production.

You even see this a lot when you see how people are breaking down work for themselves into issues. You'll see an issue that's like, "Implement the thing," and then they write another issue, "Test it." There's some big problems with this and one of them is about how expensive it is to fix a problem depending on where you find it. It turns out it's way cheaper to fix problems if you find them before they get into production. This isn't even accounting for money you might lose because of the bug itself, this is just because you have to redo all the previous work. You have to get the thing out of production. Obviously, you didn't test something right so you have to fix that. You have to fix the implementation, maybe there's something wrong in the design and the requirements that you have to revisit, it's really expensive. If you find the problem while you're working on it, then it's cheap and fast to fix.

This is where shift left comes in. It's moving left in that whole workflow and testing earlier in the cycle. Ideally, you're even doing some form of testing before you write the code but definitely way before anything gets to production. Part of shifting left is changing the shape of what our software development workflow looks like. Suddenly, design, development, and testing are not quite as distinctive phases, they're all happening at once constantly. Shift left assumes that there will be problems, but the sooner you find them, the cheaper and easier you can fix them. If CI/CD helps us find failures and shift left says we should find them earlier, then I think that CI/CD should help you find failures earlier.

We started looking at cloud native and how it can be more complicated. What does this mean for shifting left? It means that if we don't have cloud native CI/CD, if you can't have CI/CD that's serverless, infrastructure agnostic, and config as code, then people are just going to give up and test in production. Or, they're going to create giant staging environments that everybody has to deploy to and test against before anything gets to production. This would be a huge step back for shifting left, so that's why we need cloud native CI/CD.

What is Tekton?

Now let's talk about a project that's all about making cloud native CI/CD happen. It's Tekton. This is the project that I work on. I've been working on it for the past year and I just get more and more excited about it all the time, not just because it has an awesome cat logo, but because it is cloud native CI/CD. All this stuff I was talking about up until this point, this is all at the heart of how we designed Tekton. Even better, it's not just open source. Earlier this year, we actually donated it to a new foundation called the Continuous Delivery Foundation, or CDF. The CDF is all about working in the open to take continuous delivery, and the continuous integration that comes before that, into the future. Tekton itself is being created with a bunch of different companies. We work with CloudBees, Red Hat, Salesforce, IBM, Puppet, and more. Ok, so what is Tekton?

Who is familiar with the term, "the porcelain versus the plumbing"? The idea is, if you were looking for a toilet and you just found the plumbing that was underneath it, then you'd be really sad, because you actually need the porcelain of the bowl and all of that user interface on top of it to use it. You need the plumbing to make it work, so you need both; you can't have just one of them.

In this case, Tekton is the plumbing, it is not the porcelain. If it's the plumbing, then which users is it good for? It's perfect for people who are building their own CI/CD systems. This could be people who are making CI/CD products like Jenkins X, or it could be teams inside companies who have to deal with whatever their company's particular CI/CD needs are and make that work. It's also great if you just want to have a really custom setup. In the future, we want to provide a catalogue of really high quality, well-tested tasks or components that you can use in your own CI/CD systems, but that's something that we're working on early next year so I would say that we're not quite there yet.

How does Tekton work? Let's do a quick overview. First, I need to introduce you to one more Kubernetes concept. This is something called a CRD, or Custom Resource Definition. Out of the box, Kubernetes comes with resources like the pods that we were discussing before, but it also lets you define your own. You define the specification for these resources and then you create a binary called a controller that makes them actually happen. Let's look at the CRDs that Tekton has. The most basic building block is called a Step. A Step is an image or a container; it's the image and the arguments and the environment variables and all the stuff that you need to make it run.

For our first new type, we created something called a Task. A Task puts together Containers or Steps and lets you execute them in order. So, a Task executes Steps in the order you declare them in, and they all run on the same node, so they all have access to the same disk. You can combine Tasks into another new type called the Pipeline. A Pipeline lets you put the Tasks in any order you want, so they can run sequentially, they can run concurrently, you can create really complicated graphs. They can run on different nodes, but you can have outputs from one Task that are passed as inputs to another Task.
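As a sketch of what a Task looks like, here is a hedged example. The task name, images, and commands are invented, and the field shapes follow Tekton's v1alpha1 API from around the time of this talk, so treat it as illustrative rather than canonical.

```yaml
# Hypothetical Task: two Steps that execute in order, on the same node,
# sharing the /workspace disk between them.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: test-and-build
spec:
  steps:
    - name: unit-test
      image: golang:1.13          # each Step is just an image plus args and env
      command: ["go", "test", "./..."]
    - name: build
      image: golang:1.13
      command: ["go", "build", "-o", "/workspace/app", "./..."]
```

Because both Steps land in the same pod, the build Step can read anything the test Step left on disk, which is the "same node, same disk" guarantee described here.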

Those two are the basic pieces, Tasks and Pipelines. You define those once and then you use them again and again. To actually invoke them, you create something we call Runs, so there are TaskRuns and PipelineRuns, which run Tasks and Pipelines. Then, at runtime, we provide them with data, which is our last new type, the PipelineResource. Ok, so there are five types. There are Tasks that are made of Steps, there are Pipelines that are made of Tasks, you actually run those with TaskRuns and PipelineRuns, and you provide them with data with PipelineResources.
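A sketch of how the define-once, run-many split looks in practice: a Pipeline that references a Task by name, and the PipelineRun that invokes it. All names here are invented, and the field shapes follow the v1alpha1 API as I understand it.

```yaml
# Hypothetical Pipeline: the reusable definition, referencing a Task by name.
apiVersion: tekton.dev/v1alpha1
kind: Pipeline
metadata:
  name: ci-pipeline
spec:
  tasks:
    - name: unit-test
      taskRef:
        name: test-and-build      # a Task assumed to be defined elsewhere in the cluster
---
# Hypothetical PipelineRun: the invocation. Creating this object is what
# actually kicks off execution; the Pipeline above just sits there until then.
apiVersion: tekton.dev/v1alpha1
kind: PipelineRun
metadata:
  name: ci-pipeline-run-1
spec:
  pipelineRef:
    name: ci-pipeline
```

The split is the point: the Pipeline and Task are committed once alongside your code, and each execution is just a new small Run object.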

As a quick aside, I'm really skimming over this, but one of the cool things about using PipelineResources to represent your data is that it gives you typing throughout your CI/CD system. An example of a PipelineResource might be an image that you've built, or it might be the Git source code at some particular commit. This gives us increased visibility because we can start looking at these artifacts as they move through the software supply chain.

The next thing you might ask is, "Ok, I get that, I see that if I want to run a Pipeline, I have to make a PipelineRun, but how do I do that? What if I want to say, every time someone opens a pull request against this Git repo, run a Pipeline?" That is where our newest project comes in. It's called Tekton Triggers and we just had our first release. I won't go into too much detail, but Tekton Triggers has a bunch of CRDs that let you express how to create a PipelineRun from an event, so you could do something like take a particular Git commit and then run a Pipeline against it.
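To give a rough sense of the shape of those Triggers CRDs, here is a sketch of an EventListener. All names are invented, and the Triggers API was very young at the time of this talk and its field names have shifted across releases, so this is illustrative only.

```yaml
# Hypothetical Tekton Triggers setup: an EventListener that, on an incoming
# webhook event (e.g. a GitHub pull request), stamps out a PipelineRun.
apiVersion: triggers.tekton.dev/v1alpha1
kind: EventListener
metadata:
  name: github-pr-listener
spec:
  triggers:
    - bindings:
        - name: github-pr-binding    # TriggerBinding: pulls fields like the commit SHA out of the event payload
      template:
        name: ci-pipeline-template   # TriggerTemplate: describes the PipelineRun to create from those fields
```

The listener runs as a service in the cluster; the binding extracts values from the event, and the template uses them to create a concrete PipelineRun, which is exactly the "create a PipelineRun from an event" idea described here.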

This is why that is all worth it, though. This is how Tekton provides cloud native CI/CD. Everything is serverless. Besides those controller binaries, nothing is actually running until you need it to run. The Tekton API is a set of specifications that any CI/CD system can conform to. Because we're building this all from containers, wherever you run a container, you can run Tekton. Lastly, with all those types that we looked at, these are reusable components that you can define and commit right alongside your code.


Speaking of config as code, I'm going to give you a quick demo of how that can work with Tekton and how we can use it to debug. This is pretty exciting. I think I spent most of the time preparing for this talk trying to get this demo to work, so fingers crossed. What we're going to do is we are going to take my beautiful catservice. This is my catservice; it has information about my cat. She's really old and she is grouchy. Let's say that I was a new contributor to the catservice. The code lives here in GitHub and let's say, as a new contributor, I don't know how the whole process works, but I just have some changes that I want to make, so I'm just going to go for it. This is the source code that I checked out, and I'm going to make some changes. When I'm changing it, I change the image.

I realized that there's a calculation happening. You're going to learn a lot about cats right now because I like cats quite a lot. It's converting a cat's age in human years to cat years and there's a really complicated equation that I'm using here and I feel like it's not right. This cat is 17 years old, that's got to be way older than 55 cat years. I don't really care, I'm going to delete all of this. I'm going to return something that I think is more reasonable, so every year is seven years.

I'm going to commit this and pick a branch. These are all my changes. I'm going to push my branch, and I'm like, "You definitely want my changes." I'm not really being the best contributor since I didn't really have a very descriptive commit message and also, I just wiped out some stuff. New branch, make_it_better. Open my pull request, there we go. I create my pull request and then, as soon as I create this, Tekton is going to kick off and start doing something, but I don't know what because I'm just a new contributor. What I can do is I can actually start poking around, and I know that this Tekton folder here has all the configuration for all the CI in it.

Let's say I'm looking at this and I can actually take a look at exactly the Tasks that are running and the Pipelines and I can even run them myself if I want to. I can take exactly these commands and I can run them against my own cluster, so I can apply these to my Tekton instance that's running in my Kubernetes cluster. Let's see, how is this going? Still running. "Configured, configured," ok, so I applied all that in my cluster. Ok, so the test failed, I'm not really sure why. One way that I can investigate that, though, is I can actually run that whole same Pipeline myself in my own cluster.

I applied all the configuration for it and I'm going to use the Tekton command line tool to start the Pipeline, and it's going to run against this particular commit that I made here. Time for some magic, copying and pasting. I need one of those PipelineResources that I mentioned, so I'm going to make a new one, it's going to be "my-branch." Then this is where my code lives, a little bit of that there, and what revision am I going to run it against. I'm going to run it against this revision.
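The resource being created interactively here corresponds roughly to YAML like the following. The repo URL is a placeholder, not the real demo repo, and the field shapes follow the v1alpha1 PipelineResource API.

```yaml
# Hypothetical git PipelineResource pointing at the branch pushed in the demo.
apiVersion: tekton.dev/v1alpha1
kind: PipelineResource
metadata:
  name: my-branch
spec:
  type: git                 # typed resource: the system knows this is source code
  params:
    - name: url
      value: https://github.com/example/catservice   # placeholder repo URL
    - name: revision
      value: make_it_better                          # branch or commit to check out
```

Because it's a typed resource rather than a loose string, the same Pipeline can be pointed at any branch or commit just by swapping in a different PipelineResource.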

Now, it wants this pullRequest resource, but it doesn't matter because we're not going to update the pull request. Ok, now it's running. What I mentioned earlier was that you should be able to use all the same tools that you use for your regular Kubernetes stuff against something that's built on Kubernetes, so let's take a look at how that could look. I'm following the logs up above, but meanwhile, I can still investigate this with Kubernetes: I can get pipelineruns, get this particular PipelineRun. There it is, I can start looking into the nitty-gritty of it and there is a pod underneath that's being run.

If you've used Kubernetes before to get logs, you have to get the logs from the pod, so I can do something like "logs" and that's the pod, but I need to know exactly what container I ran so I can grab the container and there's going to be some logs from that. Anyways, this shows that you can use the Kubernetes tool against it but meanwhile, we've got these other tools that are built at a higher level that make it easier to interact with. Actually, look at that, there's a test that's failing right here and I'm able to reproduce that and I can actually see exactly the command that's being run.

I'm thinking that maybe if I actually run this locally, I've got the same failure. It turns out that in cat_test.go, there are some tests that I didn't realize existed, so I can open that up and I can fix it. Let's see, human years: got 61, expected... yes, these are totally inaccurate, that's way older in human years. I can fix all of those and then I can fix that. Then we're going to get that over here and the checks are going to get kicked off again. Meanwhile, I could start investigating what was actually happening. I started this pipeline, I can go and look at the configuration for that pipeline, I can start seeing exactly what's happening.

There's a unit-test task, I could go look at the unit-test task. If I can stall longer, I could actually merge this as well. Then it'll kick off another pipeline, which will actually do a canary deployment with Istio. What's cool about that is that I don't need to know anything about Istio to make that happen, but if I did want to know about Istio, then I could actually start investigating the deployment pipeline and I could start seeing what it's doing and I can even use that as a way to learn about how Istio does canary deployments.

Now that other pipeline has kicked off, and if we came back later, we would see that this has changed and the image is different, but that'll probably take too long, so I'm just going to go back to the presentation now.

What did we see? We saw the configuration for the CI/CD system living right alongside the code. I was able to take that same configuration and run it on my cluster. I could use Kubernetes tools to interact with it, which means that somebody who wanted to could build something else on top of this; if they didn't like that CLI – which, to be clear, is an awesome CLI – they could build their own. It was reproducible, and everything was executed in a serverless manner.

What's Next for Tekton?

Last, what's next for Tekton? We are really excited that we just had our first release of Tekton Triggers, and one of the reasons we're excited is that we were able to start using it immediately, so now we are actually dogfooding it and testing Tekton with Tekton, which is pretty cool. Early next year, we are hoping to have our beta release. As mentioned before, we're going to focus a lot on the catalog, so you should be able to go to the Tekton catalog and find a whole bunch of tasks for the common things you want to do in your CI/CD. We're also going to add other features like manual approvals and notifications.

If you like that, please join us. We have a Slack you can join, we have weekly community meetings, a whole bunch of the maintainers are going to be at KubeCon next week – we'll be at the event itself, and there is a CDF Summit happening the day before KubeCon – and you can follow tektoncd on Twitter.

Questions and Answers

Participant 1: I've looked at some of these, like Drone and other ones, and one of the things that we really liked about Jenkins was seeing all my latest test results, test result trends, and build artifacts all in one place. How does that work in Tekton?

Wilson: I think what I showed you would probably not be the optimal user experience. My integration with the GitHub status check was put together extremely fast; there should have been a link there that I could click to go to some log somewhere. I think one of the easy short answers is that there is a dashboard project in Tekton, so you could run this dashboard alongside your Tekton installation, and that should give you visibility into logs and what was actually executed.

The other explanation, though, is that since Tekton is more of a plumbing piece, it would be up to whoever is creating their own CI system to take those logs and do what they want with them. If I was running this on GKE, for example, all the logs automatically go to Stackdriver, so you could use something that integrates with Stackdriver, or you could run something like Fluentd that collects logs. Basically, because it's Kubernetes, you can attach anything that collects logs onto it and then put the logs wherever you want.

We're not opinionated about how you get those logs but you can get the logs and put them somewhere that you want. Then, you can probably look at other systems that are built on top of Tekton that have more out of the box solutions. Jenkins X is an example of a system that's built on Tekton, so they're getting all the logs somewhere.

Participant 2: You mentioned this shifting-left thing, but over the last few years at KubeCon, there has been this theme of testing in production. What's your opinion on that? Do you think they're crazy, or...?

Wilson: I think it depends on how you define testing in production. I think the idea is that you want to find everything you can possibly find before you get to production, because once it's in production, you're going to expose some user to it. Then there's a certain amount that you can't ever get to, and that's where things like monitoring, canary rollouts, and A/B testing come in. The testing can also take all kinds of different forms, because it could be, "I'm testing the requirements because I want to see if the users actually like this thing." There is a certain amount you have to do in production and that's fine, but I do think you want to do as much as you can earlier, if at all possible.

Participant 2: There's another thing I didn't see. Every time I build a CI/CD system, I always make sure there is this "Oh, crap" button, because if something bad gets into production, you can just press one button and go back to how it was before.

Wilson: Like a rollback thing?

Participant 2: Like a rollback. If you deployed through the whole pipeline, you have to go back through the dependencies and put everything back to how it was before the pipeline ran.

Wilson: I would say that Tekton is very focused on the CI side of things and less on the CD. That's another thing that we would hope to tackle but we don't have any clear roadmap there. I think other tools that have more awareness of what is the thing that you deployed and "How do I undo that?" would be better.

Participant 2: There's a tool called GoCD from ThoughtWorks that's very good at tracking this and putting things back.

Participant 3: Another question: you showed a folder with a bunch of configuration, and that's easy for a particular project. What if I have a lot of Git repos? Is there a way I can share those things across all the repos, or is it duplicated code?

Wilson: The question is, if I have a lot of configuration and a lot of Git repos, is there a way to share that configuration across the repos? At the moment, you would have to copy and paste it, which is not fantastic; that's what we're doing inside of Tekton itself. But there's a proposal right now to add a new CRD type, like a catalog, so you could point at a catalog of tasks and have them all imported into your cluster.

Then another idea we've had is, instead of saying the name of a task, you would say the URL where it's located, so every time you tried to run something, it would grab it from that URL. I think the reason we don't have that right away is that we're trying to be very careful about making sure that when you look at the configuration, you can see exactly what ran and that it's reproducible. But you will see something cool around that, actually.

Participant 3: Then also, what about security concerns like developers that insert malicious things into your pipeline?

Wilson: Because it is committed as code there, it would have to actually be committed and reviewed before it would execute.

Participant 3: Also, you open the pull request, which means it kicks off the pipeline. Would it kick off [inaudible 00:39:47] in that pipeline?

Wilson: You're asking, what if some random person came along, put something into your pipeline, and opened a pull request that kicks off the pipeline? The answer that most systems end up with is to have some way of being aware of who opened the pull request and whether they're in your organization. This all comes into the triggering portion, where you would want some decision like, "If the person who opened the pull request is a member of my organization, then kick off the pipeline." If not, you wait for somebody who is in the organization to indicate that it's ok to run the tests, which is what the testing for Kubernetes itself does. There's an ok-to-test command; you add it as a comment on a pull request, and that kicks the whole thing off, so you probably want something like that.
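The gating described above could be sketched with a Tekton Triggers CEL interceptor, roughly like this; the field names follow the v1alpha1 API of the time, and the template, binding, and filter expression are illustrative assumptions rather than a complete solution (a real setup would also verify the commenter's organization membership):

```yaml
apiVersion: triggers.tekton.dev/v1alpha1
kind: EventListener
metadata:
  name: pr-listener
spec:
  triggers:
    - name: ok-to-test
      interceptors:
        # Only fire the trigger when a PR comment starts with /ok-to-test
        - cel:
            filter: "body.comment.body.startsWith('/ok-to-test')"
      bindings:
        - ref: pr-binding   # extracts repo URL, revision, etc. from the event
      template:
        ref: ci-template    # instantiates the PipelineRun
```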

Participant 4: Are there any capabilities related to test impact analysis, so you can figure out that you don't have to rerun certain stages of the build?

Wilson: I would say there's nothing like that right now, but you could build it in if you wanted; you could have a task that does that.

Participant 5: Are you only going to be supporting container-based deployments? What I think of as cloud native is more like Lambda, serverless stuff where you don't even have containers, so you can just say, "Deploy this." Is there any support for that type of deployment? Amazon has Lambda; Google, I think, has Cloud Functions.

Wilson: Cloud Run and Knative, yes.

Participant 5: Yes, everybody has those. There are containers somewhere underneath, but you're not aware of them.

Wilson: I think that a lot of those are about going from some particular source code to just running it?

Participant 5: Yes, but there is the cloud infrastructure that just makes the containers, boots them, and handles everything; as a developer, you never even give them a YAML file.

Wilson: There's nothing like that in Tekton itself. One thing you might find interesting is that Tekton started off as Knative Build, which was part of the Knative project. Knative is an open source serverless platform built on top of Kubernetes, and in order for it to go from source to deployment, it was using this thing called Knative Build, which would take your source code and build it into an image, which would then be what ran. These days, they've deprecated Knative Build in favor of Tekton, which you use to build the image instead. It seems like the unit is still an image, though.

Participant 5: In Amazon you don't even see the image, you have this very, super thin layer that you're deploying.

Wilson: You can still connect to those systems with Tekton; you don't have to be building images, but Tekton itself uses images to run.

Participant 6: [inaudible 00:42:51] how are you building containers in Tekton? Do you have a whole Docker-in-Docker?

Wilson: The question is, how are we building containers inside of Tekton? If you want to use Tekton to build an image, do you have to use Docker-in-Docker, like mounting a Docker socket or something like that? Tekton is not opinionated about that; you can build the image however you want, so you could mount the Docker socket if you wanted. In most of our examples, we use a project called Kaniko, which is one of several projects that lets you build images without having to mount the Docker socket, so it's a bit safer.
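A Kaniko-based build step in a Tekton Task could look roughly like the following sketch; the Task name, params, and workspace layout are illustrative assumptions, though the Kaniko executor image and its flags are the project's documented ones:

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: kaniko-build
spec:
  params:
    - name: IMAGE
      description: Fully qualified name of the image to build and push
  workspaces:
    - name: source   # the cloned repo is expected here
  steps:
    # Kaniko builds the image entirely in userspace, so no Docker
    # socket or privileged daemon is mounted into the step
    - name: build-and-push
      image: gcr.io/kaniko-project/executor:latest
      args:
        - --dockerfile=$(workspaces.source.path)/Dockerfile
        - --context=$(workspaces.source.path)
        - --destination=$(params.IMAGE)
```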




Recorded at:

Jan 28, 2020