Adam Jacob Discusses DevOps, Modelling Infrastructure, and Increasing Collaboration

Oct 02, 2023

Podcast with

Adam Jacob

Daniel Bryant

In this episode, Adam Jacob, CEO and co-founder at System Initiative, sat down with InfoQ podcast co-host Daniel Bryant and discussed the evolution and potential future directions of DevOps and managing infrastructure. Topics covered included the challenges remaining within the DevOps movement, how to model and manage infrastructure, and how to increase collaboration between developers and operators.

Key Takeaways

Over the past 14 years of DevOps and deploying and operating software, the tooling has changed dramatically. However, many of the underlying approaches and processes have fundamentally remained the same.
A core focus of the DevOps movement was improving collaboration and fostering a cooperative culture. Many organisations have embraced these changes, but the tooling has not always supported the level of collaboration required.
A fundamentally different approach to managing infrastructure involves modelling the real world within a series of “digital twins”. A model can be easily visualised to aid understanding and facilitate collaboration. Any changes in either the model or real-world state can be observed and approved or rolled back.
Using digital twins makes the importing of existing infrastructure potentially easier in comparison with existing infrastructure as code (IaC) tooling.
Developers and operators may be sceptical about using models, as this approach has been tried before with code generation tools, model-driven architecture, and business process modelling (BPM). Adam recommends experimenting with open source tooling for your use case.

Welcome to the InfoQ podcast

Daniel Bryant: Hello and welcome to The InfoQ Podcast. My name is Daniel Bryant, and today I had the pleasure of sitting down with Adam Jacob, CEO and Co-Founder of System Initiative. Now, Adam is no stranger to the domains of infrastructure, DevOps, and continuous delivery, having previously co-founded the Chef Infrastructure as Code Project and related company. Recently I've been following his calls on social media for a step change in the way we manage infrastructure and deploy applications. In this podcast, we dive into those concepts such as modeling infrastructure, establishing digital twins between models and real world infrastructure, and what a typical application deployment workflow might look like in this proposed new world.

So welcome to the InfoQ podcast, Adam. Could you introduce yourself to the listeners please?

Adam Jacob: I'm Adam Jacob. I'm the CEO of System Initiative, and in a previous life I was the CTO of a company called Chef that did infrastructure automation. And then before that I was a systems administrator, which is really kind of what I am in my heart.

What has changed over the past 14 years of DevOps? [01:00]

Daniel Bryant: Fantastic. I've known you a lot from the Chef days, and I love your Tweets. I'll definitely put your Twitter handle, X handle in at the end of the podcast because I think you definitely have some pithy things hopefully we're going to dive into today. I was particularly interested in your Second Wave of DevOps blog post. You referenced that classic talk by John Allspaw and Paul Hammond, "10+ Deploys Per Day: Dev and Ops Cooperation at Flickr". Now I remember watching that recording and just being blown away back in the day. What do you think has changed over the past 14 years now, and what are the biggest challenges still remaining?

Adam Jacob: Funnily enough, I think almost nothing has changed. What's changed is the tooling. So we've iterated on each piece of the process. If you go watch that talk from Flickr, they were talking about how they collaborated together to deploy Flickr a lot. They built bespoke deployment solutions, they had a little webpage they went to and they clicked deploy and whatever. But what did they do? Well, they did dark launching and feature flagging, they had automatic deployment from trunk, they did continuous integration, they did continuous delivery, they had integrated monitoring and trending that worked both for the developers and also for the operations people, they shared metrics with both, they had product telemetry built in that they could see both operationally and in engineering, on and on and on. You can go through the list of things that we say people should do, and it's roughly identical to what they did. You've got to squint a little because a lot of technology didn't exist at the time. EC2, I think, for example, was either brand new or a year old or something.

Daniel Bryant: Yes, pretty much emerging, right? Yes, back in 2009.

Adam Jacob: Pretty emerging, right? So forget about Kubernetes or whatever.

Daniel Bryant: Yes, Kube what?

Adam Jacob: But that's kind of to my point, is that when you look at what we've decided to do in the space, we basically took that pattern that they showed us and that we all ... Other people had that pattern too, they weren't the only people who were doing it, but they're the ones who said it. And what we've done is iterated on each individual piece, but we haven't really looked at the shape altogether. We went, "Yes, that's the way. This is the right shape. This is how the system should work. This is the arc of how things ought to be," and then we've improved each piece in its patch. We were like, "Well, at the time they were using Subversion, now we use Git. They were using Ganglia for monitoring." Most people don't even know what Ganglia is.

Daniel Bryant: I used that back in the day, yes.

Adam Jacob: I used it, Ganglia was amazing, and now we use Prometheus and Grafana, on and on and on. And so we've rolled each piece, but we haven't really changed the way that you interact with them or how you think about them as a holistic system. We've really kept that holistic system identical to what they were doing, we've just improved the pieces as we go.

Has DevOps evolved successfully to meet the needs of managing infrastructure and deploying software? [03:38]

Daniel Bryant: Fantastic, interesting. So how has the evolution of DevOps helped or hindered along this path? Again, we've had 14 years of pretty much, if you argue that DevOps was kind of started by that movement, what do you think has evolved from that? Clearly a lot of cultural changes have gone on.

Adam Jacob: Tons of cultural changes, all healthy for the most part, although I think we actually are seeing some backlash now to some of those cultural changes because I think the tooling, for all that we've tried to optimize it, and we have done a really good job of optimizing it, what we've learned is that we've sort of optimized it as far as you can optimize it. If you think about the shape of the system, in 14 years, we've replaced each of those tools at least twice, right? Every single one's been replaced at least twice except for Git, right? But Subversion did get replaced so Git counts as a replacement.

So after replacing each one two, maybe three times, sometimes four, if you look at the results, so what is the experience people are having, particularly in the large enterprise with adopting these DevOps tools? The answers are meh. Better than before undeniably. So if you go back, if you are old enough to have worked in pre-2008, 2009, internet stuff, it's undeniably better now than it was then. So please hear me say, so much better. But at the same time, we're still struggling to deploy more than once a month in a lot of places, right? The failure rate is still incredibly high. It's half. Our ability to collaborate together is actually sort of going backwards.

So you're seeing a lot of folks now in response to that failure to get to the kind of higher velocity, safer working environments and more collaborative environments that we know bring about good outcomes, you're starting to see people back off and be like, "Well, what we should do is go back to a world where we had a platform layer that were operations people and then we should have application people who only interact with those people over prescribed APIs. It's a software contract between them, but we don't talk, we don't collaborate." And what we're saying there is actually that we don't want to collaborate, right? We're saying, "Hey, the collaboration thing didn't work out." And that's because what we got was religion, but the tooling let us down. So we all understood we should be collaborating, but the way that we collaborated, the system that we actually built isn't very collaborative most of the time, right? Usually the only point of collaboration is code review, which, it's in the name that it's a review, it's not collaboration.

Daniel Bryant: Yes, it's not pairing, right? It's reviewing what you've done, right? Yes, good point.

Adam Jacob: Yes, somebody else did something, I review it, I tell you if it's good or bad. And so I think there's a very interesting question about, because our aspirations were so much higher, nobody set out on this journey to be like, "Oh, and my goal is if I deploy once a month with a 50% failure rate, I'm happy." Do you know what I mean? They wanted to deploy every day, they wanted those interactions between operations and developers to be easy and smooth, they wanted that holistic system to be built and designed. It's a rational thing to now look at the outcomes and go, "And it didn't work out for me. I tried it. I believed your dogma, I did what you told me to do and the outcomes leave me wanting."

And so maybe the problem is your dogma, maybe the problem is that you were just wrong about that this was a problem. Maybe it wasn't, maybe it was actually better when operations people were Morlocks and developers were Eloi, and maybe that was a better time. Certainly it might've been if you were Eloi. And that's to me why we need this second wave of DevOps tooling. We need to admit that the problem we have here is actually a systemic problem. The way that we've put the system together, the way we've decided to approach the problem in the large, not in the specific, it's not like Terraform is garbage or Pulumi sucks or Grafana is unacceptable, it's not like that.

What it is saying is that the way we've decided to work and the way we've decided to put the systems together to allow us to do this work, that's what's holding us back. And if we want an order of magnitude better experience, then we need to design a system for that experience and we need to take everything we learned in the last 14 years plus of doing this kind of work and we need to really dig deep to figure out, how do we build something that could possibly even be better than what we do today? And it's a little scary and it's kind of hard and also it's kind of necessary.

And you can see it starting to crop up around the edges. There's companies like System Initiative, I think the guys that make Wing, even Dagger. I think you can sort of think about this as a thing where, I don't know that he would put it this way, but if you think about what he's doing, there's a real subversion of the paradigm of how we think about CI and CD that's hiding inside Dagger. And so does that subversion of that paradigm then lead to new experiences that do actually help us produce an order of magnitude change? And that to me is what that second wave of DevOps is all about and I'm really hoping that that's a thing people want to do and build.

How do you get exec buy-in to make required changes for deploying and operating software successfully? [08:37]

Daniel Bryant: Fantastic. So I'm guessing it's going to take, "an army", for want of a better phrase, to come along with you, right? Because that's one of the things I've found, whenever I've tried to change things as a consultant in an organization, you had to get top-down buy-in as much as bottom-up, and I'm thinking you've got to do that on a global scale, not to sound too grand, right? You've got to bring folks along contributing so they can plug their things in, you've got to get that buy-in too, right? From the top, the exec folks?

Adam Jacob: Yes, 100%. And I think, look, it starts with the recognition that we didn't get where we are alone either. It's not like I started DevOps or whatever, I didn't. But even if I tried to make the claim that I did, it was a massive community of people working together, sometimes together, sometimes separately, sometimes at cross purposes even, to get to a spot where you could build. I think the situation we're in now, it's so early in terms of the recognition that the problem is systemic. I think there are people who do recognize that the problem is a systemic problem and they are starting to think about solutions differently now than we have been in the last 14 or 15 years. I think that's relatively new in terms of what's happening in the movement in general.

But in order for it to work, what we need is more people to recognize that that truth is true. And if we just get more people to say, "Yes, you know what? You're right, it is that the way we interact with the system is wrong, it could be better." And then the question is, on what vector could it be better? So System Initiative is my bet on a path that says it could be better, right? I'm like, I've spent four years doing R&D that leads me to this solution that says, "I think better could live in this idea, in this model."

But I think there are probably other answers. And what we need are more people investing in finding those answers. We just need more people to be like, "Yes, we should change the way it works. We should change that paradigm because we want the outcomes that we always wanted. We weren't wrong to say that we wanted a smooth flow of work from development to production that was safe and collaborative and agile. Of course, we were right that those are the things we wanted and we shouldn't give up on them because we didn't have the foresight 14 years ago to realize that it wasn't quite going to be good enough to just automate what we already did, that actually we needed to change how we worked in a more fundamental way rather than just automating each piece of the puzzle as we worked 14 years ago."

How should we shift our approach to managing DevOps? I’ve seen you talk about "digital twins" and "modelling" infrastructure [10:53]

Daniel Bryant: Yes, that's very nicely put, I can definitely relate to that, as the tooling has changed and how I've glued the tooling together has become a bit different. But now I know you are trying to, as you mentioned, fundamentally shift. For the listeners, could you introduce the paradigms you are thinking of? What are the step changes? What are the big changes with something like System Initiative?

Adam Jacob: So I mean the biggest one is that when you look at what we decided to do to sort of implement DevOps at a technical level, what we did was wrote code to try to automate the processes we already had. And we've just been doing that over and over and over again, which was great, don't get me wrong, I made my whole career doing that and I love doing it, I'm still doing it, right? But inside there is this interesting fundamental supposition that says, for a lot of the work that we do is code actually the correct canonical representation of what we've decided to do? Because code brings with it a bunch of really incredible properties and some downsides.

And one of those downsides is that it's pretty hard to automatically generate code then tweak the code and then allow that to go back and forth between whatever the other representation is, which then causes an interesting problem with visualization, right? If I'm trying to think about how I can show you the information in terms of different jobs or different functions or different things, code requires me to execute the code in order to get there, I have to compile it to something and it has to become some kind of in-resident sort of mechanism.

And so one of the biggest realizations we had as we started trying to think about how we could make things an order of magnitude better was that the number one thing holding us back was actually that the canonical representation of what we wanted wasn't code. And it wasn't because code is bad, it's not bad, obviously it's not, but it makes it so we can't build new ways of collaborating because we can't interact with the data in a way that's different. There's only one way to interact with it, do you know what I mean? You can interact with it as words in an editor. You want to change it, what do you do? You open an editor, change the words in the editor. That's the loop and there's nothing else you can do about that loop.

And so sort of drawing some inspiration from some other domains, we started to think about, well also interestingly enough, a lot of what we need to collaborate on is the intersection between the model of an application and the infrastructure that runs it. And so we started thinking about things like digital twins and some of those ideas and saying, "Well, what if we built a simulator for the system that you wanted and we made the simulator itself programmable?" So instead of saying that you got all the power of code because the canonical representation was code, what if we said the canonical representation was a simulation and what you did was program the simulator to behave the way you wanted it to behave? So if we wanted to see what AWS was like, we could model AWS and then we could use the simulator to tell you in real time, "Does this look smart or not smart?" We could infer configuration, we could do all this stuff because the model is an active thing. And then we could get you back the power of code, of writing code and all that ability to write complex stuff by just saying that what you do is write code to describe the model's behavior instead of write code to describe the thing itself. Does that make sense?

Does something like System Initiative provide a new abstraction to building infrastructure and deploying software? [13:56]

Daniel Bryant: I think it does. So arguably, code is a model in itself, because I always say that you are writing code sort of modeling the business problem, but I guess you're taking it up a level of abstraction?

Adam Jacob: Just a little bit, yes. And saying that by doing that, what it lets us do is now build interesting new ways to interact with the information. So we can build a visualization, for example, that lets you visualize your configuration and then we can use the relationships between those components to then derive configuration. So a good example is let's say you have a service that you've put in a container and you want to expose that service on a port, so it runs a web server, it's port 80, and you're going to deploy that service to Amazon and then you're going to put a fleet of them behind a load balancer. So if you imagine how you write all that in infrastructure as code, maybe you write it all out by hand, but certainly the syntax is different between how Docker expresses a port number and how Amazon's load balancer expresses a port number. And that's different still from how it gets expressed in the egress rule that allows you to set up a security group to actually run the thing.

And so with System Initiative, because what we're doing is programming this underlying model, we can take the one-to-one syntax of Docker. So when you're looking at a Docker component, you're talking the language of Docker, you say, "I want to expose a port, it's 80/tcp." Then you can say that Docker image has a relationship to, say, the configuration of your operating system. So in the case of CoreOS or Fedora, it is a Butane config. And when we connect a Docker image to a Butane configuration, then what you mean is, "I want to run this container on this instance," and it would write you a systemd unit file. And if you expose a port, it would then send the right options to systemd to launch that container with the right port exposed.

And then we could wire that up to an EC2 instance and it would take that data and automatically Base64 encode it and stick it in the user data. And then we could also wire it to the egress rule and to the load balancer so that if we then went into our container and we said, "Hey, our application doesn't run on port 80 anymore, it runs on 88," you could change it in that container and it would cascade across all of that configuration without you having to think about it, because the underlying model of how those relationships work is fundamentally just code. So you're writing transformations that say, "Hey, I know how to take in the shape of a Docker image, I got a little bit of JavaScript that understands how to translate that into a systemd unit file, into an egress rule, into all these other things." And so you get this programmable simulator that knows how to derive all this intelligence for you.
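The cascade Adam describes can be sketched roughly as follows. This is a minimal illustration only, not System Initiative's API: the conversation says its transformations are written in JavaScript, while this sketch uses Python, and every name here (`derive`, `systemd_unit`, the dictionary keys, the `example/web:latest` image) is hypothetical. It also skips the Butane step and encodes the unit text directly. The point it shows is that the port is declared once, in Docker's own `80/tcp` syntax, and every downstream representation is derived from it.

```python
import base64


def systemd_unit(image: str, port: int) -> str:
    """Render a unit file that runs the container with the port published."""
    return (
        "[Unit]\n"
        f"Description={image} container\n"
        "[Service]\n"
        f"ExecStart=/usr/bin/docker run --rm -p {port}:{port} {image}\n"
        "[Install]\n"
        "WantedBy=multi-user.target\n"
    )


def derive(container: dict) -> dict:
    """Derive every downstream configuration from the one container component."""
    port_text, protocol = container["expose"].split("/")  # e.g. "80/tcp"
    port = int(port_text)
    unit = systemd_unit(container["image"], port)
    return {
        "systemd_unit": unit,
        # User data carries the unit text, Base64 encoded.
        "ec2_user_data": base64.b64encode(unit.encode()).decode(),
        # Hypothetical shapes for the security group rule and load balancer.
        "security_group_rule": {"protocol": protocol, "from_port": port, "to_port": port},
        "load_balancer_target": {"protocol": "HTTP", "port": port},
    }


container = {"image": "example/web:latest", "expose": "80/tcp"}
before = derive(container)

# Change the port in one place; everything derived from it follows.
container["expose"] = "88/tcp"
after = derive(container)
print(after["security_group_rule"])
```

Re-running `derive` after the edit updates the unit file, the user data, the security group rule, and the load balancer target together, which is the "change it in that container and it cascades" behaviour from the conversation.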

And the side effect really is that we can now build user experiences that you couldn't build before, right? If you think about how do I get you immediate instantaneous feedback that a tiny declaration in Terraform is correct, you kind of can't. In the end, I've got to compile it, I've got to run it, I've got to see the thing. Here I can just run a JavaScript function on the object and return true or false and be like, "This is good, this is bad." And suddenly what used to take maybe a minute or two minutes or five minutes to tell you if you were right or wrong takes no time at all, right?
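As a rough illustration of that instant feedback, here is a small check that runs directly on a component and returns true or false. It is a sketch under assumptions: the function name `valid_exposed_port` and the `expose` attribute are hypothetical, and it is written in Python rather than the JavaScript Adam mentions.

```python
def valid_exposed_port(component: dict) -> bool:
    """Return True when 'expose' looks like '<port>/<tcp|udp>' with port 1-65535."""
    port, sep, protocol = str(component.get("expose", "")).partition("/")
    return (
        sep == "/"
        and protocol in ("tcp", "udp")
        and port.isascii()
        and port.isdigit()
        and 1 <= int(port) <= 65535
    )


print(valid_exposed_port({"expose": "80/tcp"}))
print(valid_exposed_port({"expose": "99999/tcp"}))
```

Because the check only inspects data in the model, it needs no plan, compile, or apply step before it can answer.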

How do you keep the model and the state of the infrastructure in the real world in sync? [16:58]

Daniel Bryant: Yes, super interesting. The thing that jumped out there is clearly the JavaScript function has got to be in lockstep with the real world, so to speak, the simulation has to be in lockstep. How do you go about doing that? Because I'm guessing Amazon's changing, there's new things popping up all the time.

Adam Jacob: Yes, I mean, one trick is that you separate the two. So when you think about modeling something in the simulation, you're modeling both the theoretical configuration of it, so we call that piece the component, then the other side is a resource, which is the actual information about the real world thing, right? And you let both be true. So in infrastructure as code, you say that the code is the truth, and the job of the code is always to manipulate the outside environment to match. Here we don't say that. Instead, we say you can do whatever you want to the model and then it's a representation of a real world thing that also has its own state that grows and changes. And what you're doing is just deciding to reconcile that state between the two.

And it may be that the reconciliation is to update the model or it might be to update the other side. So if you think about that bidirectionality, one thing you get, you get bidirectionality because the model has been split apart in this way. And because the model is itself data, it's really easy to say, "Hey, somebody went to the AWS console and added a tag to this EC2 instance." Now the model doesn't have that tag in it, so we can just tell you, "Hey, that's not there. Do you want to update the model or do you want to delete the tag?" And that can be a choice that you can decide. Some of that's working in System Initiative today, some of it's not, some of it was actually working in earlier prototypes and we sort of pared it back down and we're sort of building it back up. So what I'm describing is fundamentally true. You may not be able to do that right now in the software, but you have been able to do it in the past.
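The component/resource split and the two directions of reconciliation can be sketched like this. It is a minimal sketch with hypothetical names (`Twin`, `drift`, `reconcile`, the flat `tag:owner` key), not the System Initiative data model, and as Adam notes, not all of this behaviour is available in the product today.

```python
from dataclasses import dataclass, field


@dataclass
class Twin:
    """A modelled component paired with the observed real-world resource."""
    component: dict = field(default_factory=dict)  # what the model says
    resource: dict = field(default_factory=dict)   # what was observed outside


def drift(twin: Twin) -> dict:
    """Map each differing key to its (model value, real-world value) pair."""
    keys = twin.component.keys() | twin.resource.keys()
    return {
        key: (twin.component.get(key), twin.resource.get(key))
        for key in sorted(keys)
        if twin.component.get(key) != twin.resource.get(key)
    }


def reconcile(twin: Twin, key: str, accept: str) -> dict:
    """Resolve one drifted key.

    accept="resource": the real world wins, so the model is updated in place.
    accept="model": the model wins, so return the change to apply outside
    (a value of None means "remove it from the real resource").
    """
    if accept == "resource":
        if key in twin.resource:
            twin.component[key] = twin.resource[key]
        else:
            twin.component.pop(key, None)
        return {}
    if accept == "model":
        return {key: twin.component.get(key)}
    raise ValueError("accept must be 'model' or 'resource'")


# Someone added a tag in the console; the model does not have it yet.
twin = Twin(
    component={"instance_type": "t3.micro"},
    resource={"instance_type": "t3.micro", "tag:owner": "ops"},
)
print(drift(twin))

# Option 1: keep the model as the truth, which means deleting the tag outside.
pending_change = reconcile(twin, "tag:owner", accept="model")

# Option 2: accept reality and pull the tag into the model instead.
reconcile(twin, "tag:owner", accept="resource")
print(drift(twin))
```

The infrastructure as code approach corresponds to always choosing `accept="model"`; the bidirectional approach leaves that choice to a person for each change.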

The second part of your question though, I think, is more about how do you cover off on the spread of all the stuff that happens in the environment? And it's a question you get a lot. But the answer is actually, it's easier than you think it is. So there's a core set of services that everybody uses. Once you cover off on those core services, the long tail is long, but the core gets you quite a bit kind of quickly. So how much coverage do you really need in order to be pretty effective for most people's infrastructure? It's a fraction of what is actually on offer. And then if you make the authoring experience easy, so for System Initiative we've integrated the authoring experience into the product so you don't actually pop out of the product to change it, you can just literally click a tab and it'll flip up an editor and then you can start editing System Initiative.

Daniel Bryant: Oh, you write code, like JavaScript code to transform it?

Adam Jacob: In the product.

Daniel Bryant: Interesting.

Adam Jacob: Yes, but also to extend it. You can model new things in it, change its behavior, add new actions. You can do whatever you want. And the side effect of being able to do that is that it's really easy for people to then extend the model, hopefully it'll be easy for people to extend the model on their own. And so it's really a question of, how do you build a community that can serve their own needs and then share that with each other in a way that makes the whole greater than any individual person or company could do? And so that's why it's open source, it's why that ability to contribute directly is going to be built into the product. You won't have to leave the product to contribute your work, there's literally a button that says contribute.

Daniel Bryant: Context switching minimized?

Adam Jacob: And you can click it and be like, "Yep, these are the models I want to share with other people."

How does GitOps relate to this new way of managing infrastructure? [20:09]

Daniel Bryant: Absolutely fascinating and very different to think about it, but a couple of things that jumped out to me because I've been in the Kubernetes world for the last four or five, six years, and GitOps is a thing there, whether you believe it or not, and GitOps pretty much has one-way reconciliation. The code is the truth, and if I change that tag or someone logs in and does something, we get rid of that straight away.

Adam Jacob: We revert it.

Daniel Bryant: 100%, right? But you are saying, and it can be annoying sometimes, I'll be honest, but it's the dogma arguably, the GitOps way of doing things. So how does GitOps sit with your way of thinking?

Adam Jacob: It doesn't.

Daniel Bryant: Interesting, okay, yes.

Adam Jacob: Look, it's a perfectly viable way. If you wanted to work that way in System Initiative, you totally could, your answer would be just always trust the model. But if you think about it, it doesn't actually make any sense that that's how we do it. If you are in the middle of an outage and you know what to do and you could do it, why wouldn't you? And the answer is, "Well, because of my dogma. My GitOps told me that I had to do this other way."

And so let's say it takes 10 minutes for your changes to flow through your GitOps pipeline, which would be I think a relatively reasonable amount of time, 10 minutes, probably real, maybe less, less would be pretty rare on a big infrastructure, my guess would be, five minutes, whatever, let's give it five minutes even. We'll give credit. If it takes five minutes for that change to happen, the only reason we're doing it through that mechanism is that that's the only way you have to understand what's changed, who changed it, what the controls are. There's a bunch of properties we're getting out of making that choice, but it's not that that's the only way to get those properties.

So rather what would be better would be, look, make the change however you want to make it. If that's logging directly into the router, log into the router and make the change. What you need is a system that knows that that change happened, and then it needs to tell you, "Hey, this happened. Is that now the correct configuration of this thing or is it the wrong configuration of this thing?" And if you could do that, then you could actually collaborate again, because people who had expertise, they don't need to become experts in the GitOps version of what they're doing, they can just be an expert in the thing they're an expert in, and they can go do it in whatever the thing is that works for their expertise.

And then the system can support you by saying, "Hey, this resource changed. It's not the way you thought it was and now you should track it," at which point, we know who changed it, we know we can track what changed, we can track when it happened, we can track that that's now the correct current state of this thing over time. All that stuff becomes really possible again, and we didn't force you into a workflow that didn't make sense for you, which for the most part then winds up forcing you down a path of saying there's very few people who can actually collaborate on that thing because you've got to understand all the things that go into the GitOps part of the flow.

Daniel Bryant: The magic incantation?

Adam Jacob: Yes, kind of the whole magic loop. And so when you think about GitOps in the context of Kubernetes, it makes a little more sense in that what you're doing is writing a declaration that you feed to a database.

Daniel Bryant: That is the Kubernetes way, yes.

Adam Jacob: And then this database has another thing that listens to it and the Kubernetes thing cranks around and runs the control loops and blah, blah, blah. Because it's declarative and convergent in that way, it sort of makes sense that that's the way you flow through it. Even in Kubernetes, I'm not sure that that's actually the most efficient way to think about it, but definitely it's not the most efficient way to think about it when you get outside of a system that works like Kubernetes.
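The "declarative and convergent" pattern mentioned here can be illustrated with a toy control loop. This is not Kubernetes code and uses no Kubernetes API; `reconcile_once`, `converge`, and the `replicas` field are hypothetical names chosen for the sketch. The declaration is stored as data, and a controller repeatedly nudges the observed state toward it.

```python
def reconcile_once(desired: dict, actual: dict) -> dict:
    """One controller pass: move the observed state one step toward the declaration."""
    current = actual.get("replicas", 0)
    target = desired["replicas"]
    if current < target:
        current += 1
    elif current > target:
        current -= 1
    return {**actual, "replicas": current}


def converge(desired: dict, actual: dict, max_passes: int = 100) -> list:
    """Run controller passes until the observed state matches the declaration."""
    history = [actual]
    for _ in range(max_passes):
        if history[-1].get("replicas", 0) == desired["replicas"]:
            break
        history.append(reconcile_once(desired, history[-1]))
    return history


declaration = {"replicas": 3}
print([state["replicas"] for state in converge(declaration, {"replicas": 0})])

# A manual change outside the declaration is walked back by the same loop.
print(converge(declaration, {"replicas": 5})[-1])
```

In this style the declaration always wins, which is the one-way reconciliation discussed earlier in the conversation.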

So yes, I think, it's not that GitOps is garbage or whatever, it's not, it's manifestly better than not doing GitOps or having no mechanism at all. But if you think about the user experience of GitOps, it's actually not that great. It's better than nothing, but you're like, "Okay, here's my code review. How long does it take to know if I got it right or wrong? How do I react if I got it wrong? How do we loop around it?" It's a lot. And that's actually the thing that I think we need to figure out how to fix.

How do you import existing infrastructure into the model? [23:53]

Daniel Bryant: One thing that caught my attention there as well, because we're not relying on that sort of one way flow of truth, I'm guessing onboarding, importing your stuff could be easier? Because I always struggle with that with both Terraform and Pulumi, it was lovely, great, but I've got a massive estate, how do I get that into this model of code, right?

Adam Jacob: Yes. And so the dream here is, you put your finger right on it, so again, in earlier versions of System Initiative, this worked and we've been tweaking the model to make it more powerful. And so as we've made it more powerful, we've actually sort of pulled back on some of those features that we had working earlier when the model was a little less powerful and sort of adding them back in as we go.

But yes, discovery is pretty brilliant. It's pretty great to be able to give me some credentials and then we will run a function that knows how to look at the data about whatever the kind of thing is you've modeled, and then it writes out the model and then walks the relationships that it can understand and says, "Hey, is any of this data here? And if so, build me another model and relate it to me this way." And just sort of this cascading flow of models that then do that discovery for you. So we're a ways from this, but our hope is that the user experience actually is sign up for System Initiative, add the credentials, run a discovery process. It finds the resources that already exist, builds the model for you backwards.

It can't build a model that tells you semantically what they mean, right? It can't infer, "These resources belong to this application," you're going to have to do that yourself, we'll have to provide some mechanism, especially the semantic ones at the higher... The low level ones we can infer, but the high level ones of, "This is an application," or, "This is my production environment," somebody has to tell you that stuff because they only exist in our heads. But yes, and one of the things we think is cool about that is that if you were doing GitOps already and you had these systems in place, you don't have to stop doing them to get value out of something like System Initiative, right? Basically if you wanted to keep doing them that way forever, you could, you would just turn the reconciliation on one way. You'd just be like, "Just always reconcile the model to the reality."

Daniel Bryant: And still have the benefits of all the other things that come along with what you do?

Adam Jacob: We could still give you tons of visualization, you could still write policy checks that tell you whether or not a model or a resource is in the shape it needs to be in or works the way you expect it to work or matches your security or compliance policy. There's a ton of things you can do with that information.
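
Editor's sketch: one way to picture "always reconcile the model to the reality" together with policy checks over the resulting model. This is a minimal Python illustration under stated assumptions. The function names, the attribute names, and the two policies are all hypothetical and are not taken from System Initiative.

```python
def reconcile_model_to_reality(model, reality):
    """One-way reconciliation: reality always wins and overwrites the model.
    Returns the attributes that differed as {name: (old, new)}."""
    changes = {}
    for key in set(model) | set(reality):
        old, new = model.get(key), reality.get(key)
        if old != new:
            changes[key] = (old, new)
    model.clear()
    model.update(reality)
    return changes


def check_policies(model, policies):
    """Run each named check against the model; return the names that fail."""
    return [name for name, check in policies.items() if not check(model)]


# Hypothetical policies expressed as plain predicates over the model.
POLICIES = {
    "encryption-enabled": lambda m: m.get("encrypted") is True,
    "no-public-access": lambda m: m.get("public") is not True,
}

model = {"encrypted": True, "public": False}
reality = {"encrypted": True, "public": True, "size": 20}

changes = reconcile_model_to_reality(model, reality)
print(changes)
print(check_policies(model, POLICIES))
```

Here something outside the tool (a GitOps pipeline, say) made the resource public. The model picks that up on the next reconcile, and the policy check flags it without the tool ever having written to the real infrastructure.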

How would a tool like System Initiative help with Dev and Ops collaboration? [26:09]

Daniel Bryant: Fantastic. If we wind it back to the beginning of our conversation now, Adam, how would a tool like System Initiative help with that collaboration? Perhaps the question I'm asking is, what would a typical workflow be for the stereotypical DevOps day?

Adam Jacob: So the first thing is that the primary way of interacting is that it's an application that's running in the browser. So it has more in common with Figma or Google Docs or Notion than it does Terraform in your editor. And then that application is fully multiplayer. So if you and I are in the same workspace and we're working together in the same change set, actually you can put something on the diagram or change an attribute and that attribute will update automatically in my browser like it happens in real time, right? So that's kind of step one.

So from those properties, then you can start to rethink things like review. So if you think about how code review works, right now there's no easy way to tell what the blast radius is of a change. So here there is, because what we have is a model of all the things. And so if you change one property in one place, we can show you what all the effects were of the change of that model. So let's say you change something that then impacts the shape of a cluster that's being configured to run Kubernetes, and we decide that there's a corporate policy that says, "My Kubernetes team, my platform team needs to approve any changes that impact the cluster configuration." So now that change set before you could apply that change would require the approval of somebody else to come and join you and be like, "Hey, what's that change you're making to the Kubernetes cluster again?" And you're like, "Oh, here's what it is. Here was the blast radius of what it was," and you can do that interactively. So rather than doing it in review, what they're coming into is the literal change.

Daniel Bryant: Because you have to build that mental model often in the pull request, right? So there's a change in Terraform, you're building a mental model, which is freaking hard at scale, right?

Adam Jacob: Yes, which is freaking hard at scale. But if we have all that data and we can visualize that information for you, what we can change is the experience of how you review it. We can be like, "Hey, show me what you changed, show me what the impact was on the cluster, show me why you changed it." And you could trace that back through the relationships and be like, "Oh, it's because you changed," whatever, "The way the routing worked, which then reconfigured the service mesh, which then causes it to redeploy this thing, which then causes this to happen and that's why you need my approval."
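
Editor's sketch: if the relationships between components are held as data, the blast radius of a change and the approvals it needs can be computed by walking the graph. The Python below is an illustration only. `DEPENDENTS`, `APPROVERS`, the component names, and the team name are hypothetical, and this is not how System Initiative is implemented.

```python
from collections import deque

# Hypothetical edges: component -> components that consume its output.
DEPENDENTS = {
    "routing": ["service-mesh"],
    "service-mesh": ["deployment"],
    "deployment": ["k8s-cluster"],
    "k8s-cluster": [],
    "dns": [],
}

# Hypothetical policy: which team must approve changes touching a component.
APPROVERS = {"k8s-cluster": "platform-team"}


def blast_radius(changed, dependents):
    """Map every component downstream of `changed` to the chain of
    relationships that leads to it, so the impact can be explained."""
    paths = {changed: [changed]}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, []):
            if nxt not in paths:
                paths[nxt] = paths[current] + [nxt]
                queue.append(nxt)
    del paths[changed]  # the changed component is not its own blast radius
    return paths


def required_approvals(changed, dependents, approvers):
    """Teams whose approval is needed for a change to `changed`."""
    affected = {changed, *blast_radius(changed, dependents)}
    return sorted({approvers[c] for c in affected if c in approvers})


impact = blast_radius("routing", DEPENDENTS)
print(" -> ".join(impact["k8s-cluster"]))
print(required_approvals("routing", DEPENDENTS, APPROVERS))
```

The recorded path is what lets a reviewer answer "why do you need my approval?" by tracing from the original change through each relationship to the cluster.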

And you can imagine lots of interesting new designs for, okay, when you apply a change set in System Initiative, you're applying the configuration, but you're also applying the activities that'll be required, the actions you need to take to make it happen in the real world, or vice versa. So maybe that works like a submarine. If you watch The Hunt for Red October or whatever and they want to launch nuclear missiles, you need the two guys with keys and they got to come turn them on or whatever. You could imagine all sorts of interesting interfaces you could build because now it's data, it's all active information and we can use that information to build all kinds of experiences that we couldn't build before simply because the information was locked up in a representation that you couldn't understand.

Will developers be reluctant to make the shift to this way of thinking and operating? [29:08]

Daniel Bryant: The attraction of sort of not having to build a mental model is very powerful. I imagine people are going to be a bit reluctant to jump on board, or not reluctant maybe, but it's a big jump, folks, isn't it?

Adam Jacob: Yes, I mean it is. Look, skepticism is warranted, right? You should be skeptical, just so we're clear. The number of things that look like this that haven't worked out is pretty high.

Daniel Bryant: I was thinking model-driven architecture back in my Java days in the 2000s, then there was a bunch of other things that were...

Adam Jacob: Yes, we had CASE tools.

Daniel Bryant: That's it, yes. BPM tools as well, auto BPM.

Adam Jacob: Yes, we were going to generate the code from the UML models and nobody ever had to write software ever again because we were just going to build-

Daniel Bryant: That's right. I remember that back in the 2000s.

Adam Jacob: Yes, we're going to build the right set of UML models. And look, I get it, I do. I really do get it. And what I think we've learned is that your knowledge of the domain really matters here. And I love this domain, we love this domain, this is where we've built our careers, and you have to want it to be a power tool for power users. In a lot of these cases, what happens is people run to simplicity. They're like, "Well, the goal here is to make it simpler to use the system or for people who aren't experts to be able to understand it." That's not our goal. Our goal is to make it so experts can express their expertise in as much complexity as they could possibly ever need to do it in a way that's better, that's faster, that's safer, that works better for them, that allows them to collaborate with other experts.

But it is a power tool for power users by design, that's why it took us so long to build, right? If you're trying to just build a toy that's like, "Hey, I want a little visual diagram that allows me to string together some configuration," or whatever, a little node flow kind of thing, it's pretty easy to build a demo of that and be like, "That's cool, but it's really just a drawing effectively." But when you think about, "Okay, but how do I make sure that what that's doing has all the power and all the expressive potential of doing it in code?" And that's the challenge for System Initiative. And it's why it's taken us so long to build, and it's why we've open sourced it when we have because all of the fundamentals of that shape are there.

So there's a lot that still needs to be built about what that experience is and how it grows over time. But that core experience of saying, "Okay, what if it was a huge reactive hypergraph where at each node on the graph is a function that can execute, it takes arbitrary inputs and outputs and they're reactive to their environment in the same way that React or Vue are reactive in the browser. And that paradigm then stretched across all of your infrastructure, and then we could start to build semantic abstractions on top of them." Those are the technical underpinnings of what this thing is. So it's not like it's a little diagram tool that does a funny party trick, it's a hard technology problem under the hood, the output of which we can then build incredible visualizations on top of, which is super cool.
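
Editor's sketch: a toy version of the idea that each node is a function that re-runs when its inputs change. It is a deliberately simplified Python illustration: an ordinary acyclic graph rather than a hypergraph, with naive propagation that may recompute a node more than once. `ReactiveGraph` and the node names are hypothetical and do not reflect System Initiative's actual engine.

```python
class ReactiveGraph:
    """Toy reactive graph: each computed node is a function of other nodes,
    and setting an input re-runs every node downstream of it.
    Assumes the graph has no cycles."""

    def __init__(self):
        self.values = {}     # node name -> current value
        self.functions = {}  # computed node name -> (function, input names)

    def add_input(self, name, value):
        self.values[name] = value

    def add_node(self, name, func, inputs):
        self.functions[name] = (func, list(inputs))
        self._recompute(name)

    def set(self, name, value):
        self.values[name] = value
        self._propagate(name)

    def _recompute(self, name):
        func, inputs = self.functions[name]
        self.values[name] = func(*(self.values[i] for i in inputs))

    def _propagate(self, changed):
        # Re-run each node that reads `changed`, then continue downstream.
        for name, (_, inputs) in self.functions.items():
            if changed in inputs:
                self._recompute(name)
                self._propagate(name)


graph = ReactiveGraph()
graph.add_input("replicas", 2)
graph.add_input("cpu_per_replica", 0.5)
graph.add_node("total_cpu", lambda r, c: r * c, ["replicas", "cpu_per_replica"])
graph.add_node("needs_bigger_cluster", lambda t: t > 2, ["total_cpu"])

graph.set("replicas", 6)  # downstream nodes update without being asked
print(graph.values["total_cpu"], graph.values["needs_bigger_cluster"])
```

Changing one attribute updates everything derived from it, which is the same property that makes a UI framework feel live in the browser. A production engine would also need ordering, batching, and cycle handling, all of which this sketch leaves out.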

But all of that is to say it has to work and it has to actually be able to solve those problems, and today it's on the journey of being able to do that. If you're a person who hears this podcast and the things we're saying, you're like, "That's super interesting, I want to know how that works and play with it and maybe tweak it or change how it works in the future or collaborate," it's ready for you now. If what you want to do is replace Terraform with it, it's not, you know what I mean? Because it's just there's more work to do to bring those fundamentals up, there's performance optimization to be done, there's lots more models to be written. There's a ton of work to do, but the foundational elements that make this work are all there.

How are you handling the (OSS) licensing of System Initiative? [32:31]

Daniel Bryant: Fantastic. I feel like I've got to ask, and I've been following your Tweets on this conversation, with this sort of move away from some of the OSS models, I don't want to dive into too much of it because I know it is a complicated situation, but I am thinking if folks are listening to this podcast going, "Yes, I'd love to contribute, but hey, what's going to happen in the future? Everyone's got to make money, right?"

Adam Jacob: Yes, am I going to get rug pulled?

Daniel Bryant: Yes, pretty much.

Adam Jacob: Look, I'm a pretty open capitalist about it. I build venture-backed startups essentially for a living and like it and I like the whole thing of it, right? So I've got plenty of capitalism inside of me. But I think the best way to build something like System Initiative is to do it in the open and to allow people to thrive in whatever way they're going to thrive from using the software. The question of is it better to close up or how do you make money?

The best way to make money selling something is to say, "If you want it from me, you pay me money. If I have coffee and you want coffee and the choice is free coffee or pay me five bucks for coffee, you pick free coffee, it's not rocket science." And so what you have to do is sell a product for money. You have to say, "Nope, if you want coffee, I have coffee, you don't have coffee, if you would like this delicious coffee, $3." And you can not like the price, at which point you don't have coffee or you find coffee another way, you go home and brew some coffee, you don't have to get it from Starbucks or whatever, you can just make coffee yourself.

So our business model is really straightforward. We build a product that we sell for money, and it's called System Initiative. That product is built on open source software. Every line of it is open source software. It's all Apache licensed, always will be. We're not going to build proprietary software. What we will not do is build System Initiative for free. So if you want System Initiative from me, you pay me for it because there's value in the product we produce, there's value in the engineers, there's value in the supply chain that we put together. There's value in the services we run, there's value in the community.

And then our job is to steward the growth of that software and that community for whoever it is that's going to get value out of it. Some number of those people are going to get value out of it through consuming System Initiative, some people are going to get value out of it because they want to build a business on top of it. Maybe you want to use System Initiative as a console layer for your own cloud provider, awesome, we want you to do that. Maybe you do that by having a BD deal with us that allows you to resell what we already do.

But maybe you don't want to do that. Maybe you're like, "I hate your terms. Those aren't good for me." Great, then what you do is take that open source software and package it up and build a distribution of it. What you can't do is call it System Initiative, right? I'm wearing an Allman Brothers shirt, so you could call it Allman Brothers, which you probably shouldn't do because they'd probably sue you, but you've got to call it whatever. And you have to build your own brand and your own supply chain and your own product to serve your customers in the way that you want to serve them, which I want you to do because what I want is this transformative technology to go out in the world and transform the world. And if I want it to do that, I can't limit the upside. I can't say that the only upside is allowed to flow to me. And we fundamentally believe that if we do that, the pie will get bigger and that pie will get bigger at a rate that's faster than my ability to get 100% of a smaller pie, right?

And it just means that it's okay that other people are going to make money with it, and it's okay that other people are going to do what they need to do with it. I want them to do it because I want to see that change in the world and because it's good for me, it's good for my business, it's good for everybody. And that alignment requires us to be good stewards. For that strategy to work, we have to actually take care of those people, we have to actually accept those pull requests, we have to actually work with that community, we have to actually help those people work together because we want to continue to be the place where that goodness originates, and the only way to do that is to be expansive in what you want people to do. So that's how we do it.

How can interested listeners get in contact with you? [36:29]

Daniel Bryant: I love it. I think it's a great point to wrap up, Adam, if folks are liking what they're hearing, what's the best way to reach out to you to get involved? How can folks help?

Adam Jacob: The best way to do it is to come to the website. You can sign up for an account and kick the tires. There's a pretty thriving Discord community so that's the other thing to do is come and hang out with us in Discord. You can find me on Twitter or X or whatever we have to call it now, I'm AdamHJK. You can also find me on Mastodon and Bluesky and all the other things. Yes, that's the right thing to do.

Daniel Bryant: Fantastic, I appreciate your time today, Adam. Thank you very much.

Adam Jacob: Thank you.

About the Author

Adam Jacob

More about our podcasts

You can keep up-to-date with the podcasts via our RSS Feed, and they are available via SoundCloud, Apple Podcasts, Spotify, Overcast and YouTube. From this page you also have access to our recorded show notes. They all have clickable links that will take you directly to that part of the audio.
