Transcript
Kim: My name is Gene Kim. I've been studying high-performing technology organizations since 1999. That was a journey that started back when I was the CTO and founder of a company called Tripwire, in the information security space. Our goal was to study these high-performing organizations that simultaneously had the best project due date performance in development, the best operational reliability and stability, as well as the best posture in security and compliance. We wanted to understand how they made their good-to-great transformation, so that other organizations could replicate those amazing outcomes. In a 21-year journey, there were many surprises. By far the biggest surprise was how it took me into the middle of the DevOps movement, which I think is urgent and important. The last time that any industry was disrupted to the extent that our industry is being disrupted today was likely in manufacturing, when it was revolutionized through the application of lean principles. I think that's exactly what DevOps is. You take those same lean principles, apply them to the technology value stream that we work in every day, and you end up with these emergent patterns that allow organizations to do tens, hundreds, or even hundreds of thousands of deployments per day, while preserving world-class reliability, security, and stability. That is something I didn't even think possible 15 years ago.
In 2013, I co-wrote a book called, "The Phoenix Project," and I can't overstate just how much I've learned since then. What I'm presenting here is not so much DevOps, but on something I wrote last year called, my love letter to Clojure. Randy Shoup, who's on the program committee, reached out and asked if I would present on why I wrote this letter and what I learned. Of course, I said, yes. The intended audience for this talk is anyone who has even the remotest interest in functional programming, as well as anyone who used to love programming, and yet over the years, maybe the joy associated with programming has faded. Because that was absolutely my case as well.
What Is DevOps?
Over the years, I co-authored a bunch of books, including "The Phoenix Project," "The DevOps Handbook," the "Accelerate" book with Dr. Nicole Forsgren and Jez Humble, and most recently, "The Unicorn Project." I'll just put out there a definition that we put into "The DevOps Handbook": DevOps really is the architectural practices, technical practices, and cultural norms that allow us to increase our ability to deliver application services quickly and safely. This allows for rapid experimentation and innovation, as well as the fastest possible delivery of value to our customers, while preserving world-class security, reliability, and stability. Why do we care about that? So that we can win in the marketplace. Yet, as much as I love that definition, there's a definition I love even more that comes from Jon Smart. His definition is simply this: it is better value, sooner, safer, and happier. Let's really zoom in on one word, which is happier. I think no one can argue that anyone actually wants worse value, later, with more danger and more misery.
Clojure Brought Joy of Coding Back
Clojure really introduced the joy of programming back into my life. To put this into perspective, for decades, I self-identified as an ops person. This is despite getting my graduate degree in compiler design and high speed networking. It was always my observation that it was ops where the saves were made. It was ops who saved our customers from terrible developers who didn't care about quality. It was ops who protected our applications and data from bad actors, because it certainly wasn't ineffective security people. Yet, four years ago, when I learned Clojure, I changed my mind. I now self-identify decisively as a developer. I think it's because development is so fun these days. You can do so many miraculous things with so little effort. At the age of 49, I can finally build things that don't collapse in on themselves like a house of cards, which is something I've suffered from for nearly 30 years. Many of these aha moments really came from learning Clojure.
Why Functional Programming?
The famous French philosopher Claude Lévi-Strauss would ask, of certain tools, is it a good tool to think with? There are things in functional programming and in Clojure that I think are astonishingly good tools to think with. I'm going to make this claim that 90% of the errors I made for 30 years have almost vanished. I'm going to describe why. I remember in 2016 or 2017 seeing this amazing graphic on Twitter that described the difference between passing variables by value versus passing variables by reference. Back when I was studying programming languages in 1993, most mainstream languages supported only passing variables by value, which meant if you passed a variable to a function, and you changed it, it only changed the local copy inside the function. That's the way most programming languages did things. It's not bad, but I certainly viewed it as very inconvenient, because if you wanted to modify a variable, you had to return it. This meant that if you had big structures or classes with tons of fields, you had to do a lot of typing, which is tedious, error-prone, and very time consuming.
I often found myself complaining about this, wishing there were a better way. The way we often addressed it was using memory pointers, which introduced a whole different set of problems, because it was so easy to change something that you shouldn't, potentially crashing the program or introducing grievous vulnerabilities into the program. This is now considered so dangerous that few languages besides assembler, C, and C++ support pointers. In fact, one of Rust's main selling points is that it's safer than these other languages. In 1995, I got introduced to what I thought was a huge innovation in programming languages, that you could pass variables by reference. It showed up in C++ and Java, and it allows you to pass a variable into a function and actually modify it. I loved it, because it was such a time saver, because it let you write less code. That's why I thought it was so great.
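As a rough illustration of that trade-off, here is a minimal Python sketch (Python is used only for illustration; the talk itself shows no code). Python hands a function a reference to the caller's object, so a mutable list behaves much like the by-reference case, while an immutable tuple forces the return-a-new-value style. The names `fill_in_place` and `fill_copy` are hypothetical.

```python
def fill_in_place(cup: list, drink: str) -> None:
    """Hypothetical helper: mutates the caller's list (the by-reference convenience)."""
    cup.append(drink)


def fill_copy(cup: tuple, drink: str) -> tuple:
    """Hypothetical helper: leaves the input alone and returns a new value."""
    return cup + (drink,)


my_cup = ["espresso"]
fill_in_place(my_cup, "milk")            # the caller's data changed with no assignment

frozen_cup = ("espresso",)
new_cup = fill_copy(frozen_cup, "milk")  # the caller must capture the returned value
```

The first call saves typing, which is the appeal described above, but nothing at the call site signals that `my_cup` was changed. The second style costs an assignment and leaves `frozen_cup` untouched.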
Four years ago, learning Clojure, I changed my mind. I think one of the hallmarks of functional programming languages, whether it's Clojure, Haskell, or F#, is the notion of immutability and pure functions. They typically don't let you change variables. The functions need to be pure, meaning the functions always return the same output, given the same inputs. There are no side effects allowed. You're not allowed to change the world around you, or certainly change the world outside of your function. Even writing to disk is not allowed. Even reading from disk is not allowed. It wouldn't be called pure, because the result is not always the same.
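To make the distinction concrete, here is a small Python sketch (illustrative only; the function names and the `total_orders` variable are hypothetical). The impure version reads and writes state outside itself, so the same argument yields different results on each call. The pure version depends only on its inputs.

```python
total_orders = 0  # hypothetical module-level state that the impure function touches


def impure_add_order(count: int) -> int:
    """Impure: changes state outside the function, so the result depends on history."""
    global total_orders
    total_orders += count
    return total_orders


def pure_add_order(total: int, count: int) -> int:
    """Pure: same inputs always give the same output, and nothing else changes."""
    return total + count


first = impure_add_order(2)    # depends on how many times it was called before
second = impure_add_order(2)   # same argument, different result
always = pure_add_order(0, 2)  # can be called any number of times with the same answer
```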
Who's Messing With My Coffee Cup?
This really led to one of my biggest aha moments, just seeing how terrifying passing variables by reference should be to everyone. Because when you see this coffee cup filling up, what you should really be thinking is something like this: who's messing with my coffee cup? How do I make them stop? The point here is that it's very difficult to understand your code and to reason about it when anyone else can change your internal state. If you've heard of heisenbugs, this is the phenomenon where even the act of observation changes the results. This is a classic hallmark of multi-threading errors, which is one of the most difficult problems in distributed systems. I'm trying to troubleshoot my coffee cup, and I can't get it to fill up again. In other words, I can't replicate the failure cases, which are failing in production, sometimes spectacularly.
Uncontrolled Mutation
In the real world, uncontrolled mutation makes things extraordinarily difficult to reason about and predict what will happen. John Carmack influenced so many of us. He was one of the founders of id Software. He helped create Doom and Quake. He was formerly the CTO of Oculus VR. He wrote an amazing article in 2013 in Gamasutra about the power of using functional programming concepts in C++, and he talked about it in his QuakeCon keynote. Here's his pragmatic summary. He says a large fraction of the flaws in software development are due to programmers not fully understanding all the possible states that their code may execute in. In a multi-threaded environment, the lack of understanding and the resultant problems are greatly amplified, almost to the point of panic if you're paying attention. For those of you who've read Brian Goetz's amazing book, "Java Concurrency in Practice," you probably shared my feeling of utter horror, realizing just how dangerous concurrency is, even when you think you know what you're doing, like I did. Personally speaking, when I read that book, I was horrified. I was left wondering how many hidden bombs I had put into production that could go off at any moment.
What we often find as software engineers is that state management is something that we have extraordinary difficulty with, especially if other people are allowed to change our internal state, maybe without even telling us, because in the real world, the universe that our program operates in, or the universe that your components operate in, is far vaster than just one coffee mug. In fact, if you zoom out, there are many coffee cups beside you, and any one of their owners can be changing your internal state, just because they have a reference to your object or your variable. Under these conditions, it becomes wildly difficult to know what the side effects of your operations are or what the side effects of someone else's operations are. One of the beliefs in the functional programming community is that uncontrolled mutation is at the very limits of what humans can reason about, understand, and test.
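One way to picture the alternative is an immutable coffee cup. The following Python sketch is only an illustration under assumptions: the `CoffeeCup` class, its fields, and `refill` are hypothetical names, and a frozen dataclass stands in for the immutable values that languages like Clojure provide by default. Anyone holding a reference can read the cup, but nobody can change it; a "change" produces a new value instead.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class CoffeeCup:
    """Hypothetical immutable value: fields cannot be reassigned after creation."""
    owner: str
    ounces: int


def refill(cup: CoffeeCup, ounces: int) -> CoffeeCup:
    """Returns a new cup; the cup passed in is left exactly as it was."""
    return replace(cup, ounces=cup.ounces + ounces)


mine = CoffeeCup(owner="me", ounces=2)
someone_elses_reference = mine                # another holder of the same object
topped_up = refill(someone_elses_reference, 6)

try:
    mine.ounces = 0                           # an attempt at uncontrolled mutation
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True
```

Here `mine` still holds 2 ounces after the refill, and the direct assignment is rejected, so the number of other reference holders no longer affects what the owner can assume about the cup.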
There is a well-known duality of code and data. In other words, all functions can be replaced by data, and all data can be replaced by code, given enough space and time, of course. Everyone now knows that GOTOs are considered harmful, as stated by Dr. Dijkstra in 1968. The duality of code and data also seems to suggest that uncontrolled state mutation should also be considered harmful.
The Problems That Still Remain
When I was writing "The Unicorn Project," there were certain things that I wanted to explore deeper. One was the absence of understanding of all the invisible structures needed to truly enable developer productivity. Another was the orthogonal problem of data: how do we get it to where it needs to go, which is in the hands of anyone who makes decisions? Others were the strong opposition to supporting these newer ways of working, and the ambiguity about what behaviors we need from leaders to support such a transformation. In "The Phoenix Project," we had the three ways and the four types of work. "The Unicorn Project" had the five ideals. The goal was to really try to elevate concepts that I think are really important to help us get from here to there. The first is locality and simplicity. The second is focus, flow, and joy. The third is improvement of daily work. The fourth is psychological safety. The fifth is customer focus.
Cumulative LoC Written: Three Decades
The first two ideals really came from my learnings through learning Clojure. In October 2019, I wrote this 8,500-word blog post, trying to summarize what I learned, and why those concepts are so important to me. Maybe just to paint some context, here's a diagram that shows the cumulative lines of code I've written over the years. The point here is that, in that middle band, I went almost a decade without writing much code in my daily work. The reason was that I thought the value was all in processes, not in the technology that we used, and I've certainly changed my mind since then. The other takeaway here is that the majority of my experience was writing C and C++. I only wrote a couple hundred lines of Java. I'm going to make the case that me learning Clojure is like "Encino Man" learning to drive. "Encino Man" is of course the movie about the caveman, about someone who is frozen in time and wakes up in a completely changed, modern world. Learning Clojure without having any background in Java is something I felt some anguish around, because I got stuck in this cul-de-sac of C, C++, and Perl. I never got to experience the magic of Common Lisp or Java, for that matter. My point here is that, if I can, anyone can.
How Clojure Made a Difference
Let me talk more concretely about how and why Clojure brought the joy of coding back into my life. I had spent 30 years with certain self-sabotaging habits, where I tended to blow up my own code. I actually got to see how dramatic a difference Clojure made in an application that I've helped write or co-write three times. In 2011, we wrote an application called TweetScriber, which allowed one to take notes at a conference and tweet at the same time, which is actually really helpful if you're writing a book. Flynn and Raechel wrote it in Objective-C, which was amazing. It took about 3,000 lines of code. It kept working great until iOS 7, and then it totally broke. In 2015, I rewrote it in TypeScript and React, and it took about one-half the number of lines of code. Then in 2017, I took a stab at rewriting it in ClojureScript and Re-frame, and I was able to do it in, again, one-third the number of lines of code. It was amazing to me that Clojure runs on the JVM, or can be transpiled into JavaScript, and thus can leverage all of the open source ecosystems around npm and Maven.
Ideal 1: Locality and Simplicity
Let's go to that first ideal of locality and simplicity. Simply put, if "The Phoenix Project" had one metric, it was all about the bus factor. In other words, how many people need to be hit by a bus before the project, service, or organization is in grave jeopardy? In "The Phoenix Project," the bus factor was one. It was Brent. If Brent got hit by a bus, no outage could be fixed, and no major complex piece of work could be done. Obviously, we want a bus factor far larger than one. We want to be reliant not on one individual, but on a team, or better yet, a team of teams.
In "The Unicorn Project," the corresponding metric is the lunch factor: to get something meaningful done, something you need to get done, how many people do you need to take out to lunch? Is it the Amazon ideal of the two-pizza team, where every team can be fed with no more than two pizzas and can independently develop, test, and deploy value to the customer? Or, do we need to feed everyone in the building? If you think of large, complex deployments that span scores of teams, you might have to be feeding 100, 200, even 300 people, maybe even for days. Or, if you're trying to implement a feature, and you are now dependent on 43 different other components, then you potentially have to take 43 different people out to lunch. If any of them say no, then you can't actually get done what you need to get done. The reason why this happens is that functionality is now smeared across 43 different teams.
I really learned this lesson from Rich Hickey, who created the Clojure functional programming language. The two talks I would recommend to everyone are his talk at the Strange Loop Conference in 2011, and the JavaOne presentation that he made in 2015 to a sea of Java developers. That notion of the lunch factor really came from that JavaOne talk, where he described when you are coupled to so many different teams that you actually can't get anything done. It crystallized the notion that, in the ideal, anyone can implement what they need to by looking at one file, one module, one namespace, one application, one container, and make all the needed changes there. Not ideal is that you have to understand all the files and change potentially all the modules, all the namespaces, all the applications, all the containers, because, again, functionality is now smeared across that entire surface area.
Ideal is that changes can not only be independently implemented, but also independently tested, isolated from all the other components. That's the notion of composability. Not ideal is that in order for us to get any assurance that our changes will actually work in production, we have to test them in the presence of all the other components. That's what often draws us to these scarce integrated test environments, of which there are never enough. They're never cleaned up, which actually jeopardizes all of the test objectives.
1995: Graduate School Compiler Project
To share the depth of this aha moment, I want to go all the way back to 1995, just to share with you how long I've been making this mistake. It's 1995, at the University of Arizona. I'm taking my graduate, high speed compiler course. We're supposed to build a Modula-2 compiler in C++, which outputted SPARC assembly code, which we would then compile into a SPARC executable. We had to write the lexer, the parser, generate the AST, the IR, and generate assembly code. This started out pretty great, because I had used Lex and Yacc before. There was a specific feeling I had when we were going through the different phases of the project. I kept on thinking, "Here goes nothing. I'm putting my code somewhere where I can no longer access it." It felt like throwing my code into a deep dark well, and just hoping that it worked.
Turns out, it did not work so well. We had test cases that would be run, which essentially would take Modula-2 programs, execute them, and then check the results. Everything worked pretty well until the recursive test cases. It turns out that my program kept blowing up after a certain number of recursive calls. I think what was happening was that I was actually marching through my stack backwards for local variables, but I ran out of time. I couldn't fix the mess I had created. I had to submit my fatally flawed compiler. I'm sure I had one of the lowest scores in the class. Believe it or not, 30 years later, while learning Clojure, I finally realized what I was doing wrong. My mistake was writing my code in such a way that I could not independently run and test each one of the phases. The good way is that you have each one of these phases that you can feed an input into, and independently test. You tokenize the source files, and then you pass the tokens on to build the abstract syntax tree. You generate the intermediate representation. You generate the assembly output. Then you finally output the file. You push the side effects to the edges.
What I was doing 30 years ago was I was taking each one of those phases and tucking it into the previous phase, so now I can no longer test each one of those phases independently. I was breaking the notion of composability, where you could independently run and test each one of those components. I think that's so important. The first ideal is about locality and simplicity.
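A toy version of the "good way" might look like the following Python sketch. It is not the Modula-2 compiler from the story: the tiny language (integers joined by `+`), the stack-machine instructions, and every function name here are assumptions made for illustration. The point is only that each phase is a pure function from input to output, so each can be run and tested on its own, and reading or writing files would sit outside `compile_source`, at the edges.

```python
from typing import List, Union

# Hypothetical AST shape: an int literal, or ("add", left, right).
Ast = Union[int, tuple]


def tokenize(source: str) -> List[str]:
    """Phase 1: split whitespace-separated source text into tokens."""
    return source.split()


def parse(tokens: List[str]) -> Ast:
    """Phase 2: build a left-associative AST for input such as 1 + 2 + 3."""
    ast: Ast = int(tokens[0])
    for i in range(1, len(tokens), 2):
        if tokens[i] != "+":
            raise ValueError(f"unexpected token: {tokens[i]}")
        ast = ("add", ast, int(tokens[i + 1]))
    return ast


def generate(ast: Ast) -> List[str]:
    """Phase 3: emit instructions for an imaginary stack machine."""
    if isinstance(ast, int):
        return [f"push {ast}"]
    _, left, right = ast
    return generate(left) + generate(right) + ["add"]


def compile_source(source: str) -> List[str]:
    """Compose the phases; no phase is hidden inside another."""
    return generate(parse(tokenize(source)))


instructions = compile_source("1 + 2 + 3")
```

Because `parse` accepts a plain list of tokens and `generate` accepts a plain AST, a failing case can be fed directly to the phase under suspicion, rather than being thrown down the well through every earlier phase first.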
Ideal 2: Focus, Flow, and Joy
The second ideal is about focus, flow, and joy. I'd mentioned this great quote by Claude Lévi-Strauss, is it a good tool to think with? Certainly, immutability, pure functions, composability are great tools to think with. I would also put in idempotence up there as well. What excites me so much is that these are no longer in the domain of just programming languages, they are showing up in infrastructure and operations as well. Docker is fundamentally immutable. If you want to make a persistent change to a container, you can't, you actually have to make a new one. Kubernetes takes that same concept, but not just at the system level, but at the systems level. Whenever you see something like Apache Kafka, someone's thinking about how to create an immutable data model. Same with CQRS. Version control is fundamentally immutable, which is why we get yelled at when we rewrite the commit history because we're not allowed to change the past. I think the outcome of using these better tools to think with is that in the ideal, our energy and time is focused on the business problem at hand, and we're having fun. Not ideal is that we're spending all our time trying to solve problems that we don't even want to solve. Things like writing YAML files, or trying to figure out how to escape spaces inside of filenames, inside of Makefiles.
Maybe one of the biggest surprises for me after learning Clojure is that there are all these things that I used to enjoy 10 or 20 years ago, that I now detest. Basically, I hate doing anything outside of my application. I've become one of those developers. I hate connecting anything to anything else, because it always takes me a week. I hate updating dependencies, because when I do, everything breaks. I hate secrets management. I'm the one who always checks secrets into the repo. I hate Bash, YAML, patching. I can't figure out why my Kubernetes deployment files are broken, or know why my cloud costs are so high. I don't mean to diminish any of these things. I think it's really to show how fussy I've become.
This is one of the reasons why I'm such a fan of developer platforms. That everything that we do, whether it's monitoring, deployment, environment creation, security scans, orchestration, we can do through platforms, self-service, and on-demand, with immediacy and fast feedback, as opposed to opening of a ticket and waiting. These are the conditions that allow us to have focus and flow, which I believe create the conditions to actually have joy in our work.
When I talk about flow, I am referring to the amazing work of Dr. Mihaly Csikszentmihalyi. He gave what I think is the best TED talk of all time, "Flow, the secret to happiness." He wrote an amazing book called, "Flow, The Psychology of Optimal Experience." He describes flow as that amazing experience we have when we are so engrossed in the work that we love that we lose track of time, and maybe even sense of self. That transcendental experience we have when we are truly engrossed doing the work that we love, and that we're having fun. He goes on to say that there are really two types of learning. There's procedural learning, and then there's one-shot learning. Procedural learning are the skills we grow over decades that we love and appreciate. Every piece of new learning that we get, we appreciate because we know that's going to help us for decades into the future. On the opposite extreme is one-shot learning, these are the things like trying to figure out how to write YAML files. We did not wake up in the morning trying to figure out how to write correct YAML files.
Rich Hickey had a wonderful phrase for this: the goal is to solve problems, not to solve puzzles. He actually suggested that the entire domain of category theory, a hallmark of many strongly typed functional programming languages, potentially falls into this. For me, puzzles are trying to figure out how to escape spaces in filenames, or write fast SQL statements. I've actually gotten in the habit of taking screenshots of my browser search history, just to remind myself of all the things where I spend hours trying to solve something that's very distant from the business problem I'm trying to solve, in this case, Java date and time instances. I'm not saying that time is not important. In fact, time is absolutely critical. Trust me, I did not wake up that morning saying, I'm going to learn about time zones.
Summary
In "The Unicorn Project," my goal was to really highlight those concepts that are required to get us from here to there. Of the five ideals, locality and simplicity, and focus, flow, and joy, I think are so important. I learned that through the journey of learning Clojure and functional programming languages. As a result, I am so grateful to Rich Hickey and the entire Clojure community for helping me learn what I needed to learn in order to write this book. It gives me such delight and joy that "The Unicorn Project" actually hit number two on "The Wall Street Journal" bestseller list in the hardcover business category. Clojure actually figures very prominently in the book. It delights me to no end that business leaders, whether they want to or not, are learning about functional programming, as they try to grapple with what digital disruption is and isn't, and how to get there.
Conclusion
The intended audience for this talk was anyone who is remotely interested in functional programming, and anyone who used to love coding, but found that the joy of programming has faded over time. If I could wave a magic wand, anyone who hears this talk would be thinking two things. One is, "If Gene can do it, anyone can do it. This will be even more fun than I ever thought." Two is, "I have actually felt less joy programming, and I'm now motivated to learn either Clojure, or for that matter, any functional programming language."
Resources
For people interested in this topic, don't miss the closing keynote. I was able to chair a panel with Mike Nygard, SVP of Architecture at Sabre, and Carin Meier, Data Engineer at Reify Health. These are two people whose achievements I admire so much, and I get to learn from them. We talked about the factors that led to their best peak experiences coding and their worst experiences coding, and they really shared lessons learned in terms of how you create an architecture so that every developer can be productive and have focus, flow, and joy. For anyone who would like a copy of this presentation, as well as the list of links that I mentioned, and a link to all the excerpts of basically everything I've written, just send an email to realgenekim@sendyourslides.com, with the subject line of Clojure, and you will get an automated response within a minute or two.