Tidy First? Kent Beck on Refactoring


Summary

Kent Beck discusses dealing with refactoring.

Bio

Kent Beck is a programmer, creator of Extreme Programming, pioneer of software patterns, JUnit, the rediscovery of Test-Driven Development, 3X: Explore/Expand/Extract, & the Tidy First? series of books on software design. Beck is also alphabetically the first signatory of the Agile Manifesto.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.

Transcript

Beck: Some years ago, I took a look back at my career at that point, and tried to make sense of all these disparate weird things that I'd done, whether it was patterns, or JUnit, or TDD, or XP, like what does all of that have in common? It took me a while and I finally came up with a personal mission statement. I said, what I do is help geeks feel safe in the world. That's about the smallest statement I can make that encompasses all of the things that I've done in software development. What I'm talking about is a particular aspect of helping geeks feel safe in the world. It's a question that comes up all the time for us. I'm working with some crummy code, how much time should I spend cleaning it up versus just working with it? This is a fraught question. It's a complicated question. There's no stock answer that makes much sense. There are people who will tell you things that don't make sense about how you ought to approach that question. It's a source of a lot of conflict. A lot of conflict between people who are making software design decisions, between business people and software people. It's a source of that feeling of lack of safety. That's why I've been working on this particular project, which is around software design.

Updating Structured Design

In 2005, I was invited to be on a panel to celebrate the 25th anniversary of the publishing of this book, "Structured Design," by Ed Yourdon and Larry Constantine. This is actually my college textbook, from way back then. I had this as a college text. Twenty-some years later, I was invited to be on a panel with Ed, and Larry, and a few other people to talk about the impact this book had had. I thought, it's about time I read the book. I started reading it. The more I read, the more excited I got. At that point, I'd developed a lot of software. I was talking a lot about software development. In this book are Newton's laws of motion for software development. These are the basic forces that shape how we develop software, and nobody was talking about it.

We're getting ready for the panel. I read the book cover to cover, really excited, even the parts about the debate between assembly language and these newfangled higher-level languages, and what are the tradeoffs between them, and designs using paper tape. I said, I would like to update this material. This is in 2005, 17 years ago. I did the math. I actually did it twice. The first time, it was only seven years ago. I realized that couldn't possibly be correct. I did it again and realized I had been working on this project for 17 years. At the conference, I was able to have breakfast with Ed, who's now passed, and Larry, and just had a delightful time because they'd been classmates at MIT. They'd done a whole bunch of work together. I have it inscribed by these two jokers. The first one says, "Don't believe anything you read in this book, Ed Yourdon." The second one says, "...including the above, Larry Constantine."

I had the laws of motion of software development, as described here. The two key concepts are coupling and cohesion. This is the book that introduced those words. What I'd noticed, even back then, was that those words had drifted very far from their original meanings. They'd come to mean lots of different things, but "Structured Design" gives a very precise definition, which I will go through with you, about what coupling means and why it's important. I thought, I'm going to update this material. If you go back, you'll see that at some hotel in San Francisco I gave a talk called Responsive Design. That was my first attempt to get this across. I had some stuff right, but I had some stuff badly wrong. In preparation for today's talk, I was talking with my oldest who's now a staff engineer in software at One Medical. I said, I've spent 17 years figuring out how to explain cohesion. That's really the sticking point in all of this. That concept, the definition is very clear, but trying to explain it in a way that people can take it on board and understand it and make good use of it. That part I just went over again. The only way to learn how to explain something well, is to explain it badly, many times. It means that some of my friends when I sit down and offer to buy them a beer, just leave, because they know it's going to be another one of those bad explanations. Eventually, I get enough feedback. Now, finally, I feel like I'm ready to explain this, at least to a greater degree than I was before.

Three-and-a-half years ago, I decided, ok, I'm finally ready to write the book. I sat down to write the book. The first sentence that I typed was, software design is an exercise in human relationships. Exactly. I said, what does that mean? Why did it come out of my fingers? How am I going to explain this? This is one of the beauties of writing for me is I say things that I don't know that I think. If you can stop pressing tweet right after you do that, that's a really good skill to learn. I'd written that and I'm thinking, what in the world does that mean? I believe it. It feels true in my gut, but really, what all does that mean? What I'm going to do, I'm just finishing the first draft of the first third of the book. The secret about book writing, you're going to think, this topic is way too small to fit into a book. Then you're going to start writing it, and then you're going to realize, no, this topic is way too big to fit in a book. You're going to cut it down. You're going to think, that topic's way too small to fit in a book. Then you're going to write some more and you realize, no, that's way too big to fit in a book, and you'll cut it. The third time, you have about a book's worth of stuff. I've gone through that process, and I've drafted the last chapter of the first book.

Overview of Software Design

The topic of software design for me, as an exercise in relationships, divides nicely into three stages. What I'm going to do is give you an overview. If you want to know more about the specific stuff I've been writing, look for tidyfirst.substack.com. I'm doing an experiment in geek incentives. I thought, if I did this as a paid newsletter, then I would have the motivation to keep writing. Because what slows me down as a writer is not how slow I write, it's how often I don't write at all. It turns out that people giving me $7 a month is a social obligation enough that I'm going to keep typing. I have the first book about drafted. The book is called "Tidy First?" for reasons that I'm about to explain. I'm going to give you an overview. Then if you want to dive in more, that book is there. I'm going to continue writing the two subsequent books in the same place, so if you're interested you can go there.

Software design, what are we talking about? There's a basic loop in software development. Somebody has some idea, ok, the software does some stuff, and we want it to do some new stuff. We want to add a widget. Somebody has the idea, and now somebody has to change the behavior of the program. It used to calculate one thing. Now there's a new option for this, and so we calculate some new stuff. This is conveniently a loop. We got an idea. We change the behavior of the program, and that gives us more ideas for what else we could do with the program, and all is well. Software is this magic stuff because it scales so well, unlike any other commercial endeavor. It scales to more of the people on the planet, to more of the planet, which brings with it obligations. It also brings with it scale, which is one of the exciting things about software to me, is I can take the output of my brain and spread the effects of that over wider areas.

We got this basic loop where we got ideas. We change the behavior of the system, that gives us more ideas. We change the behavior of the system, and away we go. Everybody is happy. Except, we all know that underneath here, if we just keep doing this loop, idea to behavior to idea to behavior, we're going to start going slower, and bugs are going to come up, and programmers will get frustrated and leave. The new programmers are going to be even slower than the old programmers were. The structure of this system has a profound effect on how quickly we can go through the loop up above. As developers, sometimes we think, here's an idea, but before I just change the behavior, I'm going to change the structure first. Then I'm going to change the behavior and it'll be so much easier after I've improved the structure that I win already.

That is exactly the tidy first workflow, where you say, I have to change some messy code. I'm going to tidy it first. Then I'm going to change it. If I add up the structure changes and the behavior changes after the structure changes, does that take less time than just going and changing the messy code and leaving things even worse for the future? It's a dilemma. Sometimes it is. Sometimes it isn't. Sometimes you change the behavior, and then, "This is so ugly. Now if only it was designed like this, it would be easy." We've got this loop going on. One of the really magical things about software is sometimes the structure itself generates ideas for how to change the behavior. Now that it's easy to add a new one of these things, why don't we just add a bunch of them. You start doing the things that are easy to implement, because you've made them easy to implement.

The first lesson for me in software design is this hard split between behavior changes and structure changes. I want to make a very clear distinction between those two. Which hat am I wearing? I'm always wearing one hat or the other, just because of the sunscreen and stuff. Which of those hats am I wearing? If I'm ever wearing them both at once, I'm making a mistake. I'm going to try and make this presentation practical as much as I can. If you wanted to act on this insight, that we should split behavior and structure changes, start making your commits one or the other, but not both. Just try that for a week and see how it goes. Try it out maybe with your team, see how that goes. For lots of interesting reasons, structure changes can be treated much more lightly than behavior changes, because structure changes tend to be reversible. You extract some function, you inline the function. If you change the numbers you report to the IRS, changing them back is a little squeegee.
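As a minimal sketch of that split, here is what a structure-only commit followed by a behavior-only commit might look like. The invoice functions and amounts are hypothetical, made up for illustration rather than taken from the talk; amounts are integer cents so the comparison is exact.

```python
# Hypothetical invoice example; all names are illustrative assumptions.

def total_before(prices, tax_percent):
    """Original, slightly messy version: the subtotal is computed twice inline."""
    return sum(prices) + sum(prices) * tax_percent // 100


def subtotal(prices):
    """Helper extracted in a structure-only commit."""
    return sum(prices)


def total_tidied(prices, tax_percent):
    """Same behavior as total_before, with the subtotal extracted."""
    base = subtotal(prices)
    return base + base * tax_percent // 100


def total_with_discount(prices, tax_percent, discount):
    """Behavior-only commit, made after the tidying: a new discount option."""
    base = subtotal(prices) - discount
    return base + base * tax_percent // 100


if __name__ == "__main__":
    prices = [1000, 2000, 3000]  # cents
    # Structure change: outputs are identical, and inlining subtotal() undoes it.
    assert total_before(prices, 10) == total_tidied(prices, 10)
    # Behavior change: outputs differ, so this commit deserves closer review.
    print(total_tidied(prices, 10), total_with_discount(prices, 10, 1000))
```

The first commit can be checked by confirming that nothing observable changed; only the second one needs the scrutiny a behavior change deserves.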

Waiters and Changers

The loop is more complicated, and it becomes even more complicated. Here's where I'm going to finally get to this, software design is an exercise in human relationships. Because we got two kinds of people, I call them waiters and changers. I have the waiters up here. The waiters maybe had the idea for what the software does next, but they can't do anything to change it. They have to wait. They're patiently, impatiently, a little foot tapping. They're waiting for the software to change. There's a different kind of people in this picture called the changers. The changers, these are the people mostly sitting here, where we know what to do to go in and change the behavior of the software. Already, we've got conflict, because you have waiters who want to see that next behavior change, they want the next feature. You got the changers, though, who know that if we just leave the structure to deteriorate, things are going to get worse. It's going to be less fun to work on, and more bugs, and more annoying. We're going to get further behind. We have a misalignment of incentives. Changers want to invest in the structure. Waiters, in the short term, don't even see the structure. It doesn't affect their daily life. They just want that feature as quickly as possible. We come to the first relationship. This waiter-changer relationship is fraught because of the divergence of the incentives, and the different vocabulary, different value systems, different wardrobes. Although I'm working on that. The third book in the series is going to be about using software design to encourage positive waiter-changer relationships.

We can zoom in now. There's another set of relationships that software design either contributes to or inhibits. That's the relationships between the changers. If we have a bunch of changers, and they're all related to each other, and somebody is producing an API that somebody else is consuming, and they want to change the API, or they want to change the implementation in some way that would cause a change for somebody else. Now, all of a sudden, you've got, again, a divergence in incentives, where one person's best interest is to make the change, and another person's best interest is for the change not to be made, or not yet, or not in exactly that way. The second book is about exactly this set of relationships. We all have a greater alignment of value systems, and vocabulary, and clothing choices. Yet, there is divergence of incentives. Oftentimes, the things that hang teams up, it's not, this was refactored in this way, and so technically, it doesn't work anymore. It's more, things were refactored in this way by this person who was acting like a jerk. Then you have real problems. Which is why I say software design is an exercise in human relationships. I realized, as soon as I typed that, that people were going to freak out. People who don't have a lot of confidence in their abilities in human relationships. This is just what it is. Software design has a critical role to play in changer-changer relationships. Software design skills applied in a certain way have a critical role to play to keep these relationships strong, to mend these relationships when they've been frayed, and to keep everybody moving forward.

Tidy First

I told you I started out with this big topic and thought it was too small, and then I chopped it, and then I chopped it some more. Where I finally got to, this book, "Tidy First?", is really focused on individual programmers, or pairs, or a mob. It's all the same thing. The question that comes up 20 times a day for everybody who is touching code is, this code is messy. Changing it is going to be harder than it needs to be. Should I tidy first? That's the basic question, comes up over and over again. You're going to get some dogmatic answers, "Of course, you always tidy first." Because that's a simple answer I don't have to think about anymore. I think probably that's the explanation of it. Or you also get the, should I tidy first? Absolutely not. Tidying is a waste of time. Why would you do that? We have waiters screaming for the next feature, get it done as quickly as possible. Don't bother wiping the knife between cutting meat because salmonella happens outside of the restaurant. It's kind of like that, but that's that dogmatic answer. What I discovered when I looked at this tiny little grain of sand question, should I tidy first? The answer of course is it depends. What it depends on is more or less all of software design. All of the factors that play into software design at the largest scale, also play into this question of, "I got this code, it's a little bit messy. I have to change it. Should I tidy first?" That's what this first book is about.

We have this question, we have some messy code, should we tidy it first? It depends. What does it depend on? We've got these dogmatic answers: always and never. They don't make any sense. They don't make any sense for particular economic reasons. I was not somebody who naturally understood money. I have friends who are traders, and they have a real gut feel for money. It took me a long time to get to that same place where I'm comfortable with money. There's some of the effects of money, like the laws, Newton's laws of motion for money. They really do exist. I didn't know them. Once I learned them, then I took a different look at software development. I'm going to talk about two of the whys of money that affect software development, and these two conflict. If somebody comes and says something like, here's something, may you live long enough that people start doing the dumb things you did when you were young, unironically. Because that's what's happening to me now.

The good news is, I don't have to invent any new topics. I can just go into XP Explained, open up a random chapter. Give a talk based on that chapter, and people will go, "How'd you learn that?" It's really the same thing. This, I hear people now explaining to me patiently, "We have to make all these design decisions. If we just made them at the beginning of the project, all the rest of the project would go much more smoothly." It is everything I can do not to just go full, get off my lawn, grumpy old man. I used to hate it when, as a young engineer, some old engineer would go, I've been doing this since before you were born, kid. It feels so good to say that. Waterfall is back. Controlling time, scope, cost, and quality is back. Just the whole thing. You're already smarter than most of the people out there, just guaranteed by not being pulled in by these discussions. Comprehensive documentation, just watch me go into orbit.

Design Upfront

My point was about design upfront. There are perfectly good economic reasons why design upfront is a bad idea. I had to learn about discounted cash flows, and really internalize that knowledge, and then design upfront just makes no sense, economically. I'm not saying what we do all boils down to economics. We are engaged in a profoundly human activity that pushes us to the limits of our abilities to relate to other human beings. If it doesn't make economic sense, it doesn't matter how well you get along. That's why, for me, the foundation of this is, how can we make software projects that make better economic sense? We got the time value of money. What that means is, if you tell me about how much we're going to spend, or how much we're going to make, I have to ask you when. Here's why. If we have time going along this axis, and we spend some money here in order to make some money there, the magnitude of these cash flows, we can't evaluate them just by comparing the sizes. Because, this money that we spend, looked at from today, it's actually a little bit smaller. Future money gets smaller when looked at from the present. If we look at this revenue from the future, it's going to have longer to decay. Yesterday, I blew my mind, I realized, discounted cash flows is just half-life as applied to money. You got some money in the future, and then you want to look at it today, it just gets smaller as it gets closer.
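To make the half-life comparison concrete, here is a small sketch of the standard present-value formula, PV = amount / (1 + r)^t. The 10% annual discount rate is an assumption chosen for illustration; the talk does not name a rate.

```python
import math


def present_value(amount, years_from_now, annual_rate):
    """Discount a future cash flow back to today at a fixed annual rate."""
    return amount / (1 + annual_rate) ** years_from_now


def half_life_years(annual_rate):
    """Years until a future cash flow is worth half as much when viewed from today."""
    return math.log(2) / math.log(1 + annual_rate)


if __name__ == "__main__":
    rate = 0.10  # assumed discount rate, for illustration only
    for years in (0, 1, 5, 10):
        print(years, round(present_value(100, years, rate), 2))
    print("half-life in years:", round(half_life_years(rate), 2))
```

The further out a cash flow sits, the smaller it looks from today, which is why asking "when" matters as much as asking "how much".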

The problem with a design upfront project, is you're spending all this money now, which doesn't get discounted very much, in order to make a bunch of money at some distant future date. That's going to be discounted, much more substantially. It can look like it's a really profitable project, and turn out to actually be a disaster. For example, if we have a project, this is the upfront project, where we're going to design, design, design, and then, everything's going to be fantastic and we're going to make a whole lot of money. If we can transform this project into this project, we've made progress. I came up with a great phrase as I was preparing for this: spending less and spending later are exactly equivalent. In this second version of the project here, because we moved our spending out into the future, we're already making more money. It's not just about the magnitude, it's about the timing of the expenses and the revenue. The problem with design upfront isn't that we're going to make a bunch of decisions on speculation, which are going to turn out to be wrong. Then we have to carry the burden of all these decisions along with this, or we have to get rid of them and remake them over again. All that's true, and not the point. The point is, we're spending too much money too soon. If we can defer some of those expenses until later, we've created economic value. The purpose of the style of software design that I'm advocating here, is to make money sooner and with greater certainty, and to spend money later, and with less certainty. It's not just about, we're going to make more. Because the absolute magnitudes aren't so important as the timing of the software design decisions that we make.
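A rough way to see the effect of timing is to compare two projects with identical undiscounted totals. The cash flows and the 10% rate below are invented for illustration; they are not figures from the talk.

```python
def net_present_value(cash_flows, annual_rate):
    """Sum of (year, amount) cash flows, each discounted back to year 0."""
    return sum(amount / (1 + annual_rate) ** year for year, amount in cash_flows)


# Hypothetical projects: both spend 300 in total and earn 500 in year 3.
UPFRONT = [(0, -300), (3, 500)]                       # all design spending now
DEFERRED = [(0, -100), (1, -100), (2, -100), (3, 500)]  # same spending, spread out

if __name__ == "__main__":
    rate = 0.10  # assumed discount rate
    print("upfront: ", round(net_present_value(UPFRONT, rate), 2))
    print("deferred:", round(net_present_value(DEFERRED, rate), 2))
```

Under these assumptions the deferred project comes out ahead even though neither the total spend nor the revenue changed; only the timing of the spending did.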

Optionality

Time value of money tells you, spend money later, earn money sooner. There's another force at work though because we work in an area with great uncertainty, and we don't know what we're going to ask our software to do next. If we can increase the optionality of the software, then we have also created economic value. If I have software and it can go this direction, this direction, or this direction, and this one makes a lot of money, and these two only make a little bit of money. When I get to this decision point here, I'm going to say, now I can see I'll make more money here. If I don't have this option to go in this direction, this software is worth a whole lot less. The counterbalancing force to discounted cash flows is optionality. These two come into conflict, because the optionality we create today, we haven't exercised it. We've spent money. If only we had pulled the software apart this way, so it's easy to replace this thing with other things, that's optionality. We're spending money today to make more money later. Discounted cash flows pulls us in one direction. Optionality pulls us in a different direction. We haven't even gotten to coupling yet, no wonder this is hard.
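The three-directions picture can be sketched as a tiny expected-value calculation. The payoffs, the equal probabilities, and the direction names A, B, and C are all assumptions made up for this illustration.

```python
from fractions import Fraction

# Hypothetical scenarios: (probability, payoff per direction). In each one a
# different direction turns out to be the one that makes a lot of money.
THIRD = Fraction(1, 3)
SCENARIOS = [
    (THIRD, {"A": 100, "B": 10, "C": 10}),
    (THIRD, {"A": 10, "B": 100, "C": 10}),
    (THIRD, {"A": 10, "B": 10, "C": 100}),
]


def value_committed(scenarios, direction):
    """Expected payoff when the design only allows one direction."""
    return sum(p * payoffs[direction] for p, payoffs in scenarios)


def value_with_options(scenarios, directions):
    """Expected payoff when we can pick the best direction at the decision point."""
    return sum(p * max(payoffs[d] for d in directions) for p, payoffs in scenarios)


if __name__ == "__main__":
    committed = value_committed(SCENARIOS, "A")
    flexible = value_with_options(SCENARIOS, ["A", "B", "C"])
    print("committed to A:", committed)
    print("all options open:", flexible)
    print("most we should pay today for the options:", flexible - committed)
```

The last line is where the conflict with discounted cash flows shows up: the options are only worth buying if creating them today costs less than the value they add later.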

Coupling

Let me talk about coupling. Software systems are constructed out of elements. I just say elements generically, because it doesn't matter to me. My oldest, when they were learning how to program, came to me after a while and said, architecture, design, coding, isn't it all design? It is. When I talk about design, I just talk about elements. The elements can be itsy-witsy, like expressions in a statement, or they can be gigantic, like services in some mesh. You have elements, and the elements are related to each other. A thing that happens frequently in software, is I go to change this element, and I realize, "If I change this, I have to change this and this too." What just happened? What we thought was going to be a cheap change just became more expensive. If that's as far as it went, it'd be annoying, but it wouldn't be disastrous. It gets disastrous because those ripples continue to flow further out. This is the observation made in the "Structured Design" book, that there were certain systems which were cheap to change, this is the early days of IBM, and there were other systems that were really expensive to change. The difference between the two was that in the expensive systems, the elements of the system transmitted change to each other. They were coupled.

The definition of coupling, if I have two elements, E1 and E2, and I have a specific change that I want to make, some delta, this is defined as if I change element one, that implies I have to change element two also. That's the definition of coupling. Colloquially, people will use the word coupling to mean all kinds of different things. "This service calls that other service so they're coupled." We can talk about that relationship and it's a different one. This is a very specific one. Coupling is a very specific relationship that says, if I change this element, I have to change that element too. If I change the name of this function, I have to go to all the callers and change them because they're coupled with respect to changes to the name of the function. They aren't coupled with respect to the formatting of it. If I go put a comment in the middle of the called function, nobody cares. They're not coupled with respect to that change. They are coupled with respect to changes to the name. Why does this matter? It's because of these ripples, and you get these jackpot effects. I'm a huge power law nerd. I love finding power law distributions. It's a jackpot situation. If you go and you make changes to the behavior of the system that seem about the same size, and you look at the distribution of how much work each of them is going to be. Most of them will be about the same size, and half of them will be twice as much work, and half of those will be twice as much work. Then way over here is the one that caused the CTO to quit. We live in a natural world in software, a natural world we're not particularly aware of till now. That's what's going on.
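One way to sketch "coupled with respect to a particular change" is a table keyed by element and kind of change, with a search that follows the ripples outward. The element names, change kinds, and table contents below are hypothetical, chosen only to illustrate the definition.

```python
from collections import deque

# Hypothetical coupling table: (element, change) -> [(other element, induced change)].
# A rename forces edits at the call sites and stops there. Adding a required
# parameter can force a caller to add a parameter too, so that ripple keeps going.
COUPLING = {
    ("parse_order", "rename"): [
        ("checkout", "edit_call_site"),
        ("import_orders", "edit_call_site"),
    ],
    ("parse_order", "add_parameter"): [
        ("checkout", "add_parameter"),
        ("import_orders", "edit_call_site"),
    ],
    ("checkout", "add_parameter"): [("web_handler", "edit_call_site")],
    # No entry for ("parse_order", "add_comment"): nothing is coupled to that change.
}


def ripple(element, change, coupling):
    """Return every other element that must change, following ripples transitively."""
    start = (element, change)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for induced in coupling.get(current, []):
            if induced not in seen:
                seen.add(induced)
                queue.append(induced)
    return sorted({name for name, _ in seen if name != element})


if __name__ == "__main__":
    for change in ("add_comment", "rename", "add_parameter"):
        print(change, "->", ripple("parse_order", change, COUPLING))
```

The same pair of elements is coupled for one change and not for another, which is why the definition always names the delta along with the two elements.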

I call what follows Constantine's equivalence. What Ed and Larry observed in "Structured Design" is that the cost of software approximately equals the cost of change. That is, we don't really care about the "initial development" of the software, because if it's at all successful, it's going to live for a very long time. We're going to spend almost all the money changing the software, not on that initial tiny slice of an initial development. We can refine this further and say, the cost of change is approximately equal to the cost of the big changes. This is this power law, long tail distribution, where if we add up the handful of really expensive changes down here on the long tail of it, and we compare the cost of those together with the cost of all the rest of the cheap changes, most of the cost is going to be in these big jackpot changes. The cost of those big changes is really the cost of the coupling of the system.
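A deliberately simplified model of that long tail: assume each tier of changes costs twice as much as the previous one and contains half as many changes. The tier count and the threshold for what counts as a "big" change are assumptions for illustration, not measurements from any real system.

```python
def tiered_changes(tiers):
    """Build (cost, count) pairs: each tier doubles the cost and halves the count."""
    return [(2 ** i, 2 ** (tiers - 1 - i)) for i in range(tiers)]


def share_of_big_changes(changes, threshold):
    """Return (fraction of changes, fraction of total cost) with cost >= threshold."""
    total_count = sum(count for _, count in changes)
    total_cost = sum(cost * count for cost, count in changes)
    big_count = sum(count for cost, count in changes if cost >= threshold)
    big_cost = sum(cost * count for cost, count in changes if cost >= threshold)
    return big_count / total_count, big_cost / total_cost


if __name__ == "__main__":
    changes = tiered_changes(11)  # costs 1, 2, 4, ..., 1024
    count_share, cost_share = share_of_big_changes(changes, threshold=8)
    print(f"{count_share:.1%} of changes carry {cost_share:.1%} of the cost")
```

In this toy model the changes costing at least eight times the baseline are roughly an eighth of all changes yet account for close to three quarters of the total cost, which is the shape of the argument that the cost of change is approximately the cost of the big changes.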

We can say that the cost of the system approximately equals the coupling of the system. Plus, we're software designers. You can decouple stuff, but it costs money. The cost of the system is approximately equal to the coupling in the system plus the cost of the decoupling that we do. Now we're in a tradeoff space, where we can say how much this coupling costs us, how much this decoupling costs us. Can we get into some sweet spot between the two? That's the primary message of this work for me, is that we all have to make this tradeoff between coupling and decoupling. If we just ignore decoupling, if we ignore software design, we're going to have more of these jackpot changes. The problem with jackpot changes is they destroy trust between the waiters and the changers, because the waiters don't want to wait. They say, "You've added 14 widgets, each one of them has taken a week, and this widget is taking 8 months to add, like, are you idiots?" No, we're just working in a natural system. If we say, time out, we're going to refactor everything. Now the waiters are like, ok, and then what do I get at the end of all this refactoring? You say, exactly the system you have today. That's not a relationship positive move.
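Written out as a chain, the approximations from this part of the talk look roughly like this. The notation is an informal shorthand assumed here, not a formula quoted from "Structured Design".

```latex
\begin{align*}
\text{cost}(\text{software})    &\approx \text{cost}(\text{change}) \\
\text{cost}(\text{change})      &\approx \text{cost}(\text{big changes}) \\
\text{cost}(\text{big changes}) &\approx \text{coupling} \\
\text{cost}(\text{system})      &\approx \text{coupling} + \text{decoupling}
\end{align*}
```

The last line is the tradeoff space: less decoupling work leaves more coupling cost in place, and more decoupling work buys that cost down at a price of its own.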

Conclusion

To bring it back, full circle, we all have the option of doing software design in ways that enhance our relationships with ourselves, with our immediate peers, and with people with different perspectives. If there's one message I would invite you to take away, it's that you can always make big changes in small, safe steps. If there's one skill to master, it is that. The more you can make your big changes in small, safe steps, the less interruption you have in your relationships with other people. The more it looks like those features just come out, even though you know that under the water, you're continuing to evolve the structure as you go along.

 

Recorded at:

May 26, 2023
