Bio: Joshua Kerievsky has been programming professionally since 1987 and is the founder of Industrial Logic, a company specializing in Extreme Programming (XP). Kerievsky is the thought leader behind Industrial XP, a state-of-the-art synthesis of XP and Agile Project Management. He has been an active member of the XP and patterns communities and is the author of Refactoring to Patterns.
Lean Software and Systems Conference 2010 — the place to learn about Lean, Pull Systems and Kanban. Understand how established industrial engineering theory can apply to the software development process. The conference will assist organizations that depend on software — from start-ups to those that build complex, software-intensive products, systems and services — with the application of Lean Thinking throughout the enterprise.
1. I’m here with Joshua Kerievsky, the founder of Industrial Logic. The title of your talk here at the Lean Software and Systems Conference is The Limited Red Society. Could you tell us a little bit about what that’s all about?
The Limited Red Society is kind of a play on words. There is the Limited WIP (Work-In-Progress) Society that some folks put together last year. Limiting Work-In-Progress is what the Kanban folks, and the folks at this conference, the Lean Software and Systems Conference, are very focused on. The thing is, they’ve been focused at the planning level. Everyone who’s talking about limiting WIP is looking at limiting the number of story cards you have in progress ("in-flight"). I was looking at it and saying, "That’s wonderful, and we’re doing that in the software world too, in programming, and it’s critical." Because there are two sides: there’s planning and there’s the tactical stuff. You’ve got to get both right if you really want to be flowing value to your customer regularly. We say, "You’ve got to limit the amount of work you’ve started and have in progress, and you’ve got to limit the amount of time you’re in the red." If your system is up on blocks, as we say -- if it’s not compiling, the build is broken, you’re working on a defect and the code is all over the place, it’s a mess. Or we’re out in several branches that have to be merged before anything can really be shipped, or "Forget it, we can’t ship it!" -- all of those states are where you’re in the red. You can’t ship. We say you’ve got to limit that period of time. From a visual point of view, for a lot of us who are using tests to help us know if the system is ready to ship, if there are any red tests, we’re in the red. We’ve gone further; we’ve said, "Let’s look at compilation errors." So we’re tracking those in various environments, and if you have compilation errors you’re in the pink. Pink is a form of red for us. When we graph these things, we’re starting to make programming performances visual, so that you can do your programming and then say, "OK, let me save that session, upload it and see a graph."
The graph could contain red segments or pink segments, both of which we consider to be in the red. Pink is a shade of red. This is a way to help us get better at learning how to stay in the green. It doesn’t mean getting rid of red. Obviously, when you do Test-Driven Development or BDD (Behavior-Driven Development), there is a point in time when you want to be in the red. However, you want that to last a limited amount of time. You don’t want to create seven or eight failing specifications or failing tests and then go make them all pass over three or four hours. That’s just not the way we do it. Let’s write a little one that fails, let’s get it to pass, let’s refactor and keep that process going. If you look at the performance -- let’s say you recorded two hours of your programming -- you could actually ask yourself, "What percentage of time was I in the red?" Right now what we’re doing is calculating statistics on that. We think at Industrial Logic that we need better numbers on programmers. If you talk about a baseball player -- and this is something I was inspired by from reading the book Moneyball -- they have all kinds of statistics on them: batting average, how often they get on base and all kinds of elaborate data about a baseball player. When we look at programmers, what kind of data do we have? We really don't have a whole lot to distinguish a great programmer from a mediocre programmer. We think that we need to start collecting stuff like that, so that programmers themselves can get better. They can say, "I don't want to do that, because that would put us in the red for a long time and my average will go down. It's just not the right way to do it. There is a better way for us to refactor that code or a better way for us to proceed with that part of the design." Visualizing the work at the programming level, in terms of trying to find ways to limit the time in the red, we think is very important.
2. There are two things in what you just said. The first is the implication, which is very interesting, that the ideas of limiting Work-In-Progress translate pretty closely to the developer level: a developer having code in a state where it can't compile or can't pass the tests maps one-to-one to Work-In-Progress. Secondly, you're talking about the ability for the programming community itself to have an objective mechanism for developers to assess themselves in a programmatic sense.
Right. It's a challenge, because any metrics we come up with will either be gameable -- people will be able to game them -- or they will only show one dimension. You could have fantastic process metrics. You could say, "This person limits their time in the red beautifully. They're only in the red 10% of the time on average. After 300 hours of recording, we see their average time in the red is very, very low. Fantastic! But how is their code? How well designed is it? Does it make the customers happy? Is it riddled with defects?" There is more than one dimension here. However, we don't think we've been looking at any of these dimensions very closely at all. In Moneyball, in the old days, the scouts who went out scouting for new players did their work largely based on appearances. You know, "This guy looks like a great athlete. This other guy is tall and skinny; he doesn't look like a professional baseball player, so I'm not going to really bother with him. But this other dude looks really athletic." Statistically, the tall skinny dude could be a way better ball player. That's what Moneyball was saying, and it changed the game of baseball fundamentally. It was a painful transition for those scouts because they were used to doing it their way. We've got to get better at this, because it should inspire programmers to have great metrics on themselves and ask, "How could I get better?" Versus: "It's all just invisible. I have no idea who's better than whom. I've got no place to go besides certifications -- 'Oh look, he has a certificate, he doesn't.'" That is a rather lame way to distinguish talent.
3. There has been a heightened degree of interest lately in the community in finding ways to take assessment down to the developer level. Do you see the capability you're describing as fitting into that space and maybe solving some problems there?
Yes. We are not interested in certification at all at Industrial Logic; we never have been. That's not because we're purists or something like that. It's just because we don't see good stuff coming out of certificate programs. You go to a two-day course, you get a certificate. It may be an excellent course, but going around and putting on your email signature that you're a certified "blah" after taking a two-day course doesn't say all that much. We'd rather have meaningful metrics that are accumulated over time and give a good picture of your capabilities. That's hard to do, but ultimately we think it's a better way to go. What we're doing in The Limited Red Society is trying to foster that kind of approach to long-range assessment of skill, not certification.
5. Would you agree that you've in some ways already used somewhat quantitative metrics, for lack of a better term, to influence even your personal adoption of things like improving Test-Driven Development, refactoring and baby steps? Test-Driven Development is in essence a mechanism to limit a programming session's Work-In-Progress.
Absolutely. Somehow it keeps you honest when you are recording some primary data to see how good you are at this process stuff. Again, I don't want to oversell and overhype it; there is also the question of how good the code is. That's a whole different area, and it's something we are interested in. The other day we ran a little tool to detect dead code. I was working with Naresh Jain in India on an exercise we created for developers. It happened to have a piece of dead code in it that we put there deliberately, to see if they would find it and get rid of it. The tool found that dead code plus another piece of code that was dead that we weren't even aware of. That's an example of where tools can see things that we sometimes can't see with the human eye or the tools we're using. Ultimately, we'd like to combine some of the work we're doing on process metrics with design metrics, to see how well it is all unfolding.
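The interview doesn't name the dead-code tool Kerievsky and Naresh Jain used. As an illustration of the idea only, here is a deliberately naive sketch using Python's `ast` module that flags functions defined in a source file but never referenced anywhere else in it (real tools handle imports, methods, and dynamic dispatch far more carefully):

```python
import ast

def unused_functions(source):
    """Return names of top-level functions that are defined but never referenced.

    Naive: any ast.Name reference (a call or otherwise) counts as a use,
    so this can only find obviously dead module-level functions.
    """
    tree = ast.parse(source)
    defined = {node.name for node in ast.walk(tree)
               if isinstance(node, ast.FunctionDef)}
    referenced = {node.id for node in ast.walk(tree)
                  if isinstance(node, ast.Name)}
    return defined - referenced

code = """
def used():
    return 1

def dead():
    return 2

print(used())
"""
print(unused_functions(code))  # -> {'dead'}
```

Even a crude detector like this demonstrates the point made in the interview: a tool scans the whole program mechanically and can surface dead code that a human reading the same file would walk right past.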
6. You’ve spoken in the past about the way you operate in IL [Industrial Logic] and some people might correlate some of your practices to what David Anderson and company are talking about with Kanban. Could you talk a little bit about that?
We've been doing the Agile, Extreme Programming, and Lean stuff for over 10 years. We're a small company; we have a limited number of people who work on things. So we've been pretty good at getting rid of parts of the process that are not adding a lot of value. That's an easy thing to say, but it's harder to do in practice, because you really have to say, "We've been doing this thing for five years. Is it really paying for itself? Is it really helping us? Well, let's try an experiment! Let's not use it for a little while and see if the process breaks or the process gets better." We've been doing that routinely, constantly questioning what we're doing in terms of process. We've gradually gotten rid of a lot of things, and it's surprising to a lot of people, though most of them now say, "Yes, it's kind of old news to get rid of estimates." As David Laribee said to me, "You guys still do estimates. They're just called 'yestimates': you say either we can do it or we can't do it." If you're releasing once or twice a week, or moving into the world we're in, which is continuous releases, where you check in code and it goes live, you are releasing all the time. How does the process fit, how does the process change, as you move into a continuous flow of releases? For us, we stopped using a visual or planning tool. We used them for years, but we don't use one now. We literally don't use one. We have people all over the world helping to produce the work we're doing. How could we possibly maintain order in such a world? We just communicate a lot. We use Skype religiously; we talk to each other. We have very small batch sizes. In other words, we work on very small things, get them out the door and keep doing it. As long as we can all agree on what that small thing is -- and we do that with Skype, we do it with some emails -- we know what we're working on, we do it and it goes out the door.
We found that planning tools were slowing us down, and we've tried going back to them, saying, "It's been a while since we used the planning tools; let's try one." The team rebelled. The team just said, "We don't feel like doing this work. It just feels like overhead." It feels like overhead when we change our minds, and we change our minds a lot. We can start something on Monday, something critical comes in on Tuesday and we completely veer in the direction of the critical thing, and we don't want to have to go and update a bunch of stuff in a tool. It just slows us down. It wasn't adding value. We got rid of the tools; we got rid of formally declaring user stories. It just wasn't necessary any more. "Let's talk about what we want to do next." No user stories, no planning boards, certainly no estimates. We don't have iterations, of course, so we just decide what we want to work on and what's the soonest we can deliver it. Usually it's a day or two from when we begin the work. That's what we do. Is it Kanban? Well, I don't think you'd call it Kanban. It's more of a flavor of Lean. It's very, very Lean; it's ultra-Lean. It's not incredibly quantitative, which is a big difference. However, there is a lot of quality built into our process. If we notice a problem in an area, we don't necessarily get to look at a graph to see what's happening, but we're aware of problems. We're on top of the problems. If we see too many defects starting to come in, we say, "Something's wrong here, folks. What's going on?" We immediately discuss it and we take action. But there is some overhead in doing the work to produce quantitative metrics, and in scenarios where we don't feel that is valuable, we skip it. As Don Reinertsen said in the keynote here, you've got to bring Lean and economics together. You've got to bring the economic picture in and ask, "Should I become a fantastic speller, or should I rely on the spellchecker?"
The cost of relying on the spellchecker is so cheap that I may as well do that. For us, it may seem like the right thing to do is to always have the quantitative data and therefore always use a tool, but it doesn't always work out that simply. You have to weigh the costs. I think we're doing things fairly well; there are metrics I would still like to see. We don't have a very large collection of defects, so we don't have a defect system. We don't have a system that tracks defects because there aren't that many. When there are some, we usually attack them. If there are some we don't care about, they just linger until we hear the second or third person complain, and then we'll do something; it's gotten onto our radar again. But I don't need to see a graph that shows me shrinking defect ratios or defect numbers, because there just aren't that many. It's extremely context-specific. This isn't necessarily what we recommend to our clients, because they are in different worlds, but for us, that's how we've evolved. And I think it reflects maturity when you see your process getting more and more Lean, getting rid of things that you just don't need.
7. It sounds to me like one of the fundamental themes in what you're describing is that you're finding ways to limit Work-In-Progress, limit queues, limit batch sizes -- all the way down to the fact that there is no defect-tracking queue -- which allows you to stay adaptive and Lean.
Exactly. I think that what you have to do ultimately is look at what the objective is and then see how you want to get there. We can become extremely Lean and extremely mature in our process, but we're not doing it the way so-and-so told us to do it; we're doing it our way. There is nothing wrong with that at all. I don't want to sound hypocritical here, saying, "We're not using the visual tools on the planning side, but we are using these tools to gather statistics on the programming side." The fact is that the technology we use to record data about programming is just behind the scenes. You don't have to do any work. You're programming, and if you are recording, you're recording; you're gathering data, like the spellchecker. The cost of collecting that data is nothing. That's what we like. We're pretty lazy that way.
I think the biggest trend is that we're moving more and more towards delivering high-quality services on all kinds of devices and in very heterogeneous environments. We're trying to get rid of these monolithic projects and think of them more in terms of services that people can consume. And the trend in the area we're in -- at least in process improvement -- is much more towards mature Lean processes. We're looking at quantitative metrics; we're trying to find ways to see, as Eric Ries says, validated learning. I love some of the concepts coming out of the Lean startup community, because it's pushing Agile to a place that is really focused on customers and on knowing whether you're building the right thing, on risk management, on all those really good things. I think the trend is going even more towards making happy customers and not wasting time building a product that no one wants to use. That's really pathetic. So there are two failure modes. You can build a product that you think people want to use -- maybe they even do use it -- but it's got horrendous technical debt and you can't keep up because of all those problems. Or it's got clean code but no one wants it; it doesn't really solve the customer's needs. These processes are pushing us towards much better interaction with customers and much better experimentation-driven design, where we're experimenting, doing A/B testing out there -- all those good things where we're gathering data about what is really needed. Then, hopefully, we're programming in such a way that we're able to get little pieces out the door regularly, flowing value to the customer. So it's a flow of value, and it's trying to keep the quality sufficiently good. I'm not saying you need perfect quality, but you don't want total junk. You've got to find that balance where you can continue to flow value out and make sure it is what your customers want. There are some cool things going on.
9. You are the creator of a flavor, so to speak, of Extreme Programming called Industrial XP, and you've been talking about things like this for a few years. Do you see the popularity of Lean concepts, and what people are beginning to understand in terms of systems thinking, tying back to what you were hoping to achieve with Industrial XP?
I definitely think that any time you create a collection of practices or values and principles, you usually have to give it a name. For us, those were the early days of Extreme Programming. Around 2002 we had enough new stuff that we were regularly using with clients that we said, "Instead of having this all be a collection of papers and articles, let's coalesce it into something with a name that will make it easier for people to digest and use." That's really the ultimate reason to do something like that. Over time I've seen a lot of the ideas from Industrial XP being adopted widely in the Agile community, which is really wonderful. For us, the things we're seeing in Lean are somewhat reflective of what we're doing in Industrial XP, but go beyond it as well, which is very exciting -- seeing improvements. The Kanban movement, I think, is a very healthy one, and people should look at that work, especially David Anderson's new book on Kanban. It's a quest to just get better and better and better. There is no reason to stop with the process you have today; keep looking for better ways to do things. This conference, the Lean Software and Systems Conference, has been one of my favorite conferences that I've been to in a long time -- excellent speakers, excellent topics and a lot of people really trying to push process into a much better place than just iterations and stories and velocity.
The balance between qualitatively and quantitatively managed work that ultimately produces a great result for your customers and for your own business -- that's the magic ticket.
We want to make it available to the world, so that programmers around the world can start to record their programming sessions and have a sense of how good they are in terms of time spent in the red, time spent in the green, and various metrics like that. It's a great challenge, because we want to make it available on all kinds of platforms, from a C++ programmer working on the command line to someone using C# or even Mono, Rails and Ruby -- that kind of thing. There is a lot of work to be done, but we've made some good inroads in languages like Java, C++ and C#, and we'd like to make this a society that people want to join and be a part of, so that they can improve and be inspired to improve.
It seems possible, but difficult
But since a developer's job is almost entirely mental, and each person's work is so different and seldom repetitive, it is really hard to measure it with standard metrics. I also believe that because a developer's work is non-linear, somewhat unpredictable and somewhat creative, it is even harder to measure. I can't imagine a metric system to assess artists, for example. So I doubt this idea will work, either because it is not workable, or because building the performance metrics is itself more difficult than the developer's job.
Why does time 'in the red' matter?
Are customers paying for limited developer time 'in the red'? Is this something that your customers are asking for?
It just seems strange to not capture any metrics on the value you provide but to put all your effort into capturing metrics that you can't tie to value.
It seems like we've gone full circle back to when no one tracked the value of what they were doing but tracked low level metrics that don't mean much.
Re: Why does time 'in the red' matter?
I'll address your concerns one at a time.
1. We need to learn what is valuable to record about our programming performances, the usage of our software and user happiness. Limiting time in the red is one of many "recordings" we are interested in. It is a convenient place for us to start, as we happen to teach folks TDD, and limiting time in the red is important to good TDD. Recordings about feature usage ("Is anyone even using that?") are also important.
2. Customers pay for software, hopefully software that meets their needs and isn't buggy. If you limit your time in the red (i.e. your tests run green, your build isn't broken, etc.), you can ship improvements or bug fixes fast. We do that. In fact, we now do Continuous Deployment (check in, run all tests, deploy to prod and do final sanity checks automatically to determine if we need to roll back). CD allows us to continuously ship better software to our customers... and we could not do that so well if we spent loads of time with un-shippable code.
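The comment describes the pipeline only in outline, so here is a minimal sketch of the gate logic it implies -- run all tests, deploy only when green, roll back if the post-deploy sanity check fails. The `run_tests`, `deploy`, `sanity_check` and `rollback` hooks are hypothetical stand-ins for real infrastructure, stubbed here so the flow can be exercised:

```python
def continuous_deploy(run_tests, deploy, sanity_check, rollback):
    """One continuous-deployment cycle: ship only from green,
    undo the release if the final production check fails."""
    if not run_tests():        # any red test keeps the change un-deployed
        return "blocked: tests red"
    deploy()                   # tests green: push to production
    if not sanity_check():     # final automated check against prod
        rollback()
        return "rolled back"
    return "deployed"

# Exercise the gate with trivial stubs for the three outcomes.
print(continuous_deploy(lambda: False, lambda: None, lambda: True, lambda: None))
# -> blocked: tests red
print(continuous_deploy(lambda: True, lambda: None, lambda: False, lambda: None))
# -> rolled back
print(continuous_deploy(lambda: True, lambda: None, lambda: True, lambda: None))
# -> deployed
```

The connection to limited red is the first branch: when the codebase spends most of its time green, the gate passes most of the time, and every small change is a candidate release.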
3. I'd love metrics on capturing how often features are used, bugs reported per feature, average user speed to perform a common task, user satisfaction, etc., etc. You seem to want to conclude that by talking about limiting time in the red, I'm therefore not interested in any other metrics. That would be incorrect.
4. We likely differ on whether tracking time-in-the-red is valuable. In our context, which is an XP/Agile one, we find it valuable to learn how to work in very small batches. We're also quite interested in other valuable metrics, and we certainly know the dangers of metrics and the tendency for them to be gamed or abused (e.g. chasing 100% code coverage).
Hope that helps clarify.
Re: It seems possible, but difficult
Hope that helps clarify our intention.
Re: Why does time 'in the red' matter?
2. Customers pay for software, hopefully software that meets their needs and isn't buggy. If you limit your time in the red (i.e. your tests run green, your build isn't broken, etc.), you can ship improvements or bug fixes fast.
This is the assertion I'm questioning. How do you know that limited time in the red improves your overall speed? Have you done a regression analysis? What was your R-squared and adjusted R-squared? If you aren't tracking defects (I believe you implied that you are not), how do you know you don't have many? How are you testing the hypothesis above? If I missed it, I apologize. I just get the feeling this is an article of faith.
Part of the reason I question this is that probably 95% (if not more) of the time my code doesn't compile, it's because I have chosen to have it not compile. Just last week, I decided to do a major refactoring of a code base I've worked on for years. I refactored for probably 5 hours straight, and for a good portion of that time the code did not compile. I didn't test or try to run the program during that time. At the end, I had only 2 minor problems with my changes. I wasn't flailing. I knew exactly what I was doing and why. If I had done this same amount of work while limiting my 'time in the red', I would never have been able to do so much in such a small amount of time.
After that, I did pursue a 'limited red' strategy because I was changing the functionality. The more risky changes required more methodical efforts.
The point is that it seems like you are making a logical leap that isn't necessarily true.
Re: Why does time 'in the red' matter?
I'm not saying that "Limited Red" is the only measure of a programmer's ability. There are many ways to measure ability. Limited Red is an important one, not the only one.
Re: Why does time 'in the red' matter?
Regarding your refactoring session, and how you maybe "would never have been able to do so much in such a small amount of time" without being red for hours - I'd suggest watching Joshua's presentation that goes into greater detail on this topic, particularly the demonstration of staying green while refactoring.