Do lines of code kill?

by Niclas Nilsson on Dec 20, 2007. Estimated reading time: 5 minutes
Steve Yegge touched a nerve in the development community with his latest blog post. Steve argues that keeping code size to an absolute minimum is the most important thing when developing software. In his view, you may at times have to sacrifice some design patterns and avoid refactoring just to keep the line count down. And if your problem is large enough, you may have to switch to another programming language.

… I believe, quite staunchly I might add, that the worst thing that can happen to a code base is size.

According to Steve, code size kills:

My minority opinion is that a mountain of code is the worst thing that can befall a person, a team, a company. I believe that code weight wrecks projects and companies, that it forces rewrites after a certain size, and that smart teams will do everything in their power to keep their code base from becoming a mountain.

Steve says that he could have called it code bloat, but the problem is that developers do not recognize bloat, and he’s not really talking about the common accidental complexity.

I say “size” as a placeholder for a reasonably well-formed thought for which I seem to have no better word in my vocabulary. I’ll have to talk around it until you can see what I mean, and perhaps provide me with a better word for it. The word “bloat” might be more accurate, since everyone knows that “bloat” is bad, but unfortunately most so-called experienced programmers do not know how to detect bloat, and they’ll point at severely bloated code bases and claim they’re skinny as a rail.

The background to the blog post is that Steve has written an online game in Java that has grown to 500,000 lines of code, and because of the size of the code base he can no longer maintain it himself. A while ago he took the game down, and he’s currently rewriting it in JavaScript.

I say my opinion is hard-won because people don’t really talk much about code base size; it’s not widely recognized as a problem. In fact it’s widely recognized as a non-problem.

But how about tools? Can’t tools ease the management of the code?

People in the industry are very excited about various ideas that nominally help you deal with large code bases, such as IDEs that can manipulate code as “algebraic structures”, and search indexes, and so on. These people tend to view code bases much the way construction workers view dirt: they want great big machines that can move the dirt this way and that.

Many developers will agree with Steve so far. Anyone who has been introduced to a large code base knows that sheer line count can be painful in itself.

If you have a million lines of code, at 50 lines per “page”, that’s 20,000 pages of code. How long would it take you to read a 20,000-page instruction manual? The effort to simply browse the code base and try to discern its overall structure could take weeks or even months, depending on its density. Significant architectural changes could take months or even years.

But Steve becomes more radical than most developers and suggests that it may be a good choice to avoid design patterns and refactorings in order to minimize the code base.

The problem with Refactoring as applied to languages like Java, and this is really quite central to my thesis today, is that Refactoring makes the code base larger. I’d estimate that fewer than 5% of the standard refactorings supported by IDEs today make the code smaller.

And about design patterns, he writes:

And design patterns – at least most of the patterns in the “Gang of Four” book – make code bases get bigger. Tragically, the only GoF pattern that can help code get smaller (Interpreter) is utterly ignored by programmers who otherwise have the names of Design Patterns tatooed on their various body parts.

A little while ago, InfoQ summarized a debate about dependency injection, and Steve puts DI in the code bloat camp as well:

Dependency Injection is an example of a popular new Java design pattern that programmers using Ruby, Python, Perl and JavaScript have probably never heard of. And if they’ve heard of it, they’ve probably (correctly) concluded that they don’t need it. Dependency Injection is an amazingly elaborate infrastructure for making Java more dynamic in certain ways that are intrinsic to higher-level languages. And – you guessed it – DI makes your Java code base bigger.

Bigger is just something you have to live with in Java. Growth is a fact of life. Java is like a variant of the game of Tetris in which none of the pieces can fill gaps created by the other pieces, so all you can do is pile them up endlessly.

The comment section hosts a lively debate. Many people think the solution is to break things up into libraries and thereby avoid the need to understand all the code, at least all at once. Udi Dahan asks:

Suppose you structured your code in such a way that you only had to look at under 1000 LoC at a time in order to do useful work. Would the 500KLoC be as big a problem?

Jay Levitt steps in, disagrees with Udi, and he coins the term stratification to describe what he means.

There’s an anti-pattern I keep seeing, but haven’t found a great name for yet. I call it “stratification”.

Basically, the more you write high-level libraries that wrap lower-level ones, the less you use the low-level libraries. Thus, the less you understand them. At some point, you forget they even exist. At that point, you will - inevitably - write an even-higher-level library that recreates the lower-level functionality ON TOP OF the high-level library.

Does size matter? We all agree that accidental complexity is bad and should be removed, but when the code is clean yet still large, do we just accept that, or should we take extraordinary measures such as avoiding refactoring or switching to another programming language to keep the size down? How important is size?


Size does not matter by Dmitriy Setrakyan

Size does not matter (at least in software development) ;-)

What matters is readability and design. The former probably being more important than the latter.

I, for instance, would not want to maintain Steve's stripped down, design-patternless collection of text characters compiled into Java code regardless of how many lines of code it is. If I was him, I would look into Continuous Integration and Regression Tests to improve maintainability instead of spending time counting code lines.

Best,
Dmitriy Setrakyan
GridGain - Grid Computing Made Simple

When small becomes too small... by Ronald Miura

When you consider only one 'quality' and ignore everything else, the maintainability will *always* suffer.

Well, if only code size matters, the examples listed in the links below are wonderful examples of quality and maintainability :)

homepages.cwi.nl/~tromp/maze.html
www.cise.ufl.edu/~manuel/obfuscate/obfuscate.html

Yes actually, sometimes size is annoying by Jesse Kuhnert

I really hate to pick on this guy's blog entry as I don't know what his actual opinion is on these things, but it does such a great job supporting this point that I can't help but point it out.

some stuff about value objects

The article talks about some of the verbosity/duplication of code that goes in to creating java classes in what is probably a very common situation for almost any java developer.

This may be getting too specific as an example, but anyone who has the IntelliJ ToString Generator plugin installed and uses it will know what I mean. It's your typical toString() IDE kind of plugin, except that they decided to go a little farther than the norm and actually provide this cool little Velocity-like textual template for you to use to define ~how~ and what the generated method looks like - exactly...

in pseudo code form it kind of looks like:

public String toString()
{
    ${someStringLikeThing} str = "${class.name}[";
    for (${member} in ${class}) {
        str += ${member.name} + ":" + ${member.value} + ",";
    }
    str += "]";
    return str;
}

Only in some languages you can actually write this as a module of real actual code in that language and only do it once, and it won't matter if you use an IDE or vi or punch cards because it will all be the same since it's part of the language. You just added a "little something extra" to it.

It's awesome and I really really hope he succeeds in doing it because it sure as hell is a lot more fun to program in than java even if java pays better..

Re: Yes actually, sometimes size is annoying by luca barba

Step by step we return to C++ and templates ....
I used to be something of an expert in this, and I can say that the truth is in the middle.
Extreme terseness hurts comprehension just as too much code does.
"Stratification" is really a good thing. Does anyone here remember assembler? Or processor code? Or, to go to a higher level, bytecode?

Stratification is good practice; the problem is that a good developer MUST keep studying all the levels in order to use the correct one at the right time, and this is something people don't love to do.
From a brain-energy point of view, it is really simpler just to reinvent the wheel on top of well-known concepts than to learn new ones.

idea is ok. but again the loud reasoning by tomasz gajewski

It's another idea where this guy (Steve Yegge) loves to provoke. He picks the trendiest practices and weaves them into his idea, claiming they don't work. Arguments? "According to his experiences" - that's all?!

Last time he was saying why Agile (XP) doesn't work and why the Google way does. Why? He works over there and "according to his experiences" ... sometimes a programmer is in flow and does overtime, which Google allows.

Now? Large code doesn't work. Although I basically agree with that, I would never attack Refactoring and Design Patterns. Name at least one bad smell where larger code is the advised fix. None. Design Patterns? Yeah, there are some. But that (why some guys can't write "hello world" without factories) has already been laughed at plenty by J. Kerievsky (Refactoring to Patterns) and many others. Just know the "project size" you play in.
On a one-month project I can do everything, even with my eyes closed. Nothing can get spoiled. On a one-year project I start thinking about the hated "stratification", refactoring and design patterns.

Complexity matters by Randolph Kahle

Verbosity is different from complexity.

In my experience reducing complexity is most important to minimize the life-time cost of software. Design patterns help reduce complexity as they allow you to comprehend large amounts of code with a minimal number of mental images.

I have assessed many projects and I find that I can tell, by watching how code changes over time, when teams have been faithful to the architectural patterns and when they start to bypass them. I have found that architectures degrade over time when teams are under pressure to modify software and they elect to bypass the patterns due to perceived time pressure. Once patterns are bypassed subsequent changes continue to degrade a system and if not refactored back to the intended patterns, the system will eventually become too expensive to maintain and must be replaced.

Randy Kahle
1060 Research, Ltd

Verbosity, Complexity, Density, Readability by Dan Tines

It's pretty easy to beat up on Java. Even C# 3.0 makes Java look boilerplate-ridden. But a lot of this is also personal preference, and the great Java IDEs do help out a lot. I have a feeling that Yegge doesn't quite understand the pitfalls of "dynamic all the time" either. It's just not needed.

Dependency Injection is about dependencies ... by Kevin Teague


Dependency Injection is an amazingly elaborate infrastructure for making Java more dynamic in certain ways that are intrinsic to higher-level languages. And – you guessed it – DI makes your Java code base bigger.


There is irony in the fact that we add more code to large code bases to make them more manageable. Ideas such as dependency injection, though, can provide great value to larger code bases, since they put the focus on making the dependencies between your components explicit. If you've got a large code base and it calls other code willy-nilly throughout the whole system, of course you're going to run into huge maintenance headaches that make "throw it all away and start over" seem like a good idea.

It's true that most dynamic language programmers haven't heard of DI, but the idea of making dependencies explicit is just as relevant in a dynamic language as in a static language for any significantly sized code base. For example, DI was considered for Ruby on Rails, and in Python we have Zope 3, which contains patterns that, while not typically called DI, are at least very DI-like.
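To make the "explicit dependencies" point concrete, here is a minimal sketch of constructor injection in plain Java, with no DI framework involved. Every name in it (ScoreStore, InMemoryScoreStore, Leaderboard) is hypothetical and invented for illustration; it is not taken from Steve's game or from any library mentioned in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction over wherever scores are kept.
interface ScoreStore {
    void save(String player, int score);
    int load(String player);
}

// One possible implementation; a database-backed one could be swapped in.
final class InMemoryScoreStore implements ScoreStore {
    private final Map<String, Integer> scores = new HashMap<>();

    @Override
    public void save(String player, int score) {
        scores.put(player, score);
    }

    @Override
    public int load(String player) {
        // Unknown players start at zero.
        return scores.getOrDefault(player, 0);
    }
}

// The dependency is handed in through the constructor, so it is visible
// in the signature rather than reached for from somewhere inside.
final class Leaderboard {
    private final ScoreStore store;

    Leaderboard(ScoreStore store) {
        this.store = store;
    }

    void record(String player, int points) {
        store.save(player, store.load(player) + points);
    }

    int total(String player) {
        return store.load(player);
    }
}
```

The cost is the one Steve complains about: an extra interface and some wiring code. What it buys is that anyone reading Leaderboard can see what it depends on, which is the value argued for above.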

Re: Yes actually, sometimes size is annoying by Stefan Tilkov

Instead of using an IDE that comes with a little template language, you might want to consider a language that includes features for this kind of thing itself (Ruby being one example).

He was going so well for a change, then... by Paul Oldfield

I don't often agree with Steve Yegge's ideas, but I definitely agree that much of the code I've seen could be cut down to about 40% of the volume without changing functionality or losing readability, etc. This is something that should be done. Then, if we remove the features that never get used, the code could often get smaller still. This is important; it reduces the cost of any maintenance we need to do.

Without reading his original blog, I can also agree that it seems often to be the case that patterns get used because they're there, not because they're needed. I've *never* used a pattern 'straight out of the book' (Hmm... I think I did a Singleton by the book once, but it's the exception that proves the rule). I use the ideas that underlie the patterns to bring about the separation of concerns that the situation needs. This approach uses just enough code, and no more... ideally. I'm not perfect...

Where I disagree strongly with what he's reported to have said is where he says refactoring increases the size of the code. If that's the effect he gets, then he's not doing it right. Refactoring should ideally be used only when we have new functionality to be added, and it should be used so we get to re-use as much as reasonably possible of what we already have. When used this way, the alternative to refactoring is to write duplicate functionality. Now as far as I can see, if you write the same functionality twice, it's highly likely to take more code than if you write it only once.

I could be wrong, of course.

Re: Yes actually, sometimes size is annoying by Peter Lawrey

One way to simplify the equals/hashCode/toString generation is to use reflection.
I have some helper methods which support calls like ClassUtils.toString(this), which use reflection to get the public fields/getters and display them. Subclasses also work correctly, so I typically have a DataValue class which all data value structures inherit from.

What about efficiency, you ask? Well, generally it's not a problem; when it is, I also generate the equivalent code at runtime for these classes, but that's OTT for most use cases.

The generated data value objects can be a more efficient replacement for Map of String to value in some cases.
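A minimal sketch of the reflection approach described above, assuming only public instance fields are printed (getters are left out for brevity). ReflectiveStrings and Point are hypothetical names for illustration, and this is not the commenter's actual ClassUtils; DataValue is borrowed from the comment only as a name for the shared base class.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.StringJoiner;

// Hypothetical helper; a stand-in for the ClassUtils mentioned above.
final class ReflectiveStrings {
    private ReflectiveStrings() {}

    // Builds "ClassName[field:value,...]" from the public instance fields
    // of the object, including public fields inherited from superclasses.
    static String toString(Object target) {
        Class<?> type = target.getClass();
        Field[] fields = type.getFields(); // public fields only, in no guaranteed order
        Arrays.sort(fields, Comparator.comparing(Field::getName)); // make the output stable
        StringJoiner joined = new StringJoiner(",", type.getSimpleName() + "[", "]");
        for (Field field : fields) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue; // skip constants and other static members
            }
            try {
                joined.add(field.getName() + ":" + field.get(target));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return joined.toString();
    }
}

// Base class in the spirit of the DataValue class described above:
// subclasses get toString() without writing or generating it.
abstract class DataValue {
    @Override
    public String toString() {
        return ReflectiveStrings.toString(this);
    }
}

// Example value object: no hand-written toString() needed.
final class Point extends DataValue {
    public final int x;
    public final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
```

With this in place, new Point(1, 2).toString() yields "Point[x:1,y:2]". The method is written once and every subclass inherits it, which is the "only do it once" property the earlier comments ask for, at the price of reflection overhead on each call.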

Re: Complexity matters by William Bohrer

I love your second paragraph, it's precisely my experience as well, having been in software consulting for a number of years.

One of my pet peeves is trying to explain the difference between "complex" and "complicated". Complex code is necessarily difficult to comprehend because the problem it's solving is difficult. It may contain "a lot of lines", or it may be really "dense" - without good comments, it's impossible to understand. On the other hand, "complicated" code is code that is tedious and/or difficult to maintain, but the underlying concepts being implemented may be straightforward enough. For example, a crappily written 10-year-old ordering system that is built on top of a poorly designed database can be hideous to maintain, but the problem is with the code, not the domain itself. Buying and selling stuff is pretty straightforward, and even with the addition of some hairy business rules about who's allowed to do what when, the buying and selling part should be really straightforward and neither "complicated" nor "complex".

Implementing a particular pattern because it's the new fashion can add piles of objects to code; on the other hand, writing 50-page methods following no discernible design methods (I've seen this, inherited it, spent many months trying to figure out what it was doing) ALSO creates unnecessary complexity. You might as well be writing Fortran at that point. If you broke that method into functional pieces it might very well have more "lines", but I can guarantee it would be easier for someone other than the person who wrote it to maintain.

The original author of this thread is writing a one-man app, so concern over whether anyone else can understand it is not a requirement. In most enterprises, developers don't have the luxury of being a one-person silo, and even then, I insist that all code be peer-reviewed, because you could get run over by a bus tomorrow and I don't want to have to replace everything you did because no one can understand it, no matter how clever or concise it might be.

Anyhow, my 50c, rather late in the game.

Bill Bohrer
william.bohrer*tea.state.tx.us
Systems Architect
Texas Education Agency
