Do lines of code kill?

Steve Yegge touched a nerve in the development community with his latest blog post. Steve argues that keeping code size to an absolute minimum is the most important thing when developing software. In his view, you may at times have to sacrifice some design patterns and avoid refactoring just to keep the line count down, and if your problem is large enough, you may have to switch to another programming language.

… I believe, quite staunchly I might add, that the worst thing that can happen to a code base is size.

According to Steve, code size kills:

My minority opinion is that a mountain of code is the worst thing that can befall a person, a team, a company. I believe that code weight wrecks projects and companies, that it forces rewrites after a certain size, and that smart teams will do everything in their power to keep their code base from becoming a mountain.

Steve says that he could have called it code bloat, but the problem is that developers do not recognize bloat, and he is not really talking about ordinary accidental complexity.

I say “size” as a placeholder for a reasonably well-formed thought for which I seem to have no better word in my vocabulary. I’ll have to talk around it until you can see what I mean, and perhaps provide me with a better word for it. The word “bloat” might be more accurate, since everyone knows that “bloat” is bad, but unfortunately most so-called experienced programmers do not know how to detect bloat, and they’ll point at severely bloated code bases and claim they’re skinny as a rail.

The background to the blog post is that Steve has written an online game in Java that has grown to 500,000 lines of code, and because of the size of the code base he can no longer maintain it himself. A while ago he took the game down, and he is currently rewriting it in JavaScript.

I say my opinion is hard-won because people don’t really talk much about code base size; it’s not widely recognized as a problem. In fact it’s widely recognized as a non-problem.

But how about tools? Can’t tools ease the management of the code?

People in the industry are very excited about various ideas that nominally help you deal with large code bases, such as IDEs that can manipulate code as “algebraic structures”, and search indexes, and so on. These people tend to view code bases much the way construction workers view dirt: they want great big machines that can move the dirt this way and that.

Many developers will agree with Steve so far. Anyone who has been introduced to a large code base knows that sheer line count can be painful in itself.

If you have a million lines of code, at 50 lines per “page”, that’s 20,000 pages of code. How long would it take you to read a 20,000-page instruction manual? The effort to simply browse the code base and try to discern its overall structure could take weeks or even months, depending on its density. Significant architectural changes could take months or even years.
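The arithmetic in that quote is easy to reproduce. The sketch below is only an illustration: the function names are made up for this example, and the reading rate of 100 pages per day is an assumption that does not come from Steve's post.

```python
def pages_of_code(total_lines: int, lines_per_page: int = 50) -> int:
    """Pages needed to print total_lines, rounding up to a whole page."""
    # Ceiling division using integers only, so no floating-point rounding.
    return -(-total_lines // lines_per_page)


def reading_days(pages: int, pages_per_day: int) -> float:
    """Days needed to read every page once at an assumed daily rate."""
    return pages / pages_per_day


# The example from the quote: a million lines at 50 lines per page.
print(pages_of_code(1_000_000))  # 20000

# The 500,000-line game mentioned earlier in the article.
print(pages_of_code(500_000))  # 10000

# Assumed rate (not from the post): 100 pages per working day.
print(reading_days(pages_of_code(1_000_000), pages_per_day=100))  # 200.0
```

Even with a generous assumed reading rate, one pass over the million-line example takes most of a working year, which is consistent with the "weeks or even months" Steve mentions for merely browsing the structure.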

But Steve goes further than most developers and suggests that avoiding design patterns and refactorings may be a good choice if it keeps the code base small.

The problem with Refactoring as applied to languages like Java, and this is really quite central to my thesis today, is that Refactoring makes the code base larger. I’d estimate that fewer than 5% of the standard refactorings supported by IDEs today make the code smaller.

And about design patterns, he writes:

And design patterns – at least most of the patterns in the “Gang of Four” book – make code bases get bigger. Tragically, the only GoF pattern that can help code get smaller (Interpreter) is utterly ignored by programmers who otherwise have the names of Design Patterns tattooed on their various body parts.

A little while ago, InfoQ summarized a debate about dependency injection, and Steve puts DI in the code bloat camp as well:

Dependency Injection is an example of a popular new Java design pattern that programmers using Ruby, Python, Perl and JavaScript have probably never heard of. And if they’ve heard of it, they’ve probably (correctly) concluded that they don’t need it. Dependency Injection is an amazingly elaborate infrastructure for making Java more dynamic in certain ways that are intrinsic to higher-level languages. And – you guessed it – DI makes your Java code base bigger.

Bigger is just something you have to live with in Java. Growth is a fact of life. Java is like a variant of the game of Tetris in which none of the pieces can fill gaps created by the other pieces, so all you can do is pile them up endlessly.
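To make the dependency injection claim concrete, here is a minimal sketch of how a dynamic language can swap a collaborator without any container or configuration. It is not code from Steve's post; `send_email`, `Notifier` and `fake_send` are hypothetical names invented for this illustration, and the example says nothing about whether larger systems would still benefit from a DI framework.

```python
from typing import Callable, List, Tuple


def send_email(recipient: str, body: str) -> str:
    """Hypothetical stand-in for a real mail client."""
    return f"emailed {recipient}: {body}"


class Notifier:
    """Takes its collaborator as an ordinary argument with a default."""

    def __init__(self, send: Callable[[str, str], str] = send_email) -> None:
        self.send = send

    def welcome(self, user: str) -> str:
        return self.send(user, "Welcome!")


# Production-style use: the default collaborator is picked up automatically.
print(Notifier().welcome("ann"))  # emailed ann: Welcome!

# Test-style use: any callable with the same shape can be passed instead.
sent: List[Tuple[str, str]] = []


def fake_send(recipient: str, body: str) -> str:
    sent.append((recipient, body))
    return "recorded"


print(Notifier(send=fake_send).welcome("bob"))  # recorded
```

Because functions are first-class values and any callable of the right shape is accepted, the "injection" is a default parameter. That is roughly the sense in which Steve calls this behavior intrinsic to higher-level languages.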

The comment section hosts a lively debate. Many people think the solution is to break things up into libraries and thereby avoid having to understand all the code at once. Udi Dahan asks:

Suppose you structured your code in such a way that you only had to look at under 1000 LoC at a time in order to do useful work. Would the 500KLoC be as big a problem?

Jay Levitt steps in to disagree with Udi, coining the term stratification to describe what he means:

There’s an anti-pattern I keep seeing, but haven’t found a great name for yet. I call it “stratification”.

Basically, the more you write high-level libraries that wrap lower-level ones, the less you use the low-level libraries. Thus, the less you understand them. At some point, you forget they even exist. At that point, you will - inevitably - write an even-higher-level library that recreates the lower-level functionality ON TOP OF the high-level library.

Does size matter? We all agree that accidental complexity is bad and should be removed, but when the code is clean yet still large, do we just accept that, or should we take extraordinary measures, such as avoiding refactoring or switching to another programming language, to keep the size down? How important is size?
