InfoQ

Does lines of code kill?

Posted by Niclas Nilsson on Dec 20, 2007

Sections: Development, Architecture & Design
Topics: Programming, Artifacts & Tools, Architecture
Tags: Design Patterns, Dependency Injection, Eclipse, Refactoring, Languages

Steve Yegge touched a nerve in the development community with his latest blog post. Steve argues that keeping the code size to an absolute minimum is the most important thing when developing software. In his view, you may have to sacrifice some design patterns and avoid refactoring at times just to keep the lines of code down. And if your problem is large enough, you may have to switch to another programming language.

… I believe, quite staunchly I might add, that the worst thing that can happen to a code base is size.

According to Steve, code size kills:

My minority opinion is that a mountain of code is the worst thing that can befall a person, a team, a company. I believe that code weight wrecks projects and companies, that it forces rewrites after a certain size, and that smart teams will do everything in their power to keep their code base from becoming a mountain.

Steve says that he could have called it code bloat, but the problem is that developers do not recognize bloat, and he’s not really talking about common accidental complexity.

I say “size” as a placeholder for a reasonably well-formed thought for which I seem to have no better word in my vocabulary. I’ll have to talk around it until you can see what I mean, and perhaps provide me with a better word for it. The word “bloat” might be more accurate, since everyone knows that “bloat” is bad, but unfortunately most so-called experienced programmers do not know how to detect bloat, and they’ll point at severely bloated code bases and claim they’re skinny as a rail.

The background to the blog post is that Steve has written an online game in Java that has grown to 500,000 lines of code, and because of the size of the code base he can no longer maintain it himself. A while ago he took the game down, and he’s currently rewriting it in JavaScript.

I say my opinion is hard-won because people don’t really talk much about code base size; it’s not widely recognized as a problem. In fact it’s widely recognized as a non-problem.

But how about tools? Can’t tools ease the management of the code?

People in the industry are very excited about various ideas that nominally help you deal with large code bases, such as IDEs that can manipulate code as “algebraic structures”, and search indexes, and so on. These people tend to view code bases much the way construction workers view dirt: they want great big machines that can move the dirt this way and that.

Many developers will agree with Steve so far. People who have been introduced to a large code base know that the sheer number of lines of code can be painful in itself.

If you have a million lines of code, at 50 lines per “page”, that’s 20,000 pages of code. How long would it take you to read a 20,000-page instruction manual? The effort to simply browse the code base and try to discern its overall structure could take weeks or even months, depending on its density. Significant architectural changes could take months or even years.

But Steve goes further than most developers and suggests that it may be wise to avoid design patterns and refactorings in order to keep the code base small.

The problem with Refactoring as applied to languages like Java, and this is really quite central to my thesis today, is that Refactoring makes the code base larger. I’d estimate that fewer than 5% of the standard refactorings supported by IDEs today make the code smaller.

And about design patterns, he writes:

And design patterns – at least most of the patterns in the “Gang of Four” book – make code bases get bigger. Tragically, the only GoF pattern that can help code get smaller (Interpreter) is utterly ignored by programmers who otherwise have the names of Design Patterns tattooed on their various body parts.

A little while ago, InfoQ summarized a debate about dependency injection, and Steve puts DI in the code bloat camp as well:

Dependency Injection is an example of a popular new Java design pattern that programmers using Ruby, Python, Perl and JavaScript have probably never heard of. And if they’ve heard of it, they’ve probably (correctly) concluded that they don’t need it. Dependency Injection is an amazingly elaborate infrastructure for making Java more dynamic in certain ways that are intrinsic to higher-level languages. And – you guessed it – DI makes your Java code base bigger.

Bigger is just something you have to live with in Java. Growth is a fact of life. Java is like a variant of the game of Tetris in which none of the pieces can fill gaps created by the other pieces, so all you can do is pile them up endlessly.

In the comment section there is a lively debate. Many people think that the solution is to break things up into libraries and thereby avoid the need to understand all the code at once. Udi Dahan asks:

Suppose you structured your code in such a way that you only had to look at under 1000 LoC at a time in order to do useful work. Would the 500KLoC be as big a problem?

Jay Levitt steps in, disagrees with Udi, and coins the term “stratification” to describe what he means.

There’s an anti-pattern I keep seeing, but haven’t found a great name for yet. I call it “stratification”.

Basically, the more you write high-level libraries that wrap lower-level ones, the less you use the low-level libraries. Thus, the less you understand them. At some point, you forget they even exist. At that point, you will - inevitably - write an even-higher-level library that recreates the lower-level functionality ON TOP OF the high-level library.

Does size matter? We all agree that accidental complexity is bad and should be removed, but when the code is clean yet still large, do we just accept that, or should we take extraordinary measures, like avoiding refactoring or switching to another programming language, to keep the size down? How important is size?

12 comments

  1. Size does not matter

    by Dmitriy Setrakyan

    Size does not matter (at least in software development) ;-)

    What matters is readability and design. The former probably being more important than the latter.

    I, for instance, would not want to maintain Steve's stripped down, design-patternless collection of text characters compiled into Java code regardless of how many lines of code it is. If I was him, I would look into Continuous Integration and Regression Tests to improve maintainability instead of spending time counting code lines.

    Best,
    Dmitriy Setrakyan
    GridGain - Grid Computing Made Simple

  2. When small becomes too small...

    by Ronald Miura

    When you consider only one 'quality' and ignore everything else, the maintainability will *always* suffer.

    Well, if only code size matters, the examples listed in the links below are wonderful examples of quality and maintainability :)

    homepages.cwi.nl/~tromp/maze.html
    www.cise.ufl.edu/~manuel/obfuscate/obfuscate.html

  3. Yes actually, sometimes size is annoying

    by Jesse Kuhnert

    I really hate to pick on this guy's blog entry as I don't know what his actual opinion is on these things, but it does such a great job supporting this point that I can't help but point it out.

    some stuff about value objects

    The article talks about some of the verbosity/duplication of code that goes into creating Java classes in what is probably a very common situation for almost any Java developer.

    This may be getting too specific as an example, but if anyone has the IntelliJ ToString Generator plugin installed and is using it, you will know what I mean. It's your typical toString() IDE kind of plugin, except that they decided to go a little farther than the norm and actually provide this cool little Velocity-like textual template for you to use to define ~how~ and what the generated method looks like - exactly...

    In pseudo-code form it looks something like this (a template, not compilable Java):

    public String toString()
    {
        ${someStringLikeThing} str = "${class.name}[";
        // one "name:value," entry per member of the class
        for (${member} in ${class}) {
            str += ${member.name} + ":" + ${member.value} + ",";
        }
        str += "]";
        return str;
    }

    Only in some languages can you actually write this as a module of real code in that language and only do it once, and it won't matter if you use an IDE or vi or punch cards because it will all be the same, since it's part of the language. You just added a "little something extra" to it.

    It's awesome and I really really hope he succeeds in doing it because it sure as hell is a lot more fun to program in than Java, even if Java pays better.

  4. Re: Yes actually, sometimes size is annoying

    by luca barba

    Step by step we return to C++ and templates...
    I used to be something of an expert in this, and I can say that the truth lies in the middle.
    Extreme terseness hurts comprehension just as much as too much code does.
    "Stratification" is really a good thing. Does anyone here remember assembler? Or machine code? Or, to go to a much higher level, bytecode?

    Stratification is good practice; the problem is that a good developer MUST keep studying all the levels in order to use the correct one at the right time, and this is something people don't love to do.
    In terms of mental effort, it is really simpler to reinvent the wheel on top of well-known concepts than to learn new ones.

  5. idea is ok. but again the loud reasoning

    by tomasz gajewski

    It's another idea with which this guy (Steve Yegge) loves to provoke. He picks the trendiest practices and weaves them into his idea, claiming they don't work. Arguments? "According to his experiences" - that's all?!

    Last time he was explaining why Agile (XP) doesn't work and why the Google way does. Why? He works there, and "according to his experiences" ... sometimes a programmer is in flow and works overtime, which Google allows.

    Now? Large code doesn't work. Although I basically agree with that, I would never attack Refactoring and Design Patterns. Name at least one bad smell for which larger code is the advised fix? None. Design Patterns? Yeah, there are some. But J. Kerievsky (Refactoring to Patterns) and many others have already had plenty of laughs about that (why some guys can't write "hello world" without factories). Just know the "project size" you play in.
    On a one-month project I can do anything, even with my eyes closed. Nothing can get spoiled. On a one-year project I start thinking about the hated "stratification", refactoring and design patterns.

  6. Complexity matters

    by Randolph Kahle

    Verbosity is different from complexity.

    In my experience reducing complexity is most important to minimize the life-time cost of software. Design patterns help reduce complexity as they allow you to comprehend large amounts of code with a minimal number of mental images.

    I have assessed many projects and I find that I can tell, by watching how code changes over time, when teams have been faithful to the architectural patterns and when they start to bypass them. I have found that architectures degrade over time when teams are under pressure to modify software and they elect to bypass the patterns due to perceived time pressure. Once patterns are bypassed subsequent changes continue to degrade a system and if not refactored back to the intended patterns, the system will eventually become too expensive to maintain and must be replaced.

    Randy Kahle
    1060 Research, Ltd

  7. Verbosity, Complexity, Density, Readability

    by Dan Tines

    It's pretty easy to beat up on Java. Even C# 3.0 makes Java look boiler-plate ridden. But a lot of this is also personal preferences, and the great Java IDEs do help out a lot. I have a feeling that Yegge doesn't quite understand the pitfalls of "dynamic all the time" either. It's just not needed.

  8. Dependency Injection is about dependencies ...

    by Kevin Teague


    Dependency Injection is an amazingly elaborate infrastructure for making Java more dynamic in certain ways that are intrinsic to higher-level languages. And – you guessed it – DI makes your Java code base bigger.


    There is irony in the fact that we add more code to large code bases to make them more manageable. Ideas such as dependency injection, though, can provide great value to larger code bases since they put the focus on making the dependencies between your components explicit. If you've got a large code base and it calls other code willy-nilly throughout the whole system, of course you're going to run into huge maintenance headaches that make "throw it all away and start over" seem like a good idea.

    It's true that most dynamic language programmers haven't heard of DI, but the idea of making dependencies explicit is just as relevant in a dynamic language as in a static language for any significantly sized code base. For example, DI was considered for Ruby on Rails, and in Python we have Zope 3, which contains patterns that, while not typically called DI, are at least very DI-like.
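
The trade-off discussed in this comment can be illustrated with a minimal Java sketch of constructor injection. All names here (`Mailer`, `ConsoleMailer`, `RecordingMailer`, `SignupService`) are hypothetical and do not come from the article or the thread. The sketch assumes hand-written wiring rather than any particular DI framework.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical dependency; none of these names come from the article or thread. */
interface Mailer {
    void send(String to, String body);
}

/** Production-style implementation (here it only prints). */
class ConsoleMailer implements Mailer {
    @Override
    public void send(String to, String body) {
        System.out.println("To " + to + ": " + body);
    }
}

/** Test double that records recipients instead of sending anything. */
class RecordingMailer implements Mailer {
    final List<String> recipients = new ArrayList<>();

    @Override
    public void send(String to, String body) {
        recipients.add(to);
    }
}

/** The dependency on Mailer is explicit: it is visible in the constructor. */
class SignupService {
    private final Mailer mailer;

    SignupService(Mailer mailer) {
        this.mailer = mailer;
    }

    void register(String email) {
        mailer.send(email, "Welcome!");
    }
}

class DiDemo {
    public static void main(String[] args) {
        // Wiring happens in one place; a DI container would automate this step.
        new SignupService(new ConsoleMailer()).register("alice@example.com");
    }
}
```

Both sides of the argument are visible here: the interface and the wiring are extra lines that a dynamic language might not need, but `SignupService` states its one dependency openly, and a test can pass in `RecordingMailer` without touching the service.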

  9. Re: Yes actually, sometimes size is annoying

    by Stefan Tilkov

    Instead of using an IDE that comes with a little template language, you might want to consider a language that includes features for this kind of thing itself (Ruby being one example).

  10. He was going so well for a change, then...

    by Paul Oldfield

    I don't often agree with Steve Yegge's ideas, but I definitely agree that much of the code I've seen could be cut down to about 40% of the volume without changing functionality or losing readability, etc. This is something that should be done. Then, if we remove the features that never get used, the code could often get smaller still. This is important; it reduces the cost of any maintenance we need to do.

    Without reading his original blog, I can also agree that it seems often to be the case that patterns get used because they're there, not because they're needed. I've *never* used a pattern 'straight out of the book' (Hmm... I think I did a Singleton by the book once, but it's the exception that proves the rule). I use the ideas that underlie the patterns to bring about the separation of concerns that the situation needs. This approach uses just enough code, and no more... ideally. I'm not perfect...

    Where I disagree strongly with what he's reported to have said is where he says refactoring increases the size of the code. If that's the effect he gets, then he's not doing it right. Refactoring should ideally be used only when we have new functionality to be added, and it should be used so we get to re-use as much as reasonably possible of what we already have. When used this way, the alternative to refactoring is to write duplicate functionality. Now as far as I can see, if you write the same functionality twice, it's highly likely to take more code than if you write it only once.

    I could be wrong, of course.
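
The argument in this comment, that refactoring for reuse avoids writing the same functionality twice, can be sketched in Java as follows. The `PriceFormatter` class and its methods are hypothetical names invented for this illustration, and amounts are assumed to be non-negative cents.

```java
import java.util.Locale;

/** Hypothetical invoice example; none of these names come from the thread. */
class PriceFormatter {
    // Shared helper produced by an Extract Method refactoring. Without it,
    // each caller below would carry its own copy of this formatting logic.
    private static String money(long cents) {
        return String.format(Locale.ROOT, "%d.%02d", cents / 100, cents % 100);
    }

    // Existing functionality, now delegating to the helper.
    static String invoiceLine(String item, long cents) {
        return item + ": " + money(cents);
    }

    // New functionality added after the refactoring reuses the helper
    // instead of duplicating it.
    static String refundLine(String item, long cents) {
        return "REFUND " + item + ": -" + money(cents);
    }
}

class RefactoringDemo {
    public static void main(String[] args) {
        System.out.println(PriceFormatter.invoiceLine("Widget", 1999));
        System.out.println(PriceFormatter.refundLine("Widget", 500));
    }
}
```

Whether the total line count goes down depends on how often the helper is reused: with a single caller the extraction adds lines, which is closer to the effect Steve describes.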

  11. Re: Yes actually, sometimes size is annoying

    by Peter Lawrey

    One way to simplify the equals/hashCode/toString generation is to use reflection.
    I have some helper methods which support calls like ClassUtils.toString(this) which use reflection to get the public fields/getters and display them. Subclasses also work correctly so I typically have a DataValue class which all data value structures inherit.

    What about efficiency you ask? Well, generally it's not a problem; however, when it is, I also generate the equivalent code for these classes at runtime, but that's OTT for most use cases.

    The generated data value objects can be a more efficient replacement for Map of String to value in some cases.
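
A reflection-based helper along the lines described in this comment might look like the sketch below. Only the call shape `ClassUtils.toString(this)` and the idea of a `DataValue` base class come from the comment; the implementation, the output format and the `Point` and `LabeledPoint` classes are assumptions made for illustration. It handles public fields only (not getters) and sorts them by name, because reflection does not guarantee field order.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.StringJoiner;

/** Hypothetical helper; only the call shape ClassUtils.toString(this) is from the comment. */
final class ClassUtils {
    private ClassUtils() {}

    /** Renders the public instance fields of obj as "ClassName[field:value,...]". */
    static String toString(Object obj) {
        Class<?> type = obj.getClass();
        // getFields() returns public fields, including inherited ones.
        Field[] fields = type.getFields();
        // Field order is not guaranteed by reflection, so sort by name.
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        StringJoiner joiner = new StringJoiner(",", type.getSimpleName() + "[", "]");
        for (Field field : fields) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue; // skip constants and other static members
            }
            try {
                joiner.add(field.getName() + ":" + field.get(obj));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return joiner.toString();
    }
}

/** Base class in the spirit of the DataValue class mentioned in the comment. */
abstract class DataValue {
    @Override
    public String toString() {
        return ClassUtils.toString(this);
    }
}

/** Illustrative value class; toString() is written once, in DataValue. */
class Point extends DataValue {
    public final int x;
    public final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

/** Subclass showing that inherited public fields are picked up as well. */
class LabeledPoint extends Point {
    public final String label;

    LabeledPoint(int x, int y, String label) {
        super(x, y);
        this.label = label;
    }
}

class ToStringDemo {
    public static void main(String[] args) {
        System.out.println(new Point(1, 2));                 // Point[x:1,y:2]
        System.out.println(new LabeledPoint(3, 4, "home"));  // LabeledPoint[label:home,x:3,y:4]
    }
}
```

The per-class boilerplate disappears, at the cost of a reflective lookup on every call, which is the efficiency trade-off the comment mentions.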

  12. Re: Complexity matters

    by William Bohrer

    I love your second paragraph, it's precisely my experience as well, having been in software consulting for a number of years.

    One of my pet peeves is trying to explain the difference between "complex" and "complicated". Complex code is necessarily difficult to comprehend because the problem it's solving is difficult. It may contain "a lot of lines", or it may be really "dense" - without good comments, it's impossible to understand. On the other hand, "complicated" code is code that is tedious and/or difficult to maintain, but the underlying concepts being implemented may be straightforward enough. For example, a crappily written 10-year-old ordering system that is built on top of a poorly designed database can be hideous to maintain, but the problem is with the code, not the domain itself. Buying and selling stuff is pretty straightforward, and even with the addition of some hairy business rules about who's allowed to do what when, the buying and selling part should be really straightforward and neither "complicated" nor "complex".

    Implementing a particular pattern because it's the new fashion can add piles of objects to code; on the other hand, writing 50-page methods following no discernible design methods (I've seen this, inherited it, spent many months trying to figure out what it was doing) ALSO creates unnecessary complexity. You might as well be writing Fortran at that point. If you broke that method into functional pieces it might very well have more "lines", but I can guarantee it would be easier for someone other than the person who wrote it to maintain.

    The original author of this thread is writing a one-man app, so concern over whether anyone else can understand it is not a requirement. In most enterprises, developers don't have the luxury of being a one-person silo, and even then, I insist that all code be peer-reviewed, because you could get run over by a bus tomorrow and I don't want to have to replace everything you did because no one can understand it, no matter how clever or concise it might be.

    Anyhow, my 50c, rather late in the game.

    Bill Bohrer
    william.bohrer*tea.state.tx.us
    Systems Architect
    Texas Education Agency
