
Debate: Why are most large-scale websites not written in Java?

by Ryan Slobojan on Oct 29, 2007. Estimated reading time: 4 minutes

Nati Shalom of GigaSpaces recently asked why most large-scale websites were written in languages other than Java. This question touched off a large debate in the Java community, and InfoQ took the opportunity to learn more about the major viewpoints surrounding this issue.

In his post, Shalom noted that many of the sites that he knew of used a LAMP (Linux, Apache, MySQL, PHP/Perl) stack, and that several have developed custom filesystems like Google's GFS or utilized caches like memcached. Shalom noted similarities in the scalability solutions developed for both large-scale web applications and large-scale financial applications:

On the Data Tier we see the following:
  1. Adding a caching layer to take advantage of memory resources availability and reduce I/O overhead
  2. Moving from a database-centric approach to partitioning, aka shards
On the Business Logic Tier:
  1. Adding parallelization semantics to the application tier (e.g., MapReduce)
  2. Moving to scale-out application models to achieve linear scalability
  3. Moving away from the classic two-phase commit and XA for transaction processing  (See: Lessons from Pat Helland: Life Beyond Distributed Transactions)
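As a rough illustration of the two data-tier items above, the sketch below puts a cache-aside layer in front of a hash-partitioned store. It is a toy under stated assumptions, not how any of the sites mentioned actually implement this: the class names `ShardedStore` and `CachedStore` are hypothetical, in-memory dicts stand in for database servers, and a plain dict stands in for a cache such as memcached.

```python
import hashlib


class ShardedStore:
    """Toy partitioned store: each key lives on exactly one shard."""

    def __init__(self, shard_count):
        # Assumption: each dict stands in for a separate database server.
        self.shards = [dict() for _ in range(shard_count)]
        self.reads = 0  # counts reads that actually reach a shard

    def _shard_for(self, key):
        # A stable hash sends the same key to the same shard every time.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key):
        self.reads += 1
        return self._shard_for(key).get(key)


class CachedStore:
    """Cache-aside wrapper: look in memory first, then fall back to the store."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # assumption: stands in for a cache like memcached

    def get(self, key):
        if key in self.cache:
            return self.cache[key]        # hit: no I/O against the shards
        value = self.store.get(key)       # miss: read from the owning shard
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        self.store.put(key, value)
        self.cache.pop(key, None)         # invalidate so readers see the new value


store = ShardedStore(shard_count=4)
cached = CachedStore(store)
cached.put("user:1", "alice")
print(cached.get("user:1"), cached.get("user:1"), store.reads)
```

Only the first of the two reads reaches a shard; the second is served from memory. One design caveat: with simple modulo partitioning, changing the number of shards remaps most keys, which is why real deployments often use consistent hashing or a lookup directory instead.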

Shalom then questioned how such similar solutions could have ended up on such different application stacks. One possible reason, which Shalom noted, was put forward by Todd Hoff: the LAMP stack is both powerful and free, and where Java is used, it is an ancillary component rather than the core.

Some other opinions:

  • Justin Sher was quick to point out that eBay, GMail, Amazon, hi5.com and Google AdWords are built on top of Java
  • Shane Isbell pointed to cultural differences, questioning whether the stereotypical web developer is more interested in social networking sites and 'eye candy' than the stereotypical Java developer, and also commented that financial companies had greater budgets and tended to scale with hardware, whereas web companies tended to scale with software.
  • Another person suggested that the prevalence of Java solutions in financial applications had to do with partnerships between large Java EE vendors and financial institutions
  • Angelo Andreetto, who referred to several years of experience with financial companies, believes that a conservative approach to potential risk leads to the selection of Java-based solutions over heterogeneous software stacks
  • Someone else commented that the consequences of downtime for financial institutions were generally larger than for web companies
  • George Coller said that the question was mis-stated, and that the real question should be why Java EE isn't used more

Mickey Ohayon of GigaSpaces had a more detailed response:

In a technical perspective:
  • developing in Php / Perl is very fast and simple whereas JEE is more complex
  • historically speaking the knowledge, hosting services and developers are more available
  • LAMP proved to be stable and common whereas JEE was more of an infrastructure
  • JEE requires application servers that sometimes are overkill for a web system
  • The light web languages (Php/Perl) are more flexible to changes in the short run (as part of poor architecture that is based on Non-MVC, of course in the long run the cost of any change is dramatically higher)
  • The deployment and testing of java application is far slower and requires relatively strong machines
In financial perspective:
  • JEE developers are far more expensive than Perl / Php
  • The learning curve and time to market are longer
  • Hosting of JEE application servers is more expensive

Jilles Van Gurp of Nokia commented that Java EE is optimized for the enterprise domain, which tends to have a different set of needs than a large-scale consumer-oriented website:

These websites have relatively simple data base structures; relaxed requirements for things like transactions and persistence layers (mysql + non-transactional & ACID backend is good enough in most cases); virtually no requirements for heavy duty web service stacks; etc. Basically all the stuff J2EE is excellent for is just mostly overkill for implementing consumer oriented websites. You don't need the fancy IDEs; uber-flexible messaging buses; outrageously complicated transactional logic; etc.

Instead the focus is on extreme scalability; memory usage; cpu usage; caching; etc. Those things can be addressed with off the shelf components like squid, apache, distributed linux filesystems etc. They can also be addressed with Java components too but it requires that you have some J2EE experts around to integrate them. These are not exactly easy to recruit due to current scarcity on the job market and tendency of these people to end up in extremely well paid enterprise type jobs.

Van Gurp also believes that Java is well positioned for the future:

Finally, I think all this is changing. Running the Java implementation of ruby or php can give a nice security, performance, scalability and manageability boost to your php or rails application. You'd be a fool not to try this if you are operating large scale deployments of these systems. This is still relatively unknown to php and ruby developers and quite many simply don't care about performance enough to do anything about it, instead preferring to invest in hardware. But once they make the shift to deploying on php or ruby on Java application servers, they'll discover that there is a world of additional components that can further enhance their applications. Arguably Google's web development tool chain (partially open sourced) is the state of the art in extremely large scale & rapid prototyping web development. And writing the application logic is done 100% in Java from the web developer point of view. To the best of my knowledge, Google has no large scale deployment of php or similar architectures in their web UI layer (I'd be interested to learn if this is not true).

After watching the debate unfold, Shalom described his agreement with Michael O'Keefe's opinion, which encompassed several of the viewpoints described above. Shalom also mentioned that there appeared to be a convergence trend in the market, with tools such as Spring on Rails and Caucho's Java-based PHP implementation, and that the challenge of developing a scalable site would bring LAMP stacks and Java closer together in the future.

What do you think?


Like any market, barriers to entry is key....... by Ben Hughes

One of the problems with Java based large scale application development is its barrier to entry - from the cost of hardware to run a highly available distributed application, to the cost of (good) Java developers to build it, and crucially to the cost of learning. It's understandable that some organisations might err on the side of commodity hardware, cheaper developers and a shorter learning curve - particularly where that equally meets the business need.

From a learning perspective Java often lacks the 'convention over configuration' offered by the new (and convergent) frameworks. You can install the entire stack on Ubuntu with a few keystrokes; using RubyWorks you have an out of the box Rails infrastructure that will scale to meet most requirements. Where's the Java alternative? While Java has grown to be infinitely configurable, this I'm sure puts people off. Being directed what to do (convention) is certainly a lower barrier than being offered a thousand (configuration) options.

With this lower learning overhead, we can shorten the path to understanding the meat of the scalable discussion - the 'new' architecture patterns being lived out by your Facebooks, eBays & Twitters.

The correlation between large-scale websites & large-scale applications is? by James Richardson

There are plenty of huge applications that are not visible to the world that run on all sorts of architectures. Just because somebody isn't shouting about it on the "blog-o-sphere" (aka "please, I only want to be famous") doesn't mean that plenty of it isn't going about.
Even in the case where the actual web application isn't running on Java/J2EE you can bet that huge amounts of the business applications behind it may well be (or of course on MQ, Cobol, SAP or some other untrendy thing).

Of course the fact that GigaSpaces sponsors most of this site means they will get a bunch of exposure for their GoogleJuice.

Re: Like any market, barriers to entry is key....... by Luis Garcia

Where's the Java alternative?

I think grails is heading in that direction.

Java has many more framework choices than, say, ruby, python, or even PHP (which is getting better), which reflects its strength as a language/platform. However, one must do a lot of research into the best possible set of tools for a solution, and sometimes that prospect is just way too daunting. Especially if chosen for the wrong reasons.

Hence the decision to use rails, zend, or whatever is a lot more palatable, and these people are a lot cheaper and tend to be able to whip up decent solutions quickly.

Horses for courses really. I personally would lean towards a java-based platform simply for its robustness and all the nice services you get from the EE stack. And the other goodies like Spring, commons, JMX etc.

Re: Like any market, barriers to entry is key....... by Michael Neale

There is also a desire just to "keep it simple" for web apps whose primary purpose is to quickly build an interface into some sort of a database. Layers are not needed. PHP shines at this, and it really relies on the OS for any services it needs. Rails is kind of further up the "software engineering ladder" and provides a lot more structure. In both cases the frameworks don't try to do much, leaning on the OS and allowing you to call out to it in the rare case that you go beyond shuffling data into and out of a database. I can appreciate that simplicity.

Re: Like any market, barriers to entry is key....... by Thom Nichols

Agreed. Grails is the first (Java) solution I'm aware of that brings it all together so you can just start writing a web application without tons of configuration crap. AppFuse is pretty close too, but in my experience I was still dealing with a lot of different frameworks that didn't always play nice together.

Web designers are generally not software engineers first so they tend to pick up a simpler programming language. RoR is the first framework that seems to be written from the web designer perspective, looking to give a great application solution. Grails is coming from the Java software engineer perspective looking to give developers a great web app solution. They both seem to be finding that sweet spot right in the middle.

Re: Like any market, barriers to entry is key....... by Johan Compagner

If you want to have a Java framework that doesn't have configuration crap, look at Wicket: wicket.apache.org

promoting Linux languages!!! by Rajmahendra R

Is this article promoting Linux languages by mostly comparing against PHP/LAMP/Perl?
