Java Grid, why do we need it!
I caught up with John Davies while at JavaZone in Oslo, Norway. During our encounter we started to talk about John's experiences in the banking industry in London. While business application developers may be faced with scaling to thousands of transactions per second, these banks are faced with tens of thousands of transactions per second. And those rates are climbing steadily. Front office success is all about winning the many race conditions that exist in the market, and to do so banks have been willing to invest enormous sums in IT renewal.
In making these investments banks have often set trends or offered the rest of us a glimpse into the future of our IT infrastructure. Today banks are investing heavily in grid technologies as they continue their battle for profits. What follows here is a John Davies stream of consciousness that, in showing us where we've been, shows us where we are going. Without further ado, here is John Davies on Java Grid and why we need it!
It doesn't seem that long ago that I was programming in C on a 25MHz Compaq. At the time it was the fastest thing around; a few months later the 33MHz version came out and was blisteringly fast. Compared to the PDP-11 and the 4MHz 8080s and Z80s I started on, who would ever need anything faster than this?
At the time we were leading edge: we gave the banks a few seconds' advantage over the competition and that made them serious money. These were the "Big Bang" days when the stock markets opened up. It's worth noting that the network was so slow that you could visibly see the difference between updates from one machine and another. This created arguments between traders as they wanted to be closer to the start of the network ring. Still, this was progress over the RS-232 "network" we used before.
Over the coming years I was paid to upgrade other dealing room systems to compete with the first ones I'd worked on, each time we went faster and faster. Not just because of Moore's law but we'd also learnt lessons in how to and how not to design high performance trading systems. Interestingly one of the reasons C++ won over Objective C on the trading floor was because you could hack C++, Objective C was too object oriented and that slowed things down, both C++ and Objective C were great languages. The pressure to compete was so great that we even rolled out Windows NT 4 beta onto a live dealing room, that was great for my CV (resume) when it came to Windows NT work in the late 90s.
A year or two later along came Java. At the start it was a novelty language for making the pictures move on the web; it was a few years later that we started to see Java being used in production. The rest, as far as Java is concerned, is history.
By the late 90s it was unusual to see anything take more than a few hundred milliseconds and although most places were still using 10BaseT the faster dealing rooms had switched to 100BaseT or faster. The pressure to compete was still there and growing as the market became more and more global. As soon as a bank was overtaken and became slower than average it had to re-invest to move back up the ladder. The cycle was anything from 3 to 7 years. Some banks took more risk than others and could jump further up the ladder, thus extending the cycle. Others were more conservative and waited for the brave to test out the technology before accepting it. This pattern of early adopters and latecomers was very evident with J2EE. The "thought leaders" were playing with EJBs back in the late 90s. Some of these early adopters became experts and went on to find new ways to overcome EJB's limiting restrictions; I'm sure you'll know the names of people like Rod Johnson, Cedric Beust, Tyler Jewell, Robin Roos, etc. These guys were trying to work with EJBs back in the 90s and wrote books on the subject a couple of years later. Out of this came Spring, Hibernate, JDO and a lot of other interesting alternatives. At the same time Sun were still trying to keep their baby alive; EJBs are still on life support.
So, grid, where are we with grid? Assembler, C, C++, Java, J2EE and now grid. The leading banks dropped a lot of their J2EE and moved into grid in the early-mid 2000s (the noughties). Today the vast majority of banks are running some sort of compute or data grid; I'm sure there must be a few but I don't know an investment bank without it. Like most technologies grid is a rather loose classification: some systems fall within the classification but others might seem better classified as clustering or distributed ESB groups.
If you look up Grid in Wikipedia there's no simple answer, it starts... "Grid computing is a phrase in distributed computing which can have several meanings" it goes on to describe a number of meanings. While something like SETI@home is technically grid, it's not the grid I'm addressing here, "my" grid is a network of servers or blades in one or a few local subnets. Technically anything from as few as 2 or 3 machines (also known as a cluster) to several thousand. It's safe to say grid is distributed computing, similar to parallel computing but there's a larger element of distributed processes.
An interesting reason why grid has taken off is the sideways move in Moore's law. In the past it was simply the clock speed or bus width that changed as we moved up the Moore's law graph. We could write a little for-loop and expect it to run roughly twice as fast every 24 months; that's 1,000 times faster in 20 years. Over the last few years, though, this little for-loop will still be running at the same speed. Clock speeds have now peaked as dissipating heat has become a major issue. We've already hit a brick wall at around 3-4GHz, so to continue we've moved "sideways" into multi-core. The difference today is that you can run three or more for-loops at the same time without slowing down the first one.
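To make the "three for-loops at the same time" point concrete, here is a minimal sketch using the standard `java.util.concurrent` executor API. The class name `ParallelLoops` and the loop body are purely illustrative assumptions; whether the loops really run without slowing each other down depends on how many cores the machine has free.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: several independent for-loops, each on its own thread.
class ParallelLoops {

    // The "little for-loop": sums the integers 1..n.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Runs one loop per limit concurrently and returns the results in input order.
    static List<Long> runAll(int... limits) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, limits.length));
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int limit : limits) {
                Callable<Long> loop = () -> sumTo(limit);
                futures.add(pool.submit(loop));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> future : futures) {
                results.add(future.get()); // blocks until that loop has finished
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("a loop failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(1_000, 2_000, 3_000));
    }
}
```

Each loop here is independent of the others, which is the easy case; loops that share state are where the concurrency work really starts.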
Multi-core has come at a cost though, we finally have to embrace concurrency, something many programmers have happily ignored to date. Concurrency isn't just a Java problem, most of the office applications we use today have had to be largely re-designed to make use of these extra cores in the same machine. You can't simply leave it to the operating system to guess how to parallelise your application and very few applications were written with this idea in mind back in the 80s, 90s and early 00s. More work for the programmers, more money, we're all happy.
Over the last few years, and interestingly just a few weeks ago, I have been asked to review architectures and code. The recent one was very typical in that it had been designed in modules and nicely de-coupled using queues. In this case they had assumed several JVMs running on one or several machines linked via a queuing mechanism; not a bad idea on paper but in practice it runs like a dog. Every single call to another module has to be serialised and de-serialised over the network, localhost in many cases, but there's no optimisation for local calls. This was a problem experienced by the early EJB programmers: everything went through the remote interfaces even if it was running on the same machine, stateless to entity bean for example. Sun applied a "patch" by introducing the local interface. While fixing the remote problem it meant that the programmer had to dictate the topology of the deployment; it worked for performance but wasn't a great fix from a design point of view. Spring provided a much more elegant solution offering abstraction of location. By looking at the class loader it could determine whether the calculation should be local or remote.
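The idea of location abstraction can be sketched in a few lines. This is a toy under stated assumptions, not Spring's or any EJB container's mechanism: all names (`Valuer`, `LocalValuer`, `SerialisingValuer`, `Trade`) are hypothetical, and the "remote" path is only simulated by a Java serialisation round trip inside one JVM. The point is that the caller programs to one interface while the deployment decides whether the serialisation cost is paid.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical names throughout; this is not Spring's or any EJB container's API.
class LocationAbstraction {

    // A request that must be Serializable so that it could cross a JVM boundary.
    static class Trade implements Serializable {
        private static final long serialVersionUID = 1L;
        final String symbol;
        final int quantity;
        final double price;

        Trade(String symbol, int quantity, double price) {
            this.symbol = symbol;
            this.quantity = quantity;
            this.price = price;
        }
    }

    // The only type the calling code ever sees.
    interface Valuer {
        double value(Trade trade);
    }

    // Same-JVM implementation: a plain method call, no copying.
    static class LocalValuer implements Valuer {
        public double value(Trade trade) {
            return trade.quantity * trade.price;
        }
    }

    // Stand-in for a remote call: copy the argument via serialisation, then delegate.
    static class SerialisingValuer implements Valuer {
        private final Valuer target;

        SerialisingValuer(Valuer target) {
            this.target = target;
        }

        public double value(Trade trade) {
            return target.value(roundTrip(trade));
        }

        // Serialises and de-serialises the trade, as every remote call would have to.
        static Trade roundTrip(Trade trade) {
            try {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                    out.writeObject(trade);
                }
                try (ObjectInputStream in = new ObjectInputStream(
                        new ByteArrayInputStream(bytes.toByteArray()))) {
                    return (Trade) in.readObject();
                }
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException("serialisation failed", e);
            }
        }
    }

    // The deployment, not the caller, decides which implementation is handed out.
    static Valuer valuer(boolean sameJvm) {
        Valuer local = new LocalValuer();
        return sameJvm ? local : new SerialisingValuer(local);
    }

    public static void main(String[] args) {
        Trade trade = new Trade("VOD.L", 100, 1.5);
        System.out.println(valuer(true).value(trade));  // direct call
        System.out.println(valuer(false).value(trade)); // same answer, extra copying
    }
}
```

Both calls return the same value; only the second pays for the copy, which is the overhead the queue-linked design above pays on every call.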
As we move into the mature age of grid, many of these teething problems have already been sorted. Today a modern grid framework provides a location-abstracted interface allowing the programmers to simply program as if they're working on a single VM. In the Java world a few key vendors stand out: GigaSpaces, Tangosol (now Oracle), Gemstone and Terracotta. There are several new grid frameworks on the scene but I haven't seen these in the banking world yet. There's plenty of space for everyone but only those on the list above seem to be making any money out of grid. In a nutshell I'd place the niches of these vendors as: GigaSpaces, best implementation of the master/worker pattern; Tangosol, best data caching; Gemstone, most mature with a niche in native C; and Terracotta, a nice open source option.
Now as soon as this is posted I know each of the players will comment saying that theirs is the best overall product and that the others are simply crap. Each will be able to dig up examples of where they've won over the competition and each will be able to demonstrate holes in the others' architecture, but at the end of the day they all work well if you choose the right tool for the problem.
The reason each of these is different is the API they've chosen to access what's under the hood. GigaSpaces originated from the JavaSpaces world, part of Jini and older than J2EE. The core interface is therefore Sun's JavaSpaces API, an incredibly simple, 4-method API that allows the programmer to read, write, take and notify (be notified). While this works nicely for data it was primarily designed as a services interface. If I was designing a Java API for services, this (with a few alterations) would be it. To communicate from one application to another an object (which should implement a no-method interface called Entry) is written to the "Space", a container, and the recipient either reads or takes the Entry after an optional notification. This can work transactionally, point to point and in a publish/subscribe architecture.
Tangosol's Coherence, recently snapped up by Oracle, grew out of the straightforward caching needs of the early J2EE days. The API is simply a HashMap; every Java programmer knows the API out of the box. Coherence is great for storing and retrieving objects. Since the maps can be transactional, communication can take place by two applications sharing the same map: one puts an object into the map, the other gets it out. Publish/subscribe has the rather interesting feature of "last value", something missing from the usual pub/sub messaging vendors.
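The write/read/take/notify style of interaction can be illustrated with a single-JVM toy. To be clear about the assumptions: `ToySpace` below is a hypothetical class, not the real `net.jini.space.JavaSpace` interface, which additionally deals in transactions, leases and timeouts and matches entries against template objects rather than predicates.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// A single-JVM toy, NOT the real JavaSpaces API: no transactions, leases,
// timeouts or distribution, and templates are modelled as predicates.
class ToySpace<E> {
    private final List<E> entries = new ArrayList<>();
    private final List<Consumer<E>> listeners = new ArrayList<>();

    // write: place an entry in the space and tell any registered listeners.
    public synchronized void write(E entry) {
        entries.add(entry);
        for (Consumer<E> listener : listeners) {
            listener.accept(entry);
        }
    }

    // read: return a matching entry but leave it in the space (null if none).
    public synchronized E read(Predicate<E> template) {
        for (E entry : entries) {
            if (template.test(entry)) {
                return entry;
            }
        }
        return null;
    }

    // take: return a matching entry and remove it from the space (null if none).
    public synchronized E take(Predicate<E> template) {
        for (Iterator<E> it = entries.iterator(); it.hasNext();) {
            E entry = it.next();
            if (template.test(entry)) {
                it.remove();
                return entry;
            }
        }
        return null;
    }

    // notify: register interest in entries written from now on.
    public synchronized void notifyOnWrite(Consumer<E> listener) {
        listeners.add(listener);
    }

    public static void main(String[] args) {
        ToySpace<String> space = new ToySpace<>();
        space.notifyOnWrite(e -> System.out.println("written: " + e));
        space.write("order-1");
        System.out.println(space.read(e -> e.startsWith("order"))); // still in the space
        System.out.println(space.take(e -> e.startsWith("order"))); // now removed
        System.out.println(space.take(e -> e.startsWith("order"))); // nothing left
    }
}
```

Because `take` removes the entry, exactly one recipient gets it (point to point), whereas `read` plus notification lets many recipients see the same entry (publish/subscribe).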
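The "last value" behaviour falls out of map semantics, as the following sketch suggests. It uses a plain `ConcurrentHashMap` in one process as a stand-in for a distributed cache; it is not Coherence's API, and the class and method names (`LastValueMap`, `publish`, `latest`) are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Plain java.util.concurrent standing in for a distributed cache; not Coherence's API.
class LastValueMap {
    // In a data grid this map would be shared across JVMs; here it is one process.
    static final Map<String, Double> prices = new ConcurrentHashMap<>();

    // Publisher side: each put replaces the previous value held for the key.
    static void publish(String symbol, double price) {
        prices.put(symbol, price);
    }

    // Subscriber side: a late joiner sees only the most recent value, or null.
    static Double latest(String symbol) {
        return prices.get(symbol);
    }

    public static void main(String[] args) {
        publish("VOD.L", 150.25);
        publish("VOD.L", 150.50);
        publish("VOD.L", 150.75);
        System.out.println(latest("VOD.L"));  // only the last of the three updates
        System.out.println(latest("BARC.L")); // never published
    }
}
```

A conventional pub/sub topic would have delivered nothing to a subscriber that connected after the third update; the map still holds the current price.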
Gemstone was originally an object-oriented database, but like the previous two examples one application can write the object into the database (in memory) and the other application can read it.
Perhaps you can see a pattern here, objects being written to containers, some local some remote but in both cases the container is distributed and the mechanism is hidden from us through the API. This then is what a Java grid is, a distributed container.
Normally I'd stop there, as I think I've answered the question. However, I'd like to point out some interesting trends that are emerging from the use of grid. The first is the use of memory over disk. In the past an average day's trading would need to be written to disk. Although we were much more efficient with data in the early days (we didn't use XML, for one), we still had to read and write on and off disk. The addressable memory was bounded by the processor's addressing limits: 16 bits can only address 64K and 32 bits 4GB, but 64 bits will address 18 million terabytes. To top it all, ZFS, Sun's new file system, uses 128 bits; the argument goes that it's all you will EVER need. With the ability to address vast amounts of memory, and with distributed frameworks hiding local and remote storage, we can now seriously design with the expectation of hundreds of GB of RAM being available. Replicate this and why would we ever need to touch disk other than for long-term storage? The trend now is to use memory for most day-to-day work and disk for what we used to think of as archiving. Amusingly this trend is self-perpetuating: the more we use the technology, the more of it we need.
The second trend is perhaps more a result of globalization than of grid, but grid and the use of distributed memory have been a serious enabler. This is the increasing use of objects and hierarchical structures in memory as opposed to being centred around relational databases. For the reasons above we can reduce our dependence on disk by storing data in memory; a HashMap or JavaSpace is after all a perfectly good way to store something for a few seconds, minutes or hours. Why then re-map this to a relational structure when the data can usually be stored in its original form, perhaps an object or XML? Queries are no problem and the increased number of threads means we can even distribute queries. Perhaps GemStone were the first to see this but I'm sure I'll be corrected. As a result grid is replacing the need for complex ORM layers, vastly simplifying design. Not only does grid solve our business problems but it also improves time to market and maintainability.
The clever vendors are now programming to a grid culture: simple thread-safe POJOs with self-contained functionality and defined behaviour; you and I know these as objects. As we strive to get every last cycle out of the CPU we're looking to make the most of the CPU's direct memory. Distribution is inevitable but, like the way a CPU's memory cache works, we try to keep network usage to a minimum. Advances in CPU technology and the ever-diminishing price of memory are playing right into our hands. Grid, however you define it, is here to stay. Not only does it solve many of today's problems, it is also a very viable implementation of a Service Oriented Architecture (SOA), and the business people and analysts love that.
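The addressing figures quoted above are easy to check. A small sketch, assuming byte addressing and decimal terabytes (10^12 bytes); the class name `AddressSpace` is illustrative only:

```java
import java.math.BigInteger;

// Checks the addressing limits quoted in the text, assuming byte-addressable memory.
class AddressSpace {
    // Number of distinct byte addresses reachable with the given pointer width.
    static BigInteger addressableBytes(int bits) {
        return BigInteger.ONE.shiftLeft(bits); // 2^bits
    }

    // Whole decimal terabytes (10^12 bytes) addressable with the given width.
    static BigInteger terabytes(int bits) {
        return addressableBytes(bits).divide(BigInteger.TEN.pow(12));
    }

    public static void main(String[] args) {
        for (int bits : new int[] {16, 32, 64}) {
            System.out.println(bits + " bits: " + addressableBytes(bits) + " bytes");
        }
        System.out.println("64 bits in TB: " + terabytes(64));
    }
}
```

For 64 bits this gives 18,446,744 terabytes, which is where the "18 million" figure comes from.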
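As a rough illustration of querying objects in their original form, here is a sketch that filters and aggregates an in-memory collection across cores. It uses Java's parallel streams within one JVM as a stand-in for a distributed grid query; no vendor's query API is shown, and the `Trade` class and sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical domain class and data; parallel streams stand in for a grid query.
class InMemoryQuery {
    static class Trade {
        final String symbol;
        final int quantity;

        Trade(String symbol, int quantity) {
            this.symbol = symbol;
            this.quantity = quantity;
        }
    }

    // Queries the objects as they are: no mapping to tables, rows or an ORM layer.
    static int totalQuantity(List<Trade> trades, String symbol) {
        return trades.parallelStream()              // work may be split across cores
                .filter(t -> t.symbol.equals(symbol))
                .mapToInt(t -> t.quantity)
                .sum();
    }

    public static void main(String[] args) {
        List<Trade> trades = Arrays.asList(
                new Trade("VOD.L", 100),
                new Trade("BARC.L", 250),
                new Trade("VOD.L", 50));
        System.out.println(totalQuantity(trades, "VOD.L"));
    }
}
```

On a grid the same filter-then-combine shape would run against each node's share of the data, with the partial sums merged at the end.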
About the author
John is CTO/co-founder of C24 and Technical Director of IONA. Founded in 2000, C24 hit the market a year later with an innovative Java-binding technology for financial services messaging standards. Integration Objects (IO) can generate Java code for almost every financial services messaging standard, from FpML (including validation rules) to SWIFT. Its unique niche in the market has seen it OEMd by many of the leading middleware, messaging and application server vendors. The largest clients now feed over $500 million of trades a day through the code; the fastest process thousands of messages per second.
The wide coverage of C24-IO has given John a unique view into the internals of many of the world's leading financial institutions. John has nearly 20 years in Investment Banking and over 25 years in IT, mostly as a consultant. He has co-authored several books on Java and J2EE, was the author of Learning Tree's distributed Java course and is a regular speaker on grid, Jini and JavaSpaces in the Java and banking world. Over the years John has held more than one high-profile position as Head of Technical Architect in banks such as JPMorgan.
You fail to mention Infiniflow, the industry's only OSGi / SCA based distributed self-healing service fabric. In addition to providing traditional Grid type behavior for Spring and Java POJO based applications, Infiniflow replaces the very concept of Application Service and Messaging Middleware - with a dynamic OSGi based, service based runtime.
Come and see our presentation at NYJavaSig next week.
A couple other things
Re: A couple other things
Now as soon as this is posted I know each of the players will comment saying that theirs is the best over all product and that the others are simply crap. Each will be able to dig up examples of where they've won over the competition and each will be able to demonstrate holes in the other's architecture but at the end of the day they all work well if you chose the right tool for the problem
I'll save you and the rest of the community that pleasure this time :)
I would however point you to a recent discussion on this topic here and here which provides further clarification of GigaSpaces' position as it relates to Grid, Scalability, Caching etc.
Write Once Scale Anywhere
Another point is that for me GigaSpaces is much more of a data grid than a compute grid (master/worker has nothing to do with either one – it can be implemented in any product whatsoever). Yet another point is that Terracotta’s strength is not in its open source license (which requires attribution in its case, btw) but in its unique technological approach of clustering low-level Java semantics (like synchronization, etc.). Moreover, GigaSpaces too has an open source option (OpenSpaces), albeit limited.
Having certain actual experience in this area I can say that the key characteristic of any mature *compute* grid product is rich support for Map/Reduce logic (which is an essence of parallel computing in general dating back to MPI time). Recent fascination and confusion about Master/Worker pattern in relation to grid computing is rather startling: since when an old rusty RPC became a staple of grid computing :-)
GridGain - Grid Computing Made Simple
Re: Few comments
Recent fascination and confusion about Master/Worker pattern in relation to grid computing is rather startling: since when an old rusty RPC became a staple of grid computing :-)
Since JavaSpaces vendors had to find a way to be "grid" vendors!
(Sorry .. I couldn't resist .. ;-)
Oracle Coherence: Data Grid for Java and .NET
Re: Few comments
John's article was one of the best, and balanced, I've seen on the state of things in a long time. From a customer perspective, it's hard to put a value on how useful this is. We're still fighting a rear-guard action against point-to-point integration, propping up badly done EAI from the nineties, having to sift through SOA-bollocks from 'consultancies' (while the board continually asks if it's done yet), rationalising ESB use, and now worrying about EDA and XTP. As an architect it's hard enough convincing the business that any approach is worth the extra effort in these circumstances, and if the convincing takes too long, the developers have usually gone off and dropped another tactical solution into operations for us to worry about next year.
What adds wonderfully to this is vendors redefining concepts and pointing out perceived flaws in competing products rather than highlighting the particular sweet-spots for their own. I implemented a master/worker pattern for order management in GigaSpaces a while back, and having recently spoken to the business director that still owns it, can attest that, as John says, we chose 'the right tool for the problem'.
However, these days I'm working for a trading organisation, real-time events and parallel processing are high on our agenda, there are subtle (and sometimes not so subtle) differences in operational semantics of the front and back offices that will influence our choices. As we move toward product selection it's information that’s useful, not squabbling.
Re: Paremus Infiniflow
You're right and I'm sorry I didn't mention Infiniflow, there was a point when I thought, shall I make an exhaustive list of players at this point or just stick to my usual "group" of vendors, well you can see the direction I chose. You guys have given me a great demo of Infiniflow and Newton and it's definitely the most sexy grid demo I've seen to date. I know some of your customers but unlike the ones I talked about I've not had hands-on experience so felt unqualified to talk about what you do in detail. I hope this will change some time soon.
Re: Few comments
I saw your presentation at JavaZone, GridGain is seriously cool but as per my comments above, I've not had a huge amount of hands on and I've not seen it in any of the banks I'm working in yet (YET) so I avoided it, perhaps wrongly. It's the simplicity of GridGain that makes it so powerful and I expect to see it playing a bigger part of Java-based grid in the coming months.
I did make a conscious effort to avoid DataSynapse and Platform despite having used both in a large bank, one of whom currently funds one of the above. My reason for avoiding them in this paper was simply that they're not Java specific and I was positioning this towards Java readers. I've now dug myself a small hole because this means I really should have mentioned GridGain; version II perhaps.
I disagree with you about JavaSpaces being more data-grid than compute-grid but as you say it just depends on how you choose to use it; the point we are both making is that you can use it either way.
JavaZone -> Oslo -> Norway :)
Thanks for a great time guys.. Kirk, Cameron, John - great presentations, great fun..
"I caught up with John Davies while at JavaZone in Olso Denmark."
JavaZone -> Oslo -> Norway :)
For the American readers, Oslo, Denmark and Norway are all in northern Europe. :-)
I know what Kirk meant to say: we actually met up at JavaZone in Oslo, Norway and then the following week at JAOO in Arhus, Denmark. Last week we met in London and later this week we'll meet again, this time for some wine tasting in Tokaj, Hungary rather than Java. I'm sure the topic will come up though as I'll be "stuck" in a wine cave on Saturday while England play in the Rugby World Cup final. :-(
Oracle Coherence: Data Grid for Java, .NET and C++