
Ari Zilka on Terracotta, Clustering and Open Source


1. This is Floyd Marinescu at QCon. I'm here with Ari Zilka. Ari, tell us a bit about yourself.

Hello Floyd. Again, my name is Ari and I am CTO and one of the cofounders of Terracotta. I founded it with my coworkers from where I was chief architect, and I have spent a lot of time in and around development at enterprise scale: lots of servers, very large companies. Terracotta's goal is basically to take the experiences we have had in terms of scale and in terms of high availability and build a solution for the Java community that is about scalability and availability, but with less of an impact on the day-to-day development world.


2. What is this simple programming model that open Terracotta allows?

Our open source project is basically all we've got (there's no commercial versus open), and in this project what we are enabling people to do is to write to what looks like a single JVM but is actually a cluster of JVMs. So you don't need JTA, you don't need EJBs to access temporal data, you don't need messaging to replicate state, you don't need JGroups, you don't need to do all this by hand, signaling across machines, because the machines talk to each other as if they were in the same process space. And they do that in a scalable fashion, we think, because we have got a central server that is actually acting as a traffic cop enabling all these conversations between JVMs.


3. So how can you scale in a single server environment?

We do have a single server; it clusters in an active-passive mode for HA (high availability, for those who don't do enterprise stuff all the time), and a lot of people are concerned about that as a bottleneck. I was shocked actually: I was having dinner with a very senior engineering leader and one of our customers a couple of weeks ago, and he asserted to me that scale for Terracotta is much better than any other clustering solution because we are not multicasting, we are not peer-to-peer, and that we should be beating everyone out there at clustering performance. I was shocked because he was the first person I have ever seen who has had this opinion. The short answer as to why he shares our opinion around scalability is that with a central server Terracotta can enable a lot of point-to-point conversations. If two JVMs are sharing an object in a two-thousand-JVM cluster, only those two JVMs need to know about changes. What I call that is an O(1) solution versus an O(N) solution. So in a peer-to-peer network you eventually have to share all the data everywhere or you are partitioning the data up. In a Terracotta network these "partition spaces" occur sort of organically based on how load is sent around a cluster, and as a result, even if you don't partition data up, if it organically partitions in any way you get sub-N performance, meaning if there are N nodes in the cluster I don't have to tell them all about changes to objects and I don't need an entire cluster to participate in and acknowledge all transactions. So the point-to-point capability is optimized by a central server and that's how we get scale.
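The O(1) versus O(N) contrast above can be made concrete with a small counting sketch. This is not Terracotta's implementation; the class `UpdateFanout`, its methods, and the object id used below are hypothetical names chosen only to illustrate the arithmetic, assuming a central registry that knows which nodes currently hold each shared object.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: counts how many nodes must hear about one change.
public class UpdateFanout {
    // Hypothetical registry a central server could keep: object id -> nodes holding it.
    private final Map<String, Set<Integer>> holders = new HashMap<>();
    private final int clusterSize;

    public UpdateFanout(int clusterSize) {
        this.clusterSize = clusterSize;
    }

    public void registerHolder(String objectId, int nodeId) {
        holders.computeIfAbsent(objectId, k -> new HashSet<>()).add(nodeId);
    }

    // Broadcast style: every other node in the cluster is told about the change.
    public int broadcastRecipients() {
        return clusterSize - 1;
    }

    // Routed style: only the other nodes that actually hold the object are told.
    public int routedRecipients(String objectId, int writerNode) {
        Set<Integer> nodes = holders.getOrDefault(objectId, Collections.<Integer>emptySet());
        int count = 0;
        for (int node : nodes) {
            if (node != writerNode) {
                count++;
            }
        }
        return count;
    }
}
```

With 2,000 nodes and an object held by only two of them, `broadcastRecipients()` is 1,999 while `routedRecipients(...)` for a write from one holder is 1, and the second number does not grow with cluster size.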


4. Do you have a lot of customers working with Java primitives and actually clustering in those kinds of environments?

We have very many customers who have started with applications that couldn't be clustered, and they are all Java primitives, meaning the java.util collections, and util.concurrent now under JDK 1.5, and they are clustering with just those applications, just those primitives: hash maps, they flag them as clustered; linked blocking queues, they flag them as clustered and use that instead of messaging or JavaSpaces-type approaches, and they are very happy. That being said, we have customers who are actually going from other clustering approaches where they weren't using primitives and they were using direct API calls, and they were ripping out things like JMS, which is a harmless example where I am not saying messaging is bad, I am not saying clustering APIs are bad, I am just saying that people, when presented with the opportunity to use Java primitives, will go back to the Java primitives over big enterprise-y type architectures.


5. That is very interesting. Tell us a bit more about how can you replace JMS with concurrent data structures?

Absolutely. The way most people are doing it on Terracotta today is using util.concurrent, the stuff from Oswego that got folded into the JDK. They are using linked blocking queues with simple operations to push and pop data on those queues, flagging those queues declaratively, meaning inside our config files, saying those queues are shared, then using thread pools. For example, in one JVM I have five threads pulling work off of a queue and doing work, and then if I cluster that application to ten nodes I now have 50 threads working on the same pool of work. In fact you can see that demo inside our kit if you download our software; we built that as a demo for people to just be able to run or extend, and we also have a Terracotta Forge that allows people to get to the queue implementations without having to write them with the primitives directly.
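The single-JVM half of that pattern can be sketched with plain java.util.concurrent. This is a minimal illustration, not the demo from the Terracotta kit: the class name `QueueWorkers`, the `process` method, and the poison-pill shutdown convention are assumptions made for the example, and the declarative step of flagging the queue as shared in a Terracotta config file is not shown.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class QueueWorkers {
    // Sentinel telling a worker to stop (a convention of this sketch).
    private static final int POISON = -1;

    // Starts `workers` threads that pull `jobs` items off one queue; returns how many were processed.
    public static int process(int workers, int jobs) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        for (int w = 0; w < workers; w++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        int item = queue.take();      // blocks until work arrives
                        if (item == POISON) {
                            return;
                        }
                        processed.incrementAndGet();  // stand-in for real work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        for (int j = 0; j < jobs; j++) {
            queue.add(j);                             // unbounded queue, so add never fails
        }
        for (int w = 0; w < workers; w++) {
            queue.add(POISON);                        // one stop signal per worker
        }

        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }
}
```

Calling `QueueWorkers.process(5, 100)` mirrors the "five threads pulling work off of a queue" case; the interview's claim is that the same code, with the queue declared shared, would have every node's workers draining the same queue.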


6. It sounds like it simplifies things, but also could make things more complex; I mean, multithreaded programming is hard. So with Terracotta, are a lot of people going back and doing computer-science-type problems?

Right. I love that question. You're not the first to ask me that question; I actually got asked that question by one of the biggest banks in the world and they actually paid us to write a paper analyzing the complexities introduced by Terracotta, and what I came up with, that they seemed satisfied by, is: the whole world for the Java developer can be ignoring the notion of just a main, where you write your own application, because there are some applications that cannot cluster, should not cluster, makes no sense to cluster. Setting that kind of thing aside, when you do have an application that's long lived, that has state that you want to keep shared across machines, or when you want to take a business problem and divide it amongst a bunch of servers, then what we found is that this notion of the application server is a very generic notion; it's not a J2EE versus J2SE-only kind of notion, it's not web app or EJB, it's none of those things. The notion of an app server should be thought of as a container for constraining development in an efficient way relative to the business problem. What do I mean by that? Well, in banks you see a lot of need to divide-and-conquer. I need to process more data than can be processed in a single day. So to divide-and-conquer, I want to scatter out all the work and gather up all the sub-answers when all the workers are complete; it's sort of the Google MapReduce, and if someone sold Google in a box, I think they would have many customers in financial services and telco who would buy.

But that's just a container to Terracotta. To Terracotta, clustering POJOs is not enough for many people; you need to provide them containers and that is where the Terracotta Forge comes in, where we basically say: "We are going to build for people the abstractions that they need and those abstractions will also be open-sourced so that if the abstraction is not exactly the way you want it, you can add to it, edit it, shrink it, whatever you want to do." But the containers that I found that exist are: the POJO container, meaning "cluster this primitive", and that will be enough for me to build a clustered version in my application. And if you have the POJO container then you can build framework clustering on top of that, because most frameworks are built on top of primitives. Then you need the Grid container or the distributed workload container; then you need J2EE containers, meaning enterprise Java containers, because people have already written such applications and you have to support those; you need a web application container where you're essentially clustering session; and that's about it. So the POJO container turns out to be able to implement all the other containers using basic primitives: thread pools, queues etc., and then we flag all of that as clustered and then you use that as a springboard to get into a project design. For some people, Terracotta finds that they really want access to the core computer science, the theory of clustering things by hand, but for most people the containers are how they want to get started.


7. Along with supporting clustering primitives, Terracotta also supports sessions and Spring constructs. Can you talk a bit more about that?

Sure. Terracotta is really just about clustering POJOs, both the data and then clustering thread coordination and eventing around and inside the JVM, and that is why frameworks tend to use us. So Wicket, RIFE, Struts and Struts 2 as web frameworks, then containers such as Tomcat, Geronimo and Jetty: they are all cooperating with Terracotta and trying to make sure that their users can use Terracotta if they wish.

That basically leads us down this path where we are helping people cluster POJOs and primitives, but the reality is many people just want to be able to use those frameworks with, the phrase I always use is, with impunity. They want to not have to think about clustering, they want to not have to think about: "If I want to use Struts and cluster it, then I need Struts plus Hibernate and I need to Hibernate all my bean state down into a database and build a stateless application." They would much rather just grab a book on Struts, use Struts the way it's documented, and not have to think about all this enterprise-y type stuff. So what we realized is that instead of having everybody deal straight with primitives, we can prebuild solutions when people are trying to use frameworks that are largely adopted, so sessions and Spring for us are basically something where Terracotta says: "You know what, Tomcat is the same Tomcat for everyone". So we'll cluster Tomcat for you and we work with the Tomcat guys, they're contributors to Terracotta; we work with the Spring guys and they're contributors to Terracotta; and at the end of the day it means you can get started on clustered Spring or clustered Tomcat much faster than someone who is trying to cluster their own Java application from scratch by hand.


8. How does Terracotta treat object identity across the cluster?

For those who don't know object identity: it's an over-simplification, but it's the "==" sign versus ".equals()". You don't want objects to be merely logically equivalent, you want them to be the exact same reference, if you as a developer meant the object to be the same object. So map.get() twice should return the exact same object reference, for example, and "==" should return true. We honor object identity across a cluster by using our central server as an address-space mapper, and that means that object twenty-three in server one we know is actually object five-hundred thirty-two in server number two, and what that means for the developer is a return to freedom to build the domain model the way you want to build it.

So you can put lists and maps; most of our customers in terms of use cases and case studies end up with what we have now coined as maps of maps, and they do that so they can get at a subset of a cache or a clustered data object graph without getting the whole thing. If you want an iterator, for example, in a cluster, it's very expensive to page all that data around in the cluster, and you can get an iterator to a subset of a cache very easily if you have maps of maps. We also allow you to put objects into multiple maps, which makes many of our customers happy because it means smaller caching spaces: you don't have redundant storage of an object in all the different indexes, you keep it once. So you could have a map of customers by ID and another map of customers by zip code, but it's the same customer in both maps, which means if I go change the customer's password by ID and then I go pull that customer by zip code it's got the same password. Without having the maps reference each other only by ID, you can actually use a rich Java native reference to those objects in both maps and it's one object.
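The customer example reads naturally as ordinary Java. The sketch below runs in a single JVM and shows only the language-level behavior being described (one object reachable from two indexes, with "==" holding); that the same identity is preserved across JVMs is the interview's claim about Terracotta, not something this code demonstrates. The names `SharedCustomerIndexes` and `Customer`, and the choice of a map of maps for the zip-code index, are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedCustomerIndexes {
    public static final class Customer {
        public final int id;
        public final String zip;
        public String password;

        public Customer(int id, String zip, String password) {
            this.id = id;
            this.zip = zip;
            this.password = password;
        }
    }

    // Index 1: customers by ID.
    public final Map<Integer, Customer> byId = new HashMap<>();
    // Index 2: a "map of maps", zip code -> (ID -> customer), so one zip can be read as a subset.
    public final Map<String, Map<Integer, Customer>> byZip = new HashMap<>();

    // Stores the same reference in both indexes; no copy is made.
    public void add(Customer c) {
        byId.put(c.id, c);
        byZip.computeIfAbsent(c.zip, k -> new HashMap<>()).put(c.id, c);
    }

    public static void main(String[] args) {
        SharedCustomerIndexes idx = new SharedCustomerIndexes();
        idx.add(new Customer(1, "94107", "old-secret"));

        idx.byId.get(1).password = "new-secret";          // change via the ID index
        Customer viaZip = idx.byZip.get("94107").get(1);   // read via the zip index

        System.out.println(viaZip == idx.byId.get(1));     // same reference in both maps
        System.out.println(viaZip.password);               // sees the updated password
    }
}
```

Iterating `idx.byZip.get("94107").values()` walks only the customers in that zip code, which is the "subset of a cache" benefit attributed to maps of maps above.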


9. Describe some typical use cases where you see people using Terracotta and how does it map into your typical web app environment.

The four use cases that we figured out that people use us for are: 1. Distributed caching, meaning I have a map, or a table or a list or some sort of collection and that collection works as a cache in my application, a singleton, something like that, and I want to flag it as clustered, and then if I could cluster that collection, I'd have a clustered cache. Any node that puts anything in the collection is putting it in for all and any node that reads can read anything; any node that calls get is basically getting from the super collection that all nodes put into together. So that is use case 1.

Use case 2 is a clustered session. For web apps that's really useful, and we've got config modules now where we support out-of-the-box Tomcat, WebLogic, JBoss; we are soon to support Jetty and WebSphere: not just WebSphere CE, but WebSphere. We support Geronimo and WebSphere CE already. When we support something then you can just say: "This is the name of my web app", to answer your question directly, and we'll turn it clustered by clustering all its sessions and its contexts as appropriate. As opposed to raw Terracotta, where you have to describe the POJOs you want clustered, with Terracotta for Sessions you basically just name an application, deployed as a WAR, that you want clustered. With our config modules though you will be able to extend us to other containers that we don't support, such as Resin, OC4J, things like that, very easily without having to reinvent the session module yourself, so it's kind of like an uber-include mechanism where you can say pull all of this stuff in that I need and clone or copy from this other configuration, but have these slight differences, etc. So we expect the community to bring web app support for containers we don't currently support or aren't planning to.

The other two use cases very quickly are: clustered Spring, where you can cluster your Spring beans by name, just say the Spring bean named 'foo' is clustered and not have to actually deal with O/R mapping or some kind of stateless mechanism to get the state from that bean to be replicated across nodes; it'll also cluster JMX inside Spring, it will cluster Spring events and Spring contexts like the application context. And the 4th use case is what we call a distributed workload use case, so basically it's the Google MapReduce in POJOs.
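The distributed workload idea (scatter the work, then gather the sub-answers) can be sketched within one JVM using a thread pool and futures. This is a hedged illustration of the pattern only: `ScatterGather` and `sumInChunks` are hypothetical names, summing an array stands in for real business work, and nothing here is Terracotta-specific.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScatterGather {
    // Splits `data` into up to `chunks` slices, sums each slice on a worker, then combines the results.
    public static long sumInChunks(int[] data, int chunks) {
        if (chunks < 1) {
            throw new IllegalArgumentException("chunks must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            int size = (data.length + chunks - 1) / chunks;   // ceiling division
            List<Future<Long>> partials = new ArrayList<>();

            // Scatter: one task per slice.
            for (int start = 0; start < data.length; start += size) {
                final int from = start;
                final int to = Math.min(data.length, start + size);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += data[i];
                    }
                    return sum;
                };
                partials.add(pool.submit(task));
            }

            // Gather: wait for every sub-answer and combine.
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while gathering", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, `ScatterGather.sumInChunks(data, 4)` over the integers 1 through 1000 splits the work into four slices of 250 elements and combines the four partial sums.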


10. So why did you go open source?

There was a lot of debate for a while when we first announced as to why we open sourced. The simple answer is the user community around Terracotta pulled us there, so we had one too many customer meetings where people said: "This is great, this is the way I want to write applications, but it's magic. I don't know how it works. I'd be much more comfortable if this were open source and I could see inside the box and I knew this is how it works." So we had open source for our customers and we had open source for our partners. All our partners were open source. Eighty percent of our sales and our revenue comes from open source stacks: helping Tomcat customers, helping Geronimo customers, helping Struts users get to clustering faster and easier. But those customers were saying: "We would have already been on Terracotta without your sales team, without all your efforts", if it were open source and bundled with these things. And those framework inventors, when we approached each of them for the bundling, said: "I can't bundle with closed and proprietary technologies. If you were open source it would make a ton of sense."

So we realized that we could really lower the cost of operation and go from a traditional software company with a quota-carrying sales force running around and knocking on people's doors and bugging every developer in the world and the kind of person developers try to avoid desperately and operation folks avoid desperately. We realized that people could get to a world where they'd pull us in and they'd try us out, and if we were what we claimed and were capable of what we claim that we're capable of, that people were willing to trial us on their own, and that we could save money and benefit the community at the same time. So our data basically said that people were ready and willing to trial Terracotta on their own, without our intervention. And that was the key signal that said it's time to open source because the demand was there and people would use it on their own and it was no longer a question of: "If we open it will they come?"


11. It's been a short while since you open sourced. How has it gone? What's been the impact on the business?

It's completely reinvented the business. My favorite impact, and I talked to a lot of open source companies, with our CEO, about what they experienced in being open source from day one, or going open source if they were proprietary and commercial before, and what we found in general was what we have now experienced, which is: you give up your belief as to what is the crown jewels of the company.

So most proprietary technology companies believe that their core invention, their technology, is the crown jewels of the company and that people are buying you for your technology, and what open source forces you to do is to give away that core technology and find a way to add value for people. And what has reinvented the company is that we've had to forgo the notion that JVM-level clustering is something people pay for; it's something people should have a right to use when they want to use it, and we now have to invent products around JVM-level clustering that add value for the big customers who have money and want to pay for services around that. I mean, in the short term we are doing what every open source company does: we sell support, we sell training for our projects and our products and our technologies, and people come to us because we are the source. That is not the thing that's impacted us at the core as much as knowing that we have just given up everything that we thought we could hold the customer over a barrel around, the core technology, and now we have to go find things that actually add value. At the same time we get the community around us and we become a more fully-fledged member of the Java community because people say: "Ok. You've given up your technology for the greater good just like we have and now we'll integrate to you." So that's really, really exciting for me.


12. For a VC-backed commercial company to go open source, I am sure you guys must have had a compelling case for the investors to change. How has that vision of, I would assume, greater profitability been panning out so far?

It's been panning out very well. I mean, without getting into specifics about how much revenue we make or don't make, we were making revenue before and our board of directors has a great term: they call the company that is making revenue 'pregnant', meaning you have got a business model, it's working and you are generally not capable of, not willing to, switch business models and start experimenting with new revenue opportunities. And thus you're pregnant: you have to get married to this business model. And we are basically working down a path at this point where we've convinced ourselves that there is a bigger revenue opportunity associated with Terracotta's open source than the vector we were on, which any company that open-sources has to convince themselves of; assuming they are alive and they can survive for a while, they have to convince themselves that things are better on the other side. And of course we were able to convince ourselves of that, and what we've seen in the short time we have been open source is they are better.

So the short answer is people are trialing the software on their own and we've moved forward in the sales cycle. In the classic sales pipeline of awareness, then trial, POC, negotiation, purchase and production launch, people are engaging our team at post-trial, so they've done the POC, they know it works, and they are calling us up and saying "how much?" So we are getting a lot more calls because we are integrated to frameworks that people use day-to-day, so we get calls all the time to cluster Lucene, and then we are getting calls from people later in the pipeline, so we have two-week sales cycles where we used to have six-month sales cycles; where we had to prove to people that we could pass muster, we now have them proving it to themselves. So we have a shorter sales cycle and we have more people calling us because the community has embraced us and we work inside the frameworks that people use every day.


13. So from the perspective of open source business models it seems that the successful stories we know about, and hopefully you'll be extremely successful as well, are the horizontal type: software which can be used in many environments, many contexts like, you could say that clustering is ubiquitous, it's everywhere. Everyone needs to cluster, so naturally Terracotta would work there. What kind of companies, what kind of products do you think would not be suitable for an open source business model?

There are companies who are built explicitly for OEM; they build a technology that enables a technology that eventually a consumer of that verticalized solution consumes. Horizontal products like Terracotta generally feel like they should go towards an OEM, but the difference between us and an OEM play is how broad your horizontal is. So if you're horizontal to everything done in the automotive industry, then you are not a horizontal play, you are just a slice in a vertical play. So those don't suit themselves well to open sourcing because you have just given away the core technology and you have a captive audience as a standard terminology, you have a captive audience in a small group of people to which you could ever sell and they don't care if they can see the implementation or not and no one is going to contribute to your technology.

So if you take the OEM case and bring those points into the macro world that basically gives you a framework for what not to open source, something that no one will ever contribute to outside your company's four walls, something that people don't care about the implementation of, they don't need to see it, we talk a lot in enterprise Java about leaky abstractions and agile methodologies and how to build applications with minimal design, but with design, so that they are sane and maintainable and extensible.

The notion of abstraction is difficult because the two sides of the abstraction barrier have to communicate and that abstraction has to be kind of fluid, otherwise it's not a good abstraction. It may leak over time: an abstraction may have been correct eighteen months ago, but is no longer correct. In the case of open sourcing your technology, if no one ever needs to see your abstraction and it never can leak (and there are very few things that exist in that binary a way), or if the ways in which your abstractions could leak are very few and have very minimal impact on the applications in which you are embedded, then no one really needs to see your implementation. You know, that wall that people like to believe exists in terms of abstractions is very much translucent to me: if you're a good architect and you build good designs, people can see across to the other side and touch the people on the other side of an abstraction and communicate about their needs on each side. That doesn't mean your application knows the implementation, but the developer knows the implementation and does things smartly or asks for changes as a result of his awareness. If he never needs to know the implementation then you don't need to open source it.


14. Any final thoughts for the audience?

Final thoughts are really simple. I think that Terracotta is open sourced because we want the input, we want the feedback and we want the interaction with the community. And I believe that clustering is something that people need as we've already discussed. So I'd like to see people interact with us sooner rather than later: don't wait for the use case, don't wait for the problems in production. Go to, download it, use the trial, convince yourself that it works, and then keep that in the back of your mind when you're talking about your next clustering problem or when you're at a conference listening for someone talking about RIFE or AJAX or continuations or things like that and they are saying that such and such is impossible because of restrictions across JVMs and that you have to do something with messaging or with stateless programming.

Educate yourself about the capabilities that Terracotta introduces to the community and keep that in mind when you are out there building applications.

Nov 13, 2007