POJO Messaging Architecture with Terracotta

Mark Turansky detailed his implementation of a POJO message bus architecture using Terracotta and Java 5. Instead of an MQ- or JMS-based deployment, Mark took advantage of the Terracotta architecture to create his POJO message bus. This allowed for a clean, simple, and inexpensive infrastructure solution to his messaging needs. One item of note about the process was:

Our second implementation... used JMS [because] ActiveMQ is also open source, mature, and Camel looked very cool insofar as they give you a small domain specific language for routing rules between queues.

The team had problems getting ActiveMQ to work well, which led them to revive their research into Terracotta. Thanks to good design, the team swapped out the old messaging structure and brought in the new one with relative ease:

All JMS-related code was hidden by handler/listener interfaces. Our consumer logic did not know where the messages (our own domain objects) came from. Implementations of the handlers and listeners were injected by Spring. As a result, it took just 90 minutes to swap in a crude but effective queueing and routing system using Terracotta. We've since cleaned it up, made it robust, added functionality for business visibility, and load tested the hell out of it. It all works beautifully.
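
The separation described in the quote can be sketched roughly as follows. This is a minimal illustration, not Mark's code: the names `MessageHandler`, `MessageSender`, and `QueueSender` are hypothetical, and the wiring that Spring performed in the original system is done by hand here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class TransportSwapSketch {

    /** Consumer-side contract: business logic only ever sees domain objects. */
    interface MessageHandler<T> {
        void handle(T message);
    }

    /** Producer-side contract: hides how a message is transported. */
    interface MessageSender<T> {
        void send(T message) throws InterruptedException;
    }

    /** In-memory transport. A JMS-backed class could implement the same interface. */
    static final class QueueSender<T> implements MessageSender<T> {
        private final BlockingQueue<T> queue;

        QueueSender(BlockingQueue<T> queue) {
            this.queue = queue;
        }

        @Override
        public void send(T message) throws InterruptedException {
            queue.put(message); // blocks while a bounded queue is full
        }
    }

    /** Delivers everything currently queued to the handler. */
    static <T> void deliverPending(BlockingQueue<T> queue, MessageHandler<T> handler) {
        T next;
        while ((next = queue.poll()) != null) {
            handler.handle(next);
        }
    }

    /** Wires the pieces by hand (assumption: a DI container would do this in practice). */
    public static List<String> demo() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(10);
        MessageSender<String> sender = new QueueSender<>(queue);
        List<String> handled = new ArrayList<>();
        MessageHandler<String> handler = handled::add;
        try {
            sender.send("order-1");
            sender.send("order-2");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        deliverPending(queue, handler);
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the handler depends only on `MessageHandler`, replacing `QueueSender` with another transport would not touch the consumer logic, which is the property the quote credits for the quick swap.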

The main components for the messaging system included:

  • Java 5's concurrent API for queueing (especially bounded java.util.concurrent.LinkedBlockingQueue)
  • some knowledge of Java threading
  • insight into long running Java processes (with help from how Tomcat does it)
  • an above average understanding of classloaders and how to name them
  • and how to "spool" requests

The rest came from investigating and understanding how Terracotta works with classloaders, objects in clustered memory, and concurrency. Terracotta itself provided the messaging backbone and glue:

Last but not least, you need some queues, you need multi-JVM consumers, you need persistent data (a message store) that won't get wiped out with a catastrophic failure, you need business visibility to track health and status of all queues and consumers, and you need to glue them all together. Terracotta Server handles these requirements very well.

In the end, Terracotta allowed for using the Java 5 concurrent API to provide "in memory" POJO message queues for the Producers to post onto and for competing Consumers to pull from. By implementing the Consumers as long running daemon processes using bootstrap loaders, the architecture provided grid computing capabilities with POJO simplicity.
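
A single-JVM sketch of the producer and competing-consumer pattern on a bounded `LinkedBlockingQueue` might look like the following. It makes several assumptions: the class name, the `STOP` shutdown marker, and the queue capacity are all invented for illustration, and the Terracotta configuration that would share the queue across JVMs is not shown.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CompetingConsumersSketch {

    // Hypothetical shutdown marker; one is queued per consumer.
    private static final String STOP = "__STOP__";

    /** Posts the given number of messages and returns how many were consumed. */
    public static int run(int messages, int consumers) {
        // Bounded queue: put() blocks the producer when consumers fall behind.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(8);
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(consumers);

        for (int i = 0; i < consumers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String msg = queue.take(); // consumers compete for each message
                        if (STOP.equals(msg)) {
                            return;
                        }
                        processed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (int i = 0; i < messages; i++) {
                queue.put("msg-" + i);
            }
            for (int i = 0; i < consumers; i++) {
                queue.put(STOP);
            }
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("consumers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("processed: " + run(100, 3));
    }
}
```

Each message is taken by exactly one consumer, so adding consumers increases throughput without changing the producer. In the architecture described above, the consumers would be long-running daemon processes in separate JVMs rather than threads in one pool.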

Mark also described the developer friendliness of the architecture:

[Terracotta] lets us make an entirely POJO system that runs beautifully in IntelliJ. A single "container" type main program can run all our components in a single JVM simply by loading all our various Spring configs. Developers can run the entire messaging system on their desktop, in their IDE, and run their code against it. They can post messages to an endpoint listening on 127.0.0.1 and debug their message code inside the messaging system itself.

The implementation gave them simplicity, scalability, and reliability at little cost. Mark also said that rolling his own minimalist framework allowed his team a great deal of freedom and flexibility in the implementation. They were able to avoid being locked into any heavyweight application server API dependencies and to maintain a very small footprint (the entire implementation is in a 100KB jar file).

The Terracotta FAQ says that they do not advocate replacing JMS with Terracotta, but Terracotta, Inc. CTO Ari Zilka liked what Mark had to say.

In related news, Jonas Boner had details about using Terracotta to cluster Scala actors to perform Fork/Join or Master/Worker processing.

Community comments

  • SPOF

    by Jason Carreira,

    Doesn't Terracotta leave you with a Single Point of Failure? If I recall correctly, they've always got the one machine that's serving to register nodes and send and receive updates, right? If that goes down, doesn't your cluster / grid go down? How does that provide reliability?

  • Re: SPOF

    by Geert Bevin,

    You can set up a passive server that is ready to take over in case the active server goes down. This switch is performed in hundreds of milliseconds. You can find more information about that here: www.terracotta.org/confluence/display/docs1/Con...

    This might also be interesting: blog.terracottatech.com/2007/07/fud_of_the_week...

    Hope this clarifies things.

  • Re: SPOF

    by Dmitriy Setrakyan,

    This actually confused me a bit.

    I get the fail-over part with active-passive deployment, but what about scalability? Can you have multiple active servers that work in sync in order to accommodate more load concurrently?

    (I could not find any docs in this regard)

    Best,
    Dmitriy Setrakyan
    GridGain - Grid Computing Made Simple

  • Re: SPOF

    by ARI ZILKA,

    This actually confused me a bit.

    I get the fail-over part with active-passive deployment, but what about scalability? Can you have multiple active servers that work in sync in order to accommodate more load concurrently?

    (I could not find any docs in this regard)

    Best,
    Dmitriy Setrakyan
    GridGain - Grid Computing Made Simple


    Take a look at our customers:
    www.terracottatech.com/customers.html

    As these customers and more know, TC can handle quite a lot of load. And, yes, it can be scaled out to multiple active instances. Thanks for asking.

    Anyways,

    Exactly what is the point of one vendor asking a pseudo-question of another. This type of thing is just silly.

    --Ari

  • Re: SPOF

    by Dmitriy Setrakyan,

    Relax Ari,

    I am not asking this question as a vendor, and showing a customer list does not give me an answer.

    This active/passive deployment really looks like an architecture hole in Terracotta, not a feature. It resembles a common database-like deployment when users get a very powerful and very expensive DB box to accommodate a huge load coming from a cluster of application servers. In fact you are asking users to buy at least 2 such boxes to account for a passive node.

    If you support multiple active servers, all the best to you. However the initial reply in this thread points to a blog that says exactly the opposite.

    Can you please point me to the Terracotta documentation where you describe how you support multiple active servers? So far it looks like you are really hiding such an important advantage of your product.

    Best,
    Dmitriy Setrakyan
    GridGain - Grid Computing Made Simple

  • Re: SPOF

    by Iurie Ignat,

    Doesn't Terracotta leave you with a Single Point of Failure? If I recall correctly, they've always got the one machine that's serving to register nodes and send and receive updates, right? If that goes down, doesn't your cluster / grid go down? How does that provide reliability?

  • Re: SPOF

    by ARI ZILKA,

    Relax Ari,

    I am not asking this question as a vendor, and showing a customer list does not give me an answer.


    You ask me to relax but you are throwing out inaccurate statements based purely on conjecture (and even setting key words in BOLD trying to point out "hole"). What is the goal of such things? To educate yourself? Seems more like FUD. Anyways, I will ignore that stuff and I will simply answer your questions about the technology.


    This active/passive deployment really looks like an architecture hole in Terracotta, not a feature. It resembles a common database-like deployment when users get a very powerful and very expensive DB box to accommodate a huge load coming from a cluster of application servers. In fact you are asking users to buy at least 2 such boxes to account for a passive node.


    I should have made myself more clear. These customers _all_ use Terracotta on small commodity hardware such as dual or quad core servers with between 2 and 8GB of RAM. Machines that range in price from $1k to $10K. So, this is _nothing_ like the database. Thanks for _sort of_ asking that question. It is an important point to make about TC and how we solve the problem of scaling out in an easier, lower cost way.


    If you support multiple active servers, all the best to you. However the initial reply in this thread points to a blog that says exactly the opposite.

    Can you please point me to the Terracotta documentation where you describe how you support multiple active servers? So far it looks like you are really hiding such an important advantage of your product.


    I just quickly re-read my blog. It doesn't say that we cannot run active / active. Actually, it says the opposite...And I quote from the last paragraph: "Also note that in many cases, it is possible to chop up your domain model so that it runs on more than 1 Terracotta Server."

    As for documentation of the various options for Active / Active I cannot point you to anything at the moment. We have customers using the feature, but it is not documented for the public yet. Active / Active is something we take very seriously and will roll out carefully over the next year. The point of listing all those customers was that none of them needs active / active and some of their use cases are VERY LARGE (such as PartyPoker--the largest online poker site in the world).

    Last, I am just finishing up Terracotta's first book. You should buy it as Chapter 12 covers Active / Active in detail. Here's the link:

    www.amazon.com/Definitive-Guide-Terracotta-Avai...

    --Ari

  • Re: SPOF

    by Jason Carreira,

    You can setup a passive server that is ready to take over in case the active server goes down. This switch is performed in hundreds of milliseconds. You can find more information about that here: www.terracotta.org/confluence/display/docs1/Con...

    This might also be interesting: blog.terracottatech.com/2007/07/fud_of_the_week...

    Hope this clarifies things.


    Only a little bit. What if the backup server goes down after it switches over?

    What happens to transactions that are going on against the primary server if it goes down? Does terracotta replication interact with transactions? Is it a transactional data store?

  • Re: SPOF

    by ARI ZILKA,

    Short answer: terracotta is not a SPoF.

    Jason...you are asking several good questions. The answers are:
    1. 2 node failure is a double failure. While many other architectures don't concern themselves with protecting for this scenario TC handles it in many ways. First, we can run N secondaries, not just 1. Second, the primary can be returned to the cluster and will become a secondary till the new primary fails. Terracotta fully protects the application from our server failures.

    2. The system is 2-phased. Nothing will be lost during a fail-over, regardless of when the failure occurs during or after the handshake between the JVMs and Terracotta.

    I suggest that if you are new to Terracotta you read more at www.terracotta.org/ as the original author of this piece had no intention of providing an exhaustive treatment of how Terracotta works. I could see someone easily needing more detail than is presented in this article and the .org site is the best place to go next.

    Also, do take a look at the customer list posted earlier. These customers represent household names. They have all tested fail-over and back again with Terracotta quite exhaustively and are in production today after validating proper behavior in many failure scenarios.

    --Ari

  • Re: SPOF

    by Nikita Ivanov,

    Ari,
    Talking about some sleep deprivation on book writing… :)

    “Relax” meant in the best way here, btw. The problem with Terracotta’s one active and many passives approach is that without further deep explanations this architecture looks like a joke. In today’s grid computing world having a system that supports only one box for all essential processing is simply ridiculous. That’s what comes out when you *casually* read about active/passive.

    So, Dmitriy asked a very polite question to get more information (as we know that there’s more in Terracotta). Now, in my understanding you, of course, “support” multiple active servers as long as locks don’t “cross” – essentially supporting multiple active servers for *multiple different* applications. That’s again a bit of a joke (sorry for repeating) – and I would love to know how my locking and data distribution can scale on 2-3 active servers for the *same* application where each active server’s picking up load distribution.

    I actually share the opinion, by the way, that in many cases you DO NOT need this level of scalability from Terracotta and powerful commodity box can support a lot. But question still remains. Things like Coherence and GigaSpaces, for example, do the similar job for data (data partitioning).

    Waiting for the book to come out to see how you organize it and package the material. We are thinking/being asked about doing the same.

    Best,
    Nikita Ivanov.
    GridGain – Grid Computing Made Simple

  • Scalability & High Availability w/ Terracotta Server

    by Mark Turansky,

    This has been a very interesting thread to read, and since I'm the author of the original article in question, I thought I should reply here. The more I wrote, the more I thought the reply would make a better full length blog article.

    Scalability & High Availability with Terracotta Server

    Overall, TC scales for what we need it to do. It's highly available for us. It fits our needs. I'm sure there are problems out there where GridGain is the better solution, but for our problem, TC was a great fit. Horses for courses.

    Mark

  • Re: Scalability & High Availability w/ Terracotta Server

    by Nikita Ivanov,

    Hi Mark,
    I’m in no way comparing Terracotta and GridGain. We are solving different problems in different ways. It’s just that Terracotta historically has been “challenged” by explaining their really cool technology or highlighting the best use cases... No wonder Ari had to create the whole genre of “Exposing the FUD” about terracotta on his blog.

    Anyways, I would still love to learn more about real multiple active support in Terracotta.

    Best,
    Nikita Ivanov.
    GridGain – Grid Computing Made Simple

  • Re: Scalability & High Availability w/ Terracotta Server

    by ARI ZILKA,

    This is not teaching or helping the InfoQ reader in any way shape or form.

    Also seems important for you to keep saying Terracotta is "broken" or has a "hole" or is otherwise "challenged."

    I am confident now that no matter what I say, you will find a way to add another back-handed compliment with a thinly veiled attempt to deposition our technology (I cringe even posting this response). This is classic FUD and I am sure everyone can smell it.

    You do yourself a disservice every time you post like this on other vendors threads. You also insult every reader's intelligence by attempting to represent yourself to be a bigger expert in our respective technologies than we, the vendors, are.

    I will just end here by thanking Mark for clarifying his level of scale and availability. Very interesting and well explained! Everyone should read it as Mark's clarity on how to scale, even across the WAN, is quite refreshing.

    'nuff said,

    --Ari

  • Re: Scalability & High Availability w/ Terracotta Server

    by Nikita Ivanov,

    Hm, ok. Grow some thicker skin man, you'll need it.

    I'll keep it short, Ari: can you actually answer the technical question that has been repeated 3 times in this thread? I actually genuinely would like to know the answer (b/c the technical problem there is far from trivial).

    Take it easy,
    Nikita Ivanov.
    GridGain - Grid Computing Made Simple

  • Re: Scalability & High Availability w/ Terracotta Server

    by Dmitriy Setrakyan,

    Hi Mark,

    Thanks for a very interesting blog post. I am glad that Terracotta works out for you.

    However, you said it yourself that "you don't care about scalability of the Terracotta server" as most of the load goes onto clients and the server is barely touched.

    My question on the other hand was about scalability and Ari was not able to provide any response other than dismissing the whole question as FUD ("great" way to handle a technical discussion, btw).

    At this point it's pretty clear to me that Terracotta DOES NOT scale beyond one Active server. This, in my view, is quite a substantial technical limitation. It may be great for scenarios like the one you described in your blog, but it can turn out to be pretty much a show stopper for systems that do require and care a great deal about High Scalability.

    Best,
    Dmitriy Setrakyan
    GridGain - Grid Computing Made Simple

  • Re: Scalability & High Availability w/ Terracotta Server

    by ARI ZILKA,

    At this point it's pretty clear to me that Terracotta DOES NOT scale beyond one Active server. This, in my view, is quite a substantial technical limitations. It may be great for scenarios like you described in your blog, but can turn out to be pretty much a show stopper for systems that do require and care a great deal about High Scalability.


    This is not correct. I am not interested in this debate, but I want to make sure the average reader sees the truth. How to scale an app with terracotta underneath:

    1. The first option, named Co-resident L1's, works today. Under this option, when using EHCache or a map, Terracotta can transparently split the data underneath the collection. For example, instead of just listing Active / Passive servers by IP address, such as 192.168.1.1, 192.168.1.2, 192.168.1.3, which would give us 3 TC servers in an active / passive / passive config, we merely list them with "#" separating them to mean split the data across all three: 192.168.1.1#192.168.1.2#192.168.1.3, 192.168.1.4#192.168.1.5#192.168.1.6 would give us 6 servers, 3 live at a time. The 3 live servers will split the data 1/3 each. This works on top of Terracotta 2.5, today.

    2. Partitioning / sharding. Even though Mark was improperly quoted above, he did point out that he plans to duplicate his entire cluster with stateless load balancing logic in front, thus giving linear scale. For example, he could take his message bus, duplicate it, and then mod (%) on message ID. All even messages go to one message bus instance and all odd messages go to the other bus instance. People have been doing clustering this way for a while. It works great for anything, not just Terracotta, as Mark stated. It is outside the product, though, so it is quite different from option #1.

    3. Active / Active V1.0 will come out this year. Unlike Co-resident L1's in option 1, it will not require EHCache or a map interface and will split all POJO data (even as fine-grained as field-by-field) onto a TC cluster.

    Note that in all modes, scaled out Terracotta servers is separate from highly available Terracotta. You can choose to split data for scale or make it highly available through replication amongst our servers or both.

    --Ari
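
The mod-based partitioning described in option 2 of the comment above can be illustrated with a short Java sketch. It is not taken from the thread or from Terracotta: the class `ModPartitionRouter` and its methods are hypothetical, and each "bus" is modeled as a plain in-memory queue of message IDs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ModPartitionRouter {

    // Each queue stands in for one independent message bus instance.
    private final List<BlockingQueue<Long>> buses = new ArrayList<>();

    public ModPartitionRouter(int busCount) {
        if (busCount < 1) {
            throw new IllegalArgumentException("busCount must be at least 1");
        }
        for (int i = 0; i < busCount; i++) {
            buses.add(new LinkedBlockingQueue<>());
        }
    }

    /** Stateless routing rule: message ID modulo the number of buses. */
    public int busIndexFor(long messageId) {
        // floorMod keeps the index non-negative even for negative IDs.
        return (int) Math.floorMod(messageId, (long) buses.size());
    }

    /** Places the message ID on the bus selected by busIndexFor. */
    public void route(long messageId) {
        buses.get(busIndexFor(messageId)).add(messageId);
    }

    /** Number of messages currently waiting on the given bus. */
    public int pending(int busIndex) {
        return buses.get(busIndex).size();
    }

    public static void main(String[] args) {
        ModPartitionRouter router = new ModPartitionRouter(2);
        for (long id = 0; id < 6; id++) {
            router.route(id);
        }
        // With two buses, even IDs land on bus 0 and odd IDs on bus 1.
        System.out.println(router.pending(0) + " / " + router.pending(1));
    }
}
```

Because the rule depends only on the message ID and the bus count, any load balancer instance can apply it without shared state. The trade-off is that changing the bus count remaps most IDs to a different bus.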

  • Re: Scalability & High Availability w/ Terracotta Server

    by Mark Turansky,

    I think you may have misread my blog article. I never said I don't care about scalability. I said:


    People are asking questions about scalability. Quite frankly, I’m not worried about it.

    Scalability is a function of architecture. If you get it right, you can scale easily with new hardware. We got it right.


    So I'm not worried about it. It's a solved problem.

    The fact of the matter is, my system works, it works great, and it's going to process a lot of data through for my company. Based on my real world testing, I know it'll scale to at least dozens more nodes.

    Please do not misquote me to further your argument.

    I probably don't need this disclaimer, but I'll put it out there anyway: I do not work for Terracotta. I have no dog in this fight. My responsibility lies with my company that puts food on my table and a roof over my head. I found the right solution for us. You might need something very different to satisfy your requirements. Terracotta solved ours extremely well.

  • Re: Scalability & High Availability w/ Terracotta Server

    by Dmitriy Setrakyan,

    Mark,

    I don't think I misquoted you (sorry if I did). You said it yourself that if messages are smaller, and the Terracotta Active server is bombarded, then CPU utilization goes pretty high.

    In such case, you can reach a maximum capacity of the only Active server you are limited to, and will need to add another, but you can't.

    The solution that Ari proposes only works in some cases, when you have disjoint data sets and can have one active server work on one data set, and another active server work on the other data set.

    Anyway, I think I understood what I needed to know.

    Best,
    Dmitriy Setrakyan
    GridGain - Grid Computing Made Simple

  • Re: Scalability & High Availability w/ Terracotta Server

    by Nikita Ivanov,

    Ari,
    Thanks for the final clarification. I guess p.3 is what I was looking for. Interesting to know you guys are finally working on it.

    Best,
    Nikita Ivanov.
    GridGain – Grid Computing Made Simple

    P.S.
    Piece of advice, Ari: exchanging views on the technology is what sites like InfoQ are all about. Saying that you are “not interested in this debate” just because someone has an opinion that doesn't match yours loses the argument for you right off the bat. In your 4th post you provided the technical answer that I think Dmitriy was looking for like 10 posts above.

  • Re: Scalability & High Availability w/ Terracotta Server

    by ARI ZILKA,

    Folks should also read this article over at TSS:

    www.theserverside.com/tt/knowledgecenter-tc/kno...

    I hear Eugene is planning massive amounts of data flow through this architecture when it goes to production.

    Cheers,

    --Ari

  • Re: SPOF

    by Jeryl Cook,

    Terracotta uses a master/worker pattern in which the "Terracotta Master Server" syncs the JVMs of all the workers. You can set up the master server to have one or more failovers, removing the single point of failure that you mentioned.

  • Re: Scalability & High Availability w/ Terracotta Server

    by Charlie O'Keefe,

    Wow, just came across this thread. The posts from GridGain leave a really bad taste.

    The conversation did drive out some issues that helped clarify some things about the technology, but the GridGain posts seem aimed somewhere well below the critical thinking areas of the reader's brain. They certainly don't appear to be writing for people with an understanding of computer science.

    And while I now understand a little bit more about TerraCotta's technology, I've learned absolutely nothing about any alternatives the GridGain guys are surely selling or about any differences in tradeoffs made by said alternatives. Nor have I been inspired to go find out.

    Cheers
