InfoQ

Scaling Web Applications using Cache Farms and Read Pools

Posted by Gavin Terrill on Nov 12, 2007

Sections: Development, Architecture & Design
Topics: Performance & Scalability, Architecture, Clustering & Caching
Tags: Deployment, Load Balancing, Memcache, Database

Michael Nygard, author and No Fluff Just Stuff speaker, recently wrote about two alternative approaches to improving web application performance and scalability: Cache Farms and Read Pools.

The idea behind Cache Farms is that the application nodes in a cluster share an external cache instead of each maintaining its own. This eliminates redundant cached copies and gives heap space back to the application server:

By moving the cache out of the app server process, you can access the same cache from multiple instances, reducing duplication. Getting those objects out of the heap, you can make the app server heap smaller, which will also reduce garbage collection pauses. If you make the cache distributed, as well as external, then you can reduce duplication even further.
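As a rough illustration of the shared external cache described above, the sketch below has two application nodes reading through one cache object. Everything here is hypothetical: `ExternalCache` is an in-memory stand-in for an out-of-process cache such as memcached, and `AppNode` and `load_from_db` are names invented for the example, not part of Nygard's post.

```python
from typing import Callable, Dict, List, Optional


class ExternalCache:
    """In-memory stand-in for an out-of-process cache (hypothetical)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        self._store[key] = value


class AppNode:
    """One application server instance; it keeps no cache of its own."""

    def __init__(self, cache: ExternalCache, load: Callable[[str], str]) -> None:
        self._cache = cache
        self._load = load

    def fetch(self, key: str) -> str:
        value = self._cache.get(key)
        if value is None:
            # Cache miss: load from the database, then publish to the shared cache.
            value = self._load(key)
            self._cache.set(key, value)
        return value


db_calls: List[str] = []


def load_from_db(key: str) -> str:
    """Pretend database lookup that records each call (hypothetical)."""
    db_calls.append(key)
    return f"row-for-{key}"


shared = ExternalCache()
node_a = AppNode(shared, load_from_db)
node_b = AppNode(shared, load_from_db)

first = node_a.fetch("user:42")   # miss: hits the database
second = node_b.fetch("user:42")  # hit: served from the shared cache
```

Because both nodes consult the same cache, the second lookup never reaches the database, and neither node holds its own copy of the object on its heap.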

Read Pools take advantage of the fact that most data-driven applications perform many more reads than writes. By directing reads to a dedicated set of read-only, replicated databases, you can relieve the load on the database that handles writes:

How do you create a read pool? Good news! It uses nothing more than built-in replication features of the database itself. Basically, you just configure the write master to ship its archive logs (or whatever your DB calls them) to the read pool databases.

Michael points out that, depending on which database you are using, updates may not reach the read hosts in real time, but notes that this might be a perfectly acceptable tradeoff. MySQL users can take advantage of Read/Write Splitting with MySQL-Proxy.
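The routing side of a read pool can be sketched as follows. This is a minimal illustration under stated assumptions, not how MySQL-Proxy implements read/write splitting: `ReadWriteRouter`, the host names, and the `fresh` flag are all hypothetical, and the check for a leading `SELECT` is deliberately simplistic (it would misroute statements such as `SELECT ... FOR UPDATE`).

```python
from itertools import cycle
from typing import Iterator, List


class ReadWriteRouter:
    """Sends writes to the master and spreads reads over the replicas (hypothetical)."""

    def __init__(self, master: str, replicas: List[str]) -> None:
        if not replicas:
            raise ValueError("at least one read replica is required")
        self._master = master
        # Round-robin iterator over the read pool.
        self._replicas: Iterator[str] = cycle(replicas)

    def route(self, sql: str, fresh: bool = False) -> str:
        """Return the name of the host that should run this statement.

        Pass fresh=True for reads that cannot tolerate replication lag;
        those go to the master like writes do.
        """
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and not fresh:
            return next(self._replicas)
        return self._master


router = ReadWriteRouter("master-db", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM orders"),
    router.route("SELECT * FROM users"),
    router.route("UPDATE users SET name = 'x' WHERE id = 1"),
    router.route("SELECT * FROM users WHERE id = 1", fresh=True),
    router.route("SELECT 1"),
]
```

The `fresh` flag is one way to handle the replication delay mentioned above: most reads accept slightly stale data from the pool, while the few that must see the latest write are sent to the master.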

Michael concludes:

The reflexive answer to scaling is, "Scale out at the web and app tiers, scale up in the data tier." I hope this shows that there are other avenues to improving performance and capacity.

  1. Coherence cache farms

    by Cameron Purdy

    The "cache farm" feature is a popular feature of Oracle Coherence. Benefits include:

    * Dynamic scale-out, i.e. easily adding and removing servers without interruption to the application (and without losing any data).

    * Configure any level of redundancy, including no redundancy.

    * By using dynamic partitioning, Coherence linearly scales out both cache capacity and throughput.

    * Read-through and read-coalescing for database access.

    * Write-through and write-behind for database updates, including write-coalescing.

    * Ability to layer caches, e.g. small on-heap caches layered on top of a large out-of-VM partitioned cache.

    Peace,

    Cameron Purdy
    Oracle Coherence: Clustered Caching for Java and .NET

  2. GigaSpaces recommended

    by Geva Perry

    Later in his post, Michael Nygard recommends GigaSpaces as a commercial cache farm implementation:

    On the commercial side, GigaSpaces provides distributed, external, clustered caching. It adapts to the "hot item" problem dynamically to keep a good distribution of traffic, and it can be configured to move cached items closer to the servers that use them, reducing network hops to the cache.

    And in another post he writes:

    What can I say about GigaSpaces? Anyone who's heard me speak knows that I adore tuple-spaces. GigaSpaces is a tuple-space in the same way that Tibco is a pub-sub messaging system. That is to say, the foundation is a tuple-space, but they've added high-level capabilities based on their core transport mechanism.

    So, they now have a distributed caching system. (They call it an "in-memory data grid". Um, OK.) There's a database gateway, so your front end can put a tuple into memory (fast) while a back-end process takes the tuple and writes it into the database.

    Just this week, they announced that their entire stack is free for startups. (Interesting twist: most companies offer the free stuff to open-source projects.)...

    I love the technology. I love the architecture.

    To check out our free offer to start-ups and individuals go here.

    Geva Perry
    GigaSpaces: The Scale-Out Application Platform
