RAM is the new disk...
Tim Bray, in his discussions about grid computing before it became such a hot topic, pointed out how advances in hardware around RAM and networking were allowing for the creation of RAM clusters that were faster than disk clusters.
[M]emory is several orders of magnitude faster than disk for random access to data (even the highest-end disk storage subsystems struggle to reach 1,000 seeks/second). Second, with data-center networks getting faster, it’s not only cheaper to access memory than disk, it’s cheaper to access another computer’s memory through the network. As I write, Sun’s Infiniband product line includes a switch with 9 fully-interconnected non-blocking ports each running at 30Gbit/sec; yow! The Voltaire product pictured above has even more ports; the mind boggles. (If you want the absolute last word on this kind of ultra-high-performance networking, check out Andreas Bechtolsheim’s Stanford lecture.)
Tim also pointed out the truth of the second part of Gray's statement: "For random access, disks are irritatingly slow; but if you pretend that a disk is a tape drive, it can soak up sequential data at an astounding rate; it’s a natural for logging and journaling a primarily-in-RAM application."
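To make the "logging and journaling a primarily-in-RAM application" idea concrete, here is a minimal sketch of a key-value store that serves every read from memory and touches the disk only by appending to a log. The class name `JournaledStore`, the JSON record layout and the file name are assumptions made for illustration; none of them come from Bray's post or from any particular product.

```python
import json
import os
import tempfile


class JournaledStore:
    """Hypothetical example: a dict in RAM, with every write appended to a log."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._replay()
        # Append mode: writes only ever go to the end of the file,
        # so the disk is used sequentially, like a tape.
        self.log = open(path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild the in-memory state by reading the log front to back.
        if not os.path.exists(self.path):
            return
        with open(self.path, "r", encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if record["op"] == "put":
                    self.data[record["key"]] = record["value"]
                elif record["op"] == "del":
                    self.data.pop(record["key"], None)

    def _append(self, record):
        self.log.write(json.dumps(record) + "\n")
        # flush() hands the data to the OS; a real system would also
        # decide when to call os.fsync() for durability.
        self.log.flush()

    def put(self, key, value):
        self._append({"op": "put", "key": key, "value": value})
        self.data[key] = value

    def delete(self, key):
        self._append({"op": "del", "key": key})
        self.data.pop(key, None)

    def get(self, key, default=None):
        # Random-access reads never touch the disk.
        return self.data.get(key, default)

    def close(self):
        self.log.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "journal.log")
    store = JournaledStore(path)
    store.put("a", 1)
    store.put("b", 2)
    store.delete("a")
    store.close()

    # Simulate a restart: the state comes back from one sequential read.
    recovered = JournaledStore(path)
    result = (recovered.get("a"), recovered.get("b"))
    recovered.close()
    return result
```

The sketch leaves out log compaction and snapshots, which a long-running system would need to keep replay time bounded.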
Now flash forward two years and we find that the trend in hardware advances has continued for RAM and network and stayed slow for disk. Bill McColl talked about massive memory systems becoming available for parallel computing:
Memory is the new disk! With disk speeds growing very slowly and memory chip capacities growing exponentially, in-memory software architectures offer the prospect of orders-of-magnitude improvements in the performance of all kinds of data-intensive applications. Small (1U, 2U) rack-mounted servers with a terabyte or more of memory will be available soon, and will change how we think about the balance between memory and disk in server architectures. Disk will become the new tape, and will be used in the same way, as a sequential storage medium (streaming from disk is reasonably fast) rather than as a random-access medium (very slow). Tons of opportunities there to develop new products that can offer 10x-100x performance improvements over the existing ones.
Dare Obasanjo pointed out how not paying attention to the mantra can have detrimental effects, a la Twitter's issues. Commenting on Twitter's content management-like implementation, Obasanjo said "The problem is that if you naively implement a design that simply reflects the problem statement then you will be in disk I/O hell. It won't matter if you are using Ruby on Rails, Cobol on Cogs, C++ or hand coded assembly, the read and write load will kill you." In other words, push the random-access operations into RAM and only use disk for sequential operations.
Tom White, a committer on Hadoop Core and a member of the Hadoop Project Management Committee, went into more detail on the "disk is the new tape" part of Gray's quote. In discussing the MapReduce programming model, White pointed out why disk is still viable as an application data storage medium for tools like Hadoop:
In essence MapReduce works by repeatedly sorting and merging data that is streamed to and from disk at the transfer rate of the disk. Contrast this to accessing data from a relational database that operates at the seek rate of the disk (seeking is the process of moving the disk's head to a particular place on the disk to read or write data). So why is this interesting? Well, look at the trends in seek time and transfer rate. Seek time has grown at about 5% a year, whereas transfer rate at about 20%. Seek time is growing more slowly than transfer rate - so it pays to use a model that operates at the transfer rate. Which is what MapReduce does.
While it remains to be seen if Solid State Drives (SSD) will change the seek/transfer ratios, many commenters to White's discussion thought that they may be a leveling factor in the RAM/hard drive debate.
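White's seek-rate versus transfer-rate argument can be checked with some back-of-the-envelope arithmetic. The sketch below reads his 5% and 20% figures as yearly improvement rates, which is an interpretation, and the 10 ms seek time, 100 MB/s transfer rate, 1 TB dataset and 100-byte record size are illustrative assumptions, not measurements from the article.

```python
def random_access_seconds(num_records, seek_seconds):
    """Time to touch records one at a time, paying one seek per record."""
    return num_records * seek_seconds


def streaming_seconds(total_bytes, bytes_per_second):
    """Time to read a dataset front to back at the disk's transfer rate."""
    return total_bytes / bytes_per_second


def relative_gap(years, seek_rate=0.05, transfer_rate=0.20):
    """How much further transfer rate pulls ahead of seek rate after `years`."""
    return ((1 + transfer_rate) / (1 + seek_rate)) ** years


# Assumed workload: update 1% of the 100-byte records in a 1 TB dataset.
DATASET_BYTES = 10**12
RECORD_BYTES = 100
records_to_update = (DATASET_BYTES // RECORD_BYTES) // 100  # 10**8 records

seek_based = random_access_seconds(records_to_update, seek_seconds=0.010)
stream_based = streaming_seconds(DATASET_BYTES, bytes_per_second=100 * 10**6)

print(f"seek per record : {seek_based / 86400:.1f} days")
print(f"stream all data : {stream_based / 3600:.1f} hours")
print(f"gap after 10 yrs: {relative_gap(10):.1f}x wider")
```

Under these assumptions, seeking to each record takes on the order of days, while streaming the whole dataset takes hours, and the yearly rates compound so that the gap grows to roughly 3.8 times its starting size within a decade.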
Nati Shalom gave a well-reasoned discussion on how memory and disk play into database deployment and usage for MySQL. Shalom highlighted the limitations of database clustering and database partitioning as means to provide performance and scale, saying "The fundamental problems with both database replication and database partitioning are the reliance on the performance of the file system/disk and the complexity involved in setting up database clusters." His proposed solution was to go with an In-Memory Data Grid (IMDG), backed by technologies like Hibernate 2nd level cache or GigaSpaces Spring DAO, to provide Persistence as a Service for your applications. Shalom explained IMDGs, saying they
provide object-based database capabilities in memory, and support core database functionality, such as advanced indexing and querying, transactional semantics and locking. IMDGs also abstract data topology from application code. With this approach, the database is not completely eliminated, but put it in the *right* place.
The primary benefits of an IMDG over direct RDBMS interaction listed were:
- Relies on memory, which is significantly faster and more concurrent than file systems
- Data can be accessed by reference
- Data manipulation is performed directly on the in-memory objects
- Reduced contention for data elements
- Parallel aggregated queries
- In-process local cache
- Avoid Object-Relational Mapping (ORM)
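Two of the listed benefits, access by reference and an in-process local cache, can be sketched in a few lines. The classes `BackingStore` and `LocalCache` below are hypothetical stand-ins written for this illustration; they are not the GigaSpaces or Hibernate APIs, and a real IMDG adds distribution, eviction and transactional behaviour on top.

```python
import copy


class BackingStore:
    """Stand-in for a database: every read hands back a fresh copy."""

    def __init__(self, rows):
        self._rows = rows
        self.reads = 0

    def load(self, key):
        self.reads += 1
        return copy.deepcopy(self._rows[key])


class LocalCache:
    """In-process cache: repeated lookups return the same live object."""

    def __init__(self, store):
        self._store = store
        self._objects = {}

    def get(self, key):
        # Go to the backing store only on the first request for a key.
        if key not in self._objects:
            self._objects[key] = self._store.load(key)
        return self._objects[key]


def demo():
    store = BackingStore({"order-1": {"total": 10}})
    cache = LocalCache(store)

    first = cache.get("order-1")
    first["total"] += 5  # manipulate the in-memory object directly
    second = cache.get("order-1")

    # Same object, updated value, and only one trip to the backing store.
    return first is second, second["total"], store.reads
```

Because both lookups return the same object, the update is visible without a second load or any mapping step, which is the behaviour the list describes as accessing data by reference.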
Missing the point
While it remains to be seen if Solid State Drives (SSD) will change the seek/transfer ratios, many commenters to White's discussion thought that they may be a leveling factor in the RAM/hard drive debate.
While RAM is faster than a hard drive, it's not the performance that makes the difference. The hard drive concept is "slow" because it's a shared storage model, and the RAM is "fast" because there's some of it co-located with every CPU. If the hard drives were local then the scalability would be roughly identical, and the scalability is orders of magnitude more important than the raw single-threaded latency in a large system.
Oracle Coherence: Data Grid for Java, .NET and C++
What if the disk were RAM-based?
IMO, it's not just the speed of memory compared to disks that makes a difference. It's not even the extra benefit of the collocation of CPU and memory. What's really important is the fact that disk is a sequential storage medium that was designed primarily to store a stream of bytes, not tables of data.
See my recent post on that matter for more details.
The "services" aspect
The cyclic COSS filesystem in Squid is a good choice when you *must* go to disk.
data by reference
However, you say in the advantages of IMDG over RDBMS that:
"Data can be accessed by reference"
but I've never worked on a project where this is possible. In n-tier applications (e.g. Java server-based HTTP), you always have to look up objects by some kind of ID because of the request/response mechanism.
Recently I've been working with Flex and Java using BlazeDS, in which case the object I'm manipulating in the client is serialised over the wire (Java to ActionScript to Java). Thus the object that gets passed to my invoked methods does not have the same reference and I have to do a lookup by ID anyway. (Some Adobe fan might point out that LiveCycle DataServices can actually handle this, but what is it doing under the covers? I don't know for sure, but I imagine it's passing IDs around.)
In both cases this has to happen regardless of whether the storage is an OODBMS, RDBMS, some kind of fancy caching or just stored in collections.
So I guess my question actually is, can you give an example of a scenario in which data being accessed by reference is an advantage?