Posted by Srini Penchikala on Dec 23, 2011
Riak is a key-value based NoSQL database that can be used to store user session related data. Andy Gross from Basho Technologies recently spoke at QCon SF 2011 Conference about Riak use cases. InfoQ spoke with Andy and Mark Phillips (Community Manager) about Riak database features and best practices when using Riak.
InfoQ: What are the primary use cases for using a Riak database compared to a relational database as well as compared to other NoSQL databases?
Basho Team: Riak is purpose-built for soft, real-time systems where availability is a priority. Use cases include (but are not limited to):
We tend to see people switching to Riak from MySQL and Oracle when they a) were forcing a key/value data model into them, b) need to reduce costs, c) need to get away from a fragile scale-out model based on sharding, or d) all of the above.
As far as other NoSQL DBs, Voldemort is the closest thing to Riak in terms of functionality. Cassandra is somewhat similar, but it's more suited for applications that don't require the flexible consistency that Riak offers. We see a lot of people switching from MongoDB to Riak to reduce operational complexity. What we find is that a lot of companies launch on MongoDB or Redis but will switch to Riak when costs of operating that system at scale become prohibitive. (Keep in mind that there is typically some reworking of app design and data model that needs to be done to facilitate this, but this is well worth it in the long run for Riak's stability.)
InfoQ: What type of data persistence and data management patterns does Riak support?
Basho: We place a lot of importance on persistence and predictability. To that end, we support pluggable backends that are suitable for different use cases.
There are also several other backends that ship with Riak, and some people have written custom backends for their use cases (we strive to make the API easy to work with). You can also use more than one backend in the same cluster; e.g., Bitcask for sessions and LevelDB for data that is indexed.
Most important to remember is that Riak will remain performant even with datasets that are larger than RAM.
InfoQ: Can you talk about some limitations of Riak and the use cases for which it's not the best solution?
Basho: Since Riak is a key/value store at its core, applications that require ad-hoc queries and/or heavy analytic processing tend to be a poor fit and can be difficult to implement on top of it. Our main focus is predictability and scale, and there are some tradeoffs that have to be made with data model and queryability to stay faithful to this focus.
That said, we plan to enhance Riak in various capacities to address these use cases in 2012. Riak already exposes deeper query possibilities via our MapReduce, Secondary Indexing, and Search components, and we'll continue to make these more robust in future releases.
InfoQ: What are some best practices and gotchas that the application architects and developers should keep in mind when working on applications that access the data stored in Riak databases?
Basho: Riak runs reliably on both bare metal and cloud environments. Most clouds have relatively little I/O capacity compared to bare metal, so capacity planning should be stressed when deploying on something like AWS or Rackspace.
Modeling applications as keys and values can be difficult for architects who are used to the relational model. Spend a lot of time thinking about your data model and access patterns. Riak may not be a fit (and we're not afraid to tell you it's not). But if it is (which tends to be around 70% of use cases in our experience), you'll be delighted with what it offers. (One of our users once stated that "Just about every application at scale becomes a key/value store.")
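To make the key/value modeling advice concrete, here is a minimal sketch of storing user session data by key. It does not use the Riak client: `KeyValueStore` is a hypothetical in-memory stand-in for a bucket, and the `session:<id>` key scheme and the session fields are assumptions chosen for illustration.

```python
import json


class KeyValueStore:
    """Hypothetical in-memory stand-in for a key/value bucket (not the Riak API)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store the value as an opaque serialized blob, as a key/value store would.
        self._data[key] = json.dumps(value)

    def get(self, key):
        # Return the deserialized value, or None when the key is absent.
        raw = self._data.get(key)
        return None if raw is None else json.loads(raw)


def session_key(session_id):
    # The key encodes the access pattern: sessions are always fetched by id.
    return f"session:{session_id}"


store = KeyValueStore()
store.put(
    session_key("abc123"),
    {"user_id": 42, "cart": ["sku-1", "sku-2"]},
)

session = store.get(session_key("abc123"))
print(session["user_id"], session["cart"])
```

The point of the sketch is that everything the application needs for one request lives under one key, so no join or ad-hoc query is required to serve it.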
MapReduce is a powerful tool in Riak, but it's not meant to be run on all your data at the same time. MapReduce in Riak is meant to run over small key ranges and should be used to serve data for real time requests.
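The idea of running MapReduce over a small, explicit set of keys rather than the whole dataset can be sketched as follows. This is plain Python over an in-memory dictionary, not Riak's MapReduce interface; the `orders` data, the key names, and the `map_reduce` helper are all hypothetical.

```python
# Hypothetical in-memory stand-in for a bucket of order records.
orders = {
    "order:1001": {"user": "alice", "total": 30},
    "order:1002": {"user": "bob", "total": 15},
    "order:1003": {"user": "alice", "total": 20},
}


def map_phase(key, value):
    # Emit one number per object: the order total.
    return [value["total"]]


def reduce_phase(values):
    # Collapse all mapped values into a single sum.
    return [sum(values)]


def map_reduce(bucket, keys, map_fn, reduce_fn):
    """Run map_fn over the listed keys only, then reduce the results."""
    mapped = []
    for key in keys:
        value = bucket.get(key)
        if value is not None:  # skip keys that do not exist
            mapped.extend(map_fn(key, value))
    return reduce_fn(mapped)


# Only the two keys known to belong to this user are read,
# so the cost is bounded by the input list, not by the dataset size.
alice_keys = ["order:1001", "order:1003"]
result = map_reduce(orders, alice_keys, map_phase, reduce_phase)
print(result)
```

Because the job reads only the keys it is given, its cost stays proportional to the size of that list, which is what makes it plausible to run in the path of a real-time request.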
InfoQ: What is the current tool support for data modeling and application development using the Riak database?
Basho: Aside from there being client support for virtually every major programming language, there are also various ORM libraries and frameworks for languages like Ruby, Python, PHP and Node.js. And there are various open source tools that let you inspect and tweak the data stored in Riak via a GUI. We'll be releasing more code to make this even easier in future releases of Riak.
InfoQ: What is the future road map of Riak database in terms of new features?
Basho: Moving forward you'll see a lot of work from Basho focusing on usability, core stability, and better support for globally distributed data storage. We will also continue to expand the query-ability and flexibility available to developers who are using Riak. Speed is also high on our list of priorities. We know Riak isn't the fastest database available, but it won't be that way forever.
Also, riak_core, the framework that powers Riak's distributed capabilities, and riak_pipe, the framework that powers Riak's MapReduce, will continue to be developed and made more extensible.
Srini Penchikala currently works as a Security Architect and has 17 years of experience in software product management.