Cloud Foundry: Design and Architecture
Derek Collison discusses the goals, the design premises and patterns employed in creating the architecture of Cloud Foundry, VMware’s open source PaaS, unveiling internal architectural details.
Posted by Jean-Jacques Dubray on Dec 01, 2010
One of the key achievements of the past decade has been both the wide availability of scalable systems for the masses, especially cloud-based systems, and the super-scalability of some Web applications. Facebook, for instance, serves 13 million requests per second on average, with peaks at 450 M/s. Even with such results, the concepts and architectures behind scalable systems are still rapidly evolving. About three years ago, Ricky Ho, a software architect based in California, wrote a blog post detailing the state of the art of scalable systems. Three years later, he felt that it was time to revisit the question.
Scalability is about reducing the adverse impact due to growth on performance, cost, maintainability and many other aspects
In his latest post, he lists patterns:
While Load Balancing, Result Cache and Map Reduce have been around for a while, some patterns target new problems introduced by social media. For instance, the Bulk Synchronous Parallel model, which was invented in the 1980s, is now used as part of Google's Pregel graph processing project, which supports three general processing patterns:
- Capture (e.g. When John is connected to Peter in a social network, a link is created between two Person nodes)
- Query (e.g. Find out all of John's friends of friends who are under 30 and married)
- Mining (e.g. Find out the most influential person in Silicon Valley)
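The Bulk Synchronous Parallel idea behind these patterns can be illustrated with a minimal single-process sketch. It is not Pregel's API: the function name `bsp_hop_distances`, the adjacency-dict graph format and the example names are assumptions made for illustration. Computation proceeds in supersteps; in each one, every vertex that received a message updates its own state and sends messages to its neighbors, and those messages only become visible after a barrier.

```python
def bsp_hop_distances(graph, source):
    """Compute hop distances from `source` using BSP-style supersteps.

    graph: dict mapping each vertex to a list of its neighbors
    (a hypothetical format chosen for this sketch).
    """
    distance = {source: 0}
    # Messages sent during the current superstep: vertex -> proposed distance.
    outbox = {neighbor: 1 for neighbor in graph[source]}

    while outbox:  # the computation halts when no messages are in flight
        # Barrier: messages sent in the previous superstep are delivered now.
        inbox, outbox = outbox, {}
        for vertex, proposed in inbox.items():
            if vertex in distance:
                continue  # already settled; this vertex stays halted
            distance[vertex] = proposed
            # Tell every neighbor it is one hop further away.
            for neighbor in graph[vertex]:
                outbox[neighbor] = proposed + 1
    return distance
```

With such a function, a query like "friends of friends" amounts to selecting the vertices whose distance is exactly 2. A real system would partition the vertices across many machines and exchange the messages over the network between barriers.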
Ricky also introduces the Execution Orchestrator pattern:
This model is based on an intelligent scheduler / orchestrator to schedule ready-to-run tasks (based on a dependency graph) across a cluster of dumb workers.
He reports that this pattern is used as part of the Microsoft Dryad project, which allows programmers "to use thousands of machines without knowing anything about concurrent programming".
A Dryad programmer writes several sequential programs and connects them using one-way channels. The computation is structured as a directed graph: programs are graph vertices, while the channels are graph edges. A Dryad job is a graph generator which can synthesize any directed acyclic graph. These graphs can even change during execution, in response to important events in the computation.
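A rough sketch of the orchestrator idea, using only the Python standard library (Python 3.9 or later for `graphlib`), might look like the following. It is not Dryad's API: `run_dag`, the `tasks` and `dependencies` dictionaries and the thread pool standing in for a cluster of workers are all assumptions made for illustration. The scheduler holds the dependency graph and hands a task to a worker only once all of its inputs are available.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_dag(tasks, dependencies, max_workers=4):
    """Run every task as soon as all of its dependencies have finished.

    tasks: name -> callable taking a dict {dependency name: its result}
    dependencies: name -> list of task names it depends on
    Both structures are hypothetical formats chosen for this sketch.
    """
    sorter = TopologicalSorter(
        {name: dependencies.get(name, ()) for name in tasks}
    )
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

    results = {}
    in_flight = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Dispatch every task whose dependencies are all satisfied.
            for name in sorter.get_ready():
                inputs = {dep: results[dep] for dep in dependencies.get(name, ())}
                in_flight[pool.submit(tasks[name], inputs)] = name
            # Block until at least one worker reports back.
            done, _ = wait(list(in_flight), return_when=FIRST_COMPLETED)
            for future in done:
                name = in_flight.pop(future)
                results[name] = future.result()
                sorter.done(name)  # may unlock downstream tasks
    return results
```

The workers stay "dumb" in the sense that they only execute the callable they are given; all knowledge of ordering lives in the scheduler. Independent tasks (for example two tasks that both depend only on the same upstream result) are submitted together and may run concurrently.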
The kind of scalability that we routinely achieve today was unthinkable just 10 years ago. Where are the next limits? What is your experience in building scalable systems? What is missing?
Over the last few years, we at Migratory Data Systems (migratory.ro) have implemented a number of patterns and strategies to create an extremely scalable Comet server used to build real-time web applications.
For a Comet server, scalability is mainly defined by the number of concurrent users receiving real-time data and by the volume of data published in real time.
We achieved real-time data publication to up to 1,000,000 concurrent users and scaled up to 1 Gbps from a single instance of Migratory Push Server running on a small server, while keeping the data latency (the delta between the time the data is created on the server side and the time the data is received by the user) very low (milliseconds). Here you can see detailed benchmark results:
migratory.ro/data/MigratoryPushServerBenchmarks...
This is significant progress compared to a traditional web server, which cannot handle more than a few thousand concurrent users on a small server when streaming real-time data.
What are the next limits? It depends on the use case. If you distribute large data to many users, you will reach the 1 Gbps limit even though the data overhead introduced by Migratory Push Server is very low (~20 bytes). If you run Migratory Push Server on a small server and distribute data to more than 1 million users, you will eventually reach the memory limit. If you publish data at a rate of 1 million messages per second with Migratory Push Server from a small server, you will eventually reach the CPU limit.
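The bandwidth limit can be estimated with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: the function name, the 105-byte payload and the rate of one message per second per user are hypothetical values, and TCP/IP header overhead is ignored. Only the 1 Gbps link and the ~20 bytes of per-message overhead come from the figures above.

```python
def bandwidth_limited_users(link_bps, payload_bytes, overhead_bytes, messages_per_second):
    """Return how many users a link can serve before bandwidth runs out.

    Each user receives `messages_per_second` messages, each carrying
    `payload_bytes` of data plus `overhead_bytes` of framing overhead.
    """
    bits_per_user_per_second = (payload_bytes + overhead_bytes) * 8 * messages_per_second
    return link_bps // bits_per_user_per_second


# Hypothetical workload: 105-byte messages, one per second per user,
# with ~20 bytes of overhead, over a 1 Gbps link.
print(bandwidth_limited_users(1_000_000_000, 105, 20, 1))  # prints 1000000
```

Doubling the payload or the message rate roughly halves the number of users the same link can serve, which is why large payloads hit the bandwidth limit long before the overhead matters.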
To scale further, another important feature is clustering. We've implemented clustering so that multiple instances of Migratory Push Server installed on multiple machines act as a single push server, offering more scalability while also adding load balancing and fault tolerance to the system.