Cloud Foundry: Design and Architecture
Derek Collison discusses the goals, the design premises and patterns employed in creating the architecture of Cloud Foundry, VMware’s open source PaaS, unveiling internal architectural details.
Posted by Jean-Jacques Dubray on Dec 01, 2011
HPCC (High Performance Computing Cluster) is an open source, massively parallel computing platform for solving Big Data problems. This week, the project announced that its Data Delivery Engine is now available on EC2. Its architecture is composed of a Data Refinery Cluster (Thor) and a Query Cluster (Roxie):
The HPCC system architecture incorporates the Thor and Roxie clusters as well as common middleware components, an external communications layer, client interfaces which provide both end-user services and system management tools, and auxiliary components to support monitoring and to facilitate loading and storing of filesystem data from external sources.

Thor is responsible for consuming vast amounts of data, transforming, linking and indexing that data. It functions as a distributed file system with parallel processing power spread across the nodes. A cluster can scale from a single node to thousands of nodes. Roxie (the Query Cluster) provides separate high-performance online query processing and data warehouse capabilities.
Both Thor and Roxie are based on a parallel-processing programming language, ECL (Enterprise Control Language), which is optimized for extraction, transformation, loading, sorting, indexing and linking. It is "implicitly parallel", non-procedural and dataflow-oriented. It combines data representation and algorithm implementation and can easily be extended with C++ libraries.
HPCC also provides an ESP (Enterprise Services Platform), which exposes XML, HTTP, SOAP and REST interfaces to ECL queries. The access model is based on SAML.
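To make "implicitly parallel" and "dataflow-oriented" more concrete, here is a minimal Python sketch of the idea. It is not ECL and does not use any HPCC API. The names `partition`, `run_dataflow`, `build_index` and `clean` are hypothetical, and a thread pool stands in for cluster nodes. The caller only declares what should happen to each record and how the result is ordered, while the small runtime decides how the work is split.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def partition(records, n_parts):
    """Split records round-robin into n_parts lists (a stand-in for nodes)."""
    return [records[i::n_parts] for i in range(n_parts)]


def run_dataflow(records, transform, key, n_parts=4):
    """Transform every record and return all records sorted by key.

    The caller never mentions threads or partitions; the split, the
    per-partition sort and the final merge are decided here.
    """
    def work(part):
        # Each worker transforms and sorts only its own partition.
        return sorted((transform(r) for r in part), key=key)

    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        sorted_parts = list(pool.map(work, partition(records, n_parts)))

    # Merge the already sorted partitions into one sorted result.
    return list(merge(*sorted_parts, key=key))


def build_index(rows, key):
    """Map each key value to the positions of the rows that carry it."""
    index = {}
    for pos, row in enumerate(rows):
        index.setdefault(key(row), []).append(pos)
    return index


def clean(record):
    """Example transform: normalise the name field."""
    return {"name": record["name"].strip().title(), "age": record["age"]}
```

A call such as `run_dataflow(people, clean, key=lambda r: r["name"])` reads the same whether it runs on one partition or many, which is the property the "implicitly parallel" label points at.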
HPCC sees several key differentiators with Hadoop:
Big Data solutions continue to evolve at a rapid pace, fuelled by their advances, their scalability and ultimately the desire to process and query very large amounts of data. Even NPR is talking about Big Data! Have you participated in a Big Data project? What is your take on it? Where do you see Big Data going from here?
Andrew Watson talks about the work of the OMG, where CORBA is alive and well (hint: in your car), UML and UML Profiles vs. custom Modeling languages, DDS and other middleware, and much more.
Sohil Shah discusses creating iPhone and Android enterprise mobile applications based on cloud services using the open source platform OpenMobster.
Paul Sanford presents the transformations supported by data throughout its life cycle, and how that can be better done with Splunk, an engine for monitoring and analyzing machine-generated data.
A common “best practice” for unit tests is to write only one assertion in each test. I intend to question this advice by showing that multiple assertions per test are both necessary and beneficial.
John Rauser presents the architectural and technological evolution of Amazon retail websites starting with 1994 and ending with adopting Amazon Web Services.
Michael Stal discusses system architecture quality, how to avoid architectural erosion, how to deal with refactoring, and design principles for architecture evolution.
Every developer has had to integrate with another system, API or component. This article provides strategies for handling the change and for separating system boundaries.