Posted by Jonathan Allen on Apr 25, 2011
Since we last reported on TPL Dataflow, more information has come to light. First and foremost, the goals of the project have been clarified. At a high level, the primary purpose is to support high-performance producer/consumer scenarios using asynchronous programming techniques. These can be used alone or in conjunction with the task, query, and loop-based parallelism offered by the TPL libraries.
Just as important is what this is not. According to Stephen Toub, this is not a direct replacement for full actor/agent style languages and libraries such as Erlang. TPL Dataflow is just a set of building blocks; developers still have to design the infrastructure around it. (There is a separate research project called Axum that is attempting to fill that role.)
At first glance TPL Dataflow may seem to overlap with Reactive Extensions. While they both involve moving data around, Reactive Extensions focuses on the ability to write complicated push-based data streams in a very succinct fashion. TPL Dataflow is more about providing the fundamental building blocks for actors and agents, with an emphasis on controlling aspects such as where to do buffering and when to block producers.
These libraries are meant to be complementary, with the April CTP of TPL Dataflow offering direct integration that adds “the ability to expose dataflow sources as observables and dataflow targets as observers.” The intention is that developers can seamlessly move messages back and forth between dataflow and reactive code as necessary. It should be noted that it is possible to do everything in one or the other, but each offers some useful capabilities that would otherwise have to be built up from scratch.
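As a rough illustration of that bridge, the sketch below exposes a dataflow source as an observable and a dataflow target as an observer, then connects the two. It is written against the API as it later shipped in the System.Threading.Tasks.Dataflow package, not the April CTP itself, so member names in the CTP may differ. The block names (source, target) and the posted values are made up for the example.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class ObservableBridgeSketch
{
    static async Task Main()
    {
        // A dataflow source: buffers whatever is posted to it.
        var source = new BufferBlock<int>();

        // A dataflow target: handles each message it receives.
        var target = new ActionBlock<int>(n => Console.WriteLine("received " + n));

        // Expose the source as an observable and the target as an observer.
        IObservable<int> observable = source.AsObservable();
        IObserver<int> observer = target.AsObserver();

        // Plain IObservable<T>.Subscribe, so no Rx package is needed here.
        using (observable.Subscribe(observer))
        {
            source.Post(1);
            source.Post(2);

            // Completing the source is relayed to the observer as OnCompleted,
            // which in turn completes the target block.
            source.Complete();
            await target.Completion;
        }
    }
}
```

With Reactive Extensions referenced, the same `observable` could instead be fed through Rx query operators before reaching a dataflow target.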
Stephen also mentioned that much of TPL Dataflow has an analogous framework on the native side called the Asynchronous Agents Library. Aside from slightly different naming conventions, TPL Dataflow tends to offer a richer API than its unmanaged counterpart. For example, there is built-in support for telling a block that it won’t be receiving any more data and that it can shut itself down. TPL Dataflow also has the advantage that the C# and VB languages are being modified to better support it, something that isn’t feasible for C++.
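A minimal sketch of two of the ideas mentioned so far, deciding when producers must wait and telling a block that no more data is coming, might look like the following. It assumes the API as later released in the System.Threading.Tasks.Dataflow package; the CTP discussed here may use different member names. The capacity of 2, the delay, and the message count are arbitrary values chosen for illustration.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class BoundedConsumerSketch
{
    static async Task Main()
    {
        // A consumer that holds at most two pending messages.
        var consumer = new ActionBlock<int>(
            async n =>
            {
                await Task.Delay(10);               // simulate slow processing
                Console.WriteLine("processed " + n);
            },
            new ExecutionDataflowBlockOptions { BoundedCapacity = 2 });

        // Producer: SendAsync waits asynchronously whenever the block is full,
        // so the producer is throttled without blocking a thread.
        for (int i = 0; i < 5; i++)
        {
            await consumer.SendAsync(i);
        }

        // Tell the block it will not receive any more data ...
        consumer.Complete();

        // ... and wait until it has drained its queue and shut down.
        await consumer.Completion;
    }
}
```

Leaving `BoundedCapacity` at its default makes the block's input queue unbounded, in which case the producer is never made to wait.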
Due to customer feedback, a major emphasis with TPL Dataflow is reducing the number of object allocations needed for processing messages. While allocating memory is very cheap in .NET, creating too many objects can result in a significant garbage collection cost down the road. Some strategies, such as reusing active tasks, have been supported all along. The newest CTP adds further enhancements, such as replacing the DataflowMessage<T> class with the DataflowMessageHeader struct. Another improvement is making the cloning function of WriteOnceBlock<T> and BroadcastBlock<T> optional, allowing more efficient use of immutable messages.
TPL Dataflow can be downloaded as part of the Visual Studio Async CTP. There is no timeline for its release, but the heavy reliance on the new syntax from VB 11 and C# 5 implies that it will ship when they do.