Cloud Foundry: Design and Architecture
Derek Collison discusses the goals, the design premises and patterns employed in creating the architecture of Cloud Foundry, VMware’s open source PaaS, unveiling internal architectural details.
Posted by David West on Mar 11, 2009
A few years ago, Eric Lippert noted that optimized and unoptimized builds of the same source code could differ in their potential for deadlock. That problem was "fixed" in the 4.0 release of C#. "Fixed" is in quotation marks because the cure is not without its own problems.
The original problem arose because the compiler could insert no-op instructions into the generated code inconsistently, depending on how optimization and debugging were turned on or off. Lippert noted:
Recall that lock(obj){body} was a syntactic sugar for

var temp = obj;
Monitor.Enter(temp);
try { body }
finally { Monitor.Exit(temp); }

The problem here is that if the compiler generates a no-op instruction between the monitor enter and the try-protected region then it is possible for the runtime to throw a thread abort exception after the monitor enter but before the try. In that scenario, the finally never runs so the lock leaks, probably eventually deadlocking the program. It would be nice if this were impossible in unoptimized and optimized builds.
The solution, however, has its own issues. According to Eric, "It is consistently bad now, which is better than being inconsistently bad. But there's an enormous problem... implicit in this codegen is the belief that a deadlocked program is the worst thing that can happen. That's not necessarily true."
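For readers who want to see the shape of the change, the following is a hand-written sketch of how the C# 4.0 expansion of lock is generally described: the call to Monitor.Enter moves inside the try block and records whether the lock was actually taken. The class, field, and method names here are hypothetical and exist only for illustration; the code the compiler actually emits may differ in detail.

```csharp
using System;
using System.Threading;

static class LockExpansionSketch
{
    // Hypothetical shared state, used only for illustration.
    private static readonly object Gate = new object();
    private static int counter;

    // Hand-written equivalent of: lock (Gate) { counter++; return counter; }
    public static int IncrementWithExpandedLock()
    {
        bool lockWasTaken = false;
        object temp = Gate;
        try
        {
            // Enter is inside the try, so there is no gap between taking the
            // lock and entering the protected region. The flag becomes true
            // only if the lock was really acquired.
            Monitor.Enter(temp, ref lockWasTaken);
            counter++;
            return counter;
        }
        finally
        {
            // Release only what was actually taken.
            if (lockWasTaken) Monitor.Exit(temp);
        }
    }

    static void Main()
    {
        Console.WriteLine(IncrementWithExpandedLock());
    }
}
```

The Monitor.Enter(object, ref bool) overload was introduced in .NET 4.0. Note that this pattern only addresses the leaked lock; it does nothing about the state of the protected data if the body fails partway through.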
The purpose of a lock is to protect a mutable resource or, stated another way, to protect the potentially many users of that resource from seeing a corrupt version of it. The current 4.0 solution neither rolls the resource back to its original state nor guarantees that a mutation completes. An exception thrown partway through a mutation forces a branch to the finally clause of the lock statement, which releases the lock and lets any waiting thread access the corrupted resource. The solution trades consistent behavior and a reduced chance of deadlock for the risk of exposing corrupt state. This problem is particularly risky in multi-threaded programming.
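A minimal sketch of that failure mode, under stated assumptions: the Accounts type, its invariant (checking plus savings equals 100), and the Validate helper are all hypothetical and chosen only to make the corruption visible. The second method shows one possible mitigation, a manual snapshot and rollback for ordinary exceptions; it is not something the lock statement does for you, and it is no defence against an asynchronous thread abort.

```csharp
using System;

// Hypothetical example type: the invariant is checking + savings == 100.
sealed class Accounts
{
    private readonly object gate = new object();
    private int checking = 100;
    private int savings = 0;

    public int Total()
    {
        lock (gate) { return checking + savings; }
    }

    // Throws halfway through the mutation. The lock is still released,
    // so later callers can observe the half-finished update.
    public void TransferUnsafe(int amount)
    {
        lock (gate)
        {
            checking -= amount;
            Validate(amount);   // may throw after the first half of the update
            savings += amount;
        }
    }

    // One possible mitigation: restore a snapshot before the lock is released.
    public void TransferWithRollback(int amount)
    {
        lock (gate)
        {
            int oldChecking = checking, oldSavings = savings;
            try
            {
                checking -= amount;
                Validate(amount);
                savings += amount;
            }
            catch
            {
                checking = oldChecking;
                savings = oldSavings;
                throw;
            }
        }
    }

    private static void Validate(int amount)
    {
        if (amount > 50) throw new InvalidOperationException("amount too large");
    }
}

static class Program
{
    static void Main()
    {
        var a = new Accounts();
        try { a.TransferUnsafe(60); } catch (InvalidOperationException) { }
        Console.WriteLine(a.Total()); // 40: invariant broken, yet the lock is free

        var b = new Accounts();
        try { b.TransferWithRollback(60); } catch (InvalidOperationException) { }
        Console.WriteLine(b.Total()); // 100: state restored before release
    }
}
```

In the first case no thread deadlocks, which is exactly the tradeoff described above: the program keeps running, but on data that no longer satisfies its invariant.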
This specific tradeoff involves a choice between two bad results: deadlock the program or fail to protect the state of a critical resource. This specific example is but one of several design decisions and tradeoffs that we are forced to make when doing multi-threaded programming.
A number of developers responded by noting that this kind of design problem is not unique to multi-threading and that there is a difference between "lock safety" and "exception safety." Lippert agreed that multi-threading only makes a hard problem harder and that "getting the locks right is only the first step": your design still needs to deal with all other kinds of exceptions and how they should be handled. A large number of respondents made points about the danger of aborting threads and agreed, for the most part, with Lippert that "aborting a thread is pure evil."