Posted by Chris Sims on Mar 09, 2009
The sooner that a feature gets into production, the sooner it starts adding value. The quicker a system can change in response to user feedback, the easier it is to keep the users happy. Timothy Fitz and Joe Ludwig have recently published articles that describe practical implementations of continuous deployment, a process that reduces the release cycle from weeks to minutes.
Timothy's first article examined the impact that continuous deployment could have on the cost of fixing bugs. The more time between when an error gets introduced into a system, and when it is found, the more difficult and expensive the error will be to fix. If the engineer sees the mistake right after they type it, the cost of the bug is essentially zero. If the compiler catches the bug, the cost, in terms of developer time, is likely measured in minutes. If the bug gets deployed into production, but goes unnoticed for some time, the cost to find and fix the error can be astounding. The industry saw a dramatic example of this with Y2K. Timothy's position is that it is better to fail fast, so that the impact and cost of bugs can be minimized.
The comments posted by readers indicated significant skepticism about the practicality of continuous deployment. Erik A. Brandstadmoen put it bluntly: "In real life, I don’t think [your] approach is good enough." A commenter on ycombinator said: "ah... no. Maybe this is just viable for a single developer, as a substitution for continuous integration. But with multiple developers checking in, on a complex system your site will be down. A lot."
In response to the skeptics, Timothy wrote about how IMVU continuously deploys their system. The process starts with continuous integration to quickly build and test new changes. One of the keys is a suite of extensive, extremely reliable automated tests. They employ a farm of test machines to keep the runtime for the entire test suite under 10 minutes. Once all of the tests have passed, deployment begins. Timothy describes it this way:
"The code is rsync’d out to the hundreds of machines in our cluster. Load average, cpu usage, php errors and dies and more are sampled by the push script, as a baseline. A symlink is switched on a small subset of the machines throwing the code live to its first few customers. A minute later the push script again samples data across the cluster and if there has been a statistically significant regression then the revision is automatically rolled back. If not, then it gets pushed to 100% of the cluster and monitored in the same way for another five minutes. The code is now live and fully pushed."
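The push sequence described above can be sketched roughly as follows. This is an illustration only, not IMVU's actual push script: the `Sample` type, the `push` function, the callables `sample`, `activate` and `rollback`, the 5% canary fraction and the z-score threshold are all hypothetical names and values chosen for the example. It tracks only an error rate, whereas the real script samples load average, CPU usage and more, and it assumes that `sample()` blocks for the monitoring window (one minute, then five).

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """Request and error counts over one sampling window (hypothetical metric)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def is_significant_regression(baseline: Sample, current: Sample,
                              z_threshold: float = 2.33) -> bool:
    """One-sided two-proportion z-test: is the current error rate
    higher than the baseline by more than chance would explain?"""
    if baseline.requests == 0 or current.requests == 0:
        return False
    total = baseline.requests + current.requests
    pooled = (baseline.errors + current.errors) / total
    variance = pooled * (1 - pooled) * (1 / baseline.requests + 1 / current.requests)
    if variance == 0:
        return False  # identical all-good or all-bad rates: nothing to compare
    z = (current.error_rate - baseline.error_rate) / math.sqrt(variance)
    return z > z_threshold


def push(revision, sample, activate, rollback, canary_fraction=0.05):
    """Activate a revision on a small subset, then on the whole cluster,
    rolling back if either monitoring window shows a regression."""
    baseline = sample()                      # metrics before the switch
    activate(revision, canary_fraction)      # e.g. flip the symlink on a few machines
    if is_significant_regression(baseline, sample()):
        rollback(revision)
        return "rolled_back"
    activate(revision, 1.0)                  # push to 100% of the cluster
    if is_significant_regression(baseline, sample()):
        rollback(revision)
        return "rolled_back"
    return "live"
```

Passing the sampling, activation and rollback steps in as callables keeps the decision logic separate from whatever mechanism (rsync, symlinks, monitoring agents) a given shop actually uses.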
With 60 employees, 30 million registered users and over a million dollars a month in revenue, what IMVU has created is certainly not trivial. Based on an examination by Michael Bolton and James Bach, the system is also not perfect. Elisabeth Hendrickson put this in context by pointing out that perfection is likely not the goal of the system.
Joe Ludwig, a former architect of Pirates of the Burning Sea, wrote two articles examining what it would really take to do continuous deployment in an environment with heavyweight client code. He starts with a description of the seven-and-a-half-hour deploy process for 'Pirates' and outlines what it would take to reduce that to one hour. In his second article, he describes some of the important technical changes that would be required to make the one-hour deploy a reality.
What is your experience with continuous deployment? What would have to change about the systems you work with in order to make them continuously deployable? Leave a comment and share.
...even though on a much smaller scale, and brought it to the first step: continuous deployment on test machines. But I knew (and now that I read this I'm sure) it would work. An excellent case study, and great stuff!
On the last two projects that I have worked on, we had continuous deployment of sorts to our test boxes. On the first project, we were able to hook up a build in CruiseControl.net that allowed anyone working on the project to do a deploy with a single click by issuing a build.
On the second project, they had built a Rails application that allowed anyone to deploy any part of the application to any of our 9 test boxes. The Rails app would then show them the progress of the deployment and whether it was successful or not.
Neither project could be automatically deployed to production, but nothing technical stopped this from happening either.
With a combination of DBDeploy, Capistrano, and other utilities that allow you to manage all aspects of a project, there is nothing stopping any shop from doing either continuous or automatic deployments. And once you have automatic deployments, how far behind is continuous deployment?
If systems are designed with this requirement in mind, continuous deployment can be easily solved. The Maven+Hudson combo goes a long way in providing these capabilities. One sticky area we've experienced is service-level integration, for the cases when a deployment consists of an entire stack, including App, backend services (with inter-dependencies), etc. We ended up doing some clunky service-level integration steps, but they didn't feel very good :P. As the author pointed out for IMVU, it's critical that the deployment can be rolled out slowly, and undone if it fails. Too often I see teams strive for 100% perfect deployments. Things will go wrong, there will be bugs, and the continuous test suite won't catch everything. In my opinion, the ideal solution strikes a balance between time to deliver features and time between bug injection and bug detection.
-Evan
We've also had great luck with CruiseControl.net, and a home-grown equivalent to DBDeploy.
(Every data object script was responsible for upgrading itself, and included a list of objects that needed to be managed before it could update itself.)
Now we're in a Hibernate environment, and we've been reduced to manually diffing schemas and writing scripts from that. (This is then delegated to the DBA team.)
Are there tools to help automate the database changes in Hibernate-driven environments?
A typo might have a substantial effect on a software system. Its side effects might not be undone or rolled back easily. In rare cases it may even ruin a system. A typo might also affect other "unrelated" parts of a system unexpectedly, rather than the expected parts. Especially in large systems, this might be hard to discover. And if you are doing continuous deployment on a large-scale project, then this might mean "many" unrelated fixes deployed at once, and you should not forget that many of the problems will not be discovered instantly after deployment. Thus the isolated deployment principle won't work.
So the example given in the article isn't true even if you do extensive automated testing. I don't object to the whole idea of continuous deployment, but the only example in the article is misleading.
If you are interested in more continuous deployment stories, challenges and successes, please take a look at this blog: ciadvantage.com/cs/blogs/tim_bassett/default.aspx