LiveRebel 1.0: No-Downtime Production Updates
ZeroTurnaround's LiveRebel 1.0 aims to alleviate the downtime and lost sessions that come with updating applications in production. According to Jevgeni Kabanov, CTO of ZeroTurnaround, "most application updates are in the off hours with downtime. Those who do try the one-server-at-a-time thing aren't terribly happy. There is very little tooling support and the process is largely manual and partially scripted." InfoQ had an opportunity to speak with ZeroTurnaround.
InfoQ: Can you provide any details on the complexity/size of customer deployments currently using LiveRebel?
Since we are talking about beta deployments, they are mostly smaller deployments, though some 100+ server production deployments are coming in the near future.
InfoQ: Is the focus on single-node applications (WAR, EAR, JAR), or can LiveRebel handle more complex multi-node deployments?
We can handle all kinds of deployments, including clusters of unlimited size and even elastic cloud deployments.
InfoQ: In what cases would LiveRebel be preferred over the more traditional methods available (e.g. upgrading a single server in a cluster at a time)?
Here are the conceptual reasons:
- Restarting servers one at a time takes a long time, which gets expensive for small changes.
- In case of any state structure changes in the application, session migration fails. Session drain can take forever if the app is actively used.
- In case of any change in database structure or remote APIs, the old and new versions of the application may not be compatible, and in that case they cannot be run in parallel.
InfoQ: What can users look forward to in the next release?
LiveRebel 1.0 is very minimalistic. In the near future we will add:
- a Hudson/Jenkins plugin,
- automatic and manual handling of state changes (e.g. an added field),
- database update integration as well as integration with some app lifecycle management products.
In the long term we plan to address multiple critical deficiencies that we see in the current application life-cycle management offerings.
Features in the release include:
- A fully scriptable server and web console that can manage single-node, clustered or cloud Java EE applications of any size on any container.
- Versions each class and resource individually instead of reloading the whole application, avoiding the problems associated with container redeployment and rolling upgrades.
- Rolls out updates instantly and transparently to users. Code is updated in place, preserving all existing state.
- Uses an all-Java JVM plugin (-javaagent) on the nodes, which adds a 3-5% performance overhead.
However, there are some limitations. Although LiveRebel handles all changes to resources, it does not support:
- replacing the superclass
- implementing a new interface
- managing changes to JARs that do not include the liverebel.xml file (most likely encountered when upgrading third-party JARs and supporting libraries)
Additionally, because LiveRebel cannot create new state, the following types of changes may have undesired side effects:
- Adding a new instance field, or renaming an existing one, leaves that field initialized to null on existing objects
- Changing constructors will only take effect for objects created after the update
- Generally, changes to various initializers will not take effect on existing objects
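One defensive way to cope with the first of these side effects is to avoid relying on constructors or field initializers for newly added fields and to initialize them lazily on first access. The following is a minimal sketch of that idea in plain Java; it does not use any LiveRebel API, and the Customer class, its tags field and the getTags accessor are hypothetical names chosen for illustration. The pre-update object is only simulated here by a field that the constructor never sets.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyFieldDemo {

    // Hypothetical domain class. Assume the `tags` field was added in an
    // update that was applied in place, so objects created before the
    // update never had it set by a constructor or initializer.
    static class Customer {
        private final String name;

        // Deliberately left without an initializer: this mimics what an
        // already existing object would see after the update (null).
        private List<String> tags;

        Customer(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }

        // Lazy accessor: creates the list on first use, so the code works
        // for both pre-update and post-update objects.
        List<String> getTags() {
            if (tags == null) {
                tags = new ArrayList<>();
            }
            return tags;
        }
    }

    public static void main(String[] args) {
        // Stands in for an object that already existed before the update.
        Customer existing = new Customer("alice");
        existing.getTags().add("vip");
        System.out.println(existing.getName() + " " + existing.getTags());
    }
}
```

Routing every read of the new field through the accessor keeps the null case in a single place. The same reasoning applies to constructor changes: logic that must also cover existing objects cannot live only in the constructor.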
A recent survey conducted by ZeroTurnaround confirmed the need for LiveRebel. It found that server deployment automation is the exception rather than the rule (especially in the 2-50 server range, which made up the majority of respondents) and that downtime and lost sessions are accepted practice, something ZeroTurnaround wants to change: "Migrating user, application and database state in an environment rich with fragile dependencies is the everyday reality of updating the java applications, which we want to change to the better."