More Than Just Spin (Up) : Virtualization for the Enterprise and SaaS
Cloud services, such as Amazon EC2, have helped bring virtualization to the forefront of the IT conversation. These services have been built on the most popular feature of virtualization, the rapid provisioning of new virtual machines on available hardware. This is the core premise of cloud computing - massive amounts of infrastructure with excess capacity sold to those organizations that require it. It can provide the flexibility and structure of a massive infrastructure by selling customers the individual slices required for their smaller systems.
While rapid provisioning is popular, virtualization offers many benefits beyond the ability to provision a new (virtual) server almost instantaneously on the excess capacity of providers. The full value of virtualization is in the more complex benefits, such as high availability, disaster recovery, and rapid application provisioning. Simply put, the hypervisor has become a commodity. Almost every virtualization provider has demonstrated this realization. The Xen hypervisor was developed in the free and open source community. As recently as this month, VMware made ESXi, their enterprise-class hypervisor with minimal footprint, available as a free download. Yet, the complex features of both product lines are still reserved for the sales channel.
This article examines some of these complex benefits and their application in the real world. More importantly, it provides some details on how Contegix is implementing virtualization to solve complex problems, and on when virtualization should not be used.
The Enterprise : Powered by Virtualization
Beyond rapid provisioning, there are numerous features that drive the value of virtualization for the enterprise. The most important are maximum utilization of resources, high availability for applications, and business continuity in the face of disaster. These topics are at the forefront of requirements for every CIO/CTO. The true power of virtualization is in these features.
Cloud computing is not yet the prevalent technology deployment platform for every company, and it may never be. Many organizations are under strict governance that prevents this. Compliance concerns and lack of trust in cloud providers are limiting factors. Laws such as the Health Insurance Portability and Accountability Act (HIPAA) and Sarbanes-Oxley (SOX) often leave their interpretation, and the potential negative impact of non-compliance, as liabilities for the organization. Until these are confidently addressed, the enterprise will have a difficult time moving everything to the cloud.
Yet, that does not mean these organizations must forgo the flexibility in scale that comes from fully utilizing all computing resources. Virtualization on dedicated computing resources with dedicated network infrastructure is at the heart of this. In essence, these companies are building and utilizing private clouds (or, more accurately, private cloud-style architectures).
VMware and Citrix XenServer can make full use of the resources of the available physical hardware. They can dynamically balance the resource requirements of the virtual machines against the capacity of the physical servers in a virtualization cluster. When a virtual machine is started, the virtualization stack allocates it to a physical server that has the required resources available.
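The placement step can be illustrated with a small sketch. It is not how VMware or XenServer actually schedule virtual machines; the `Host`, `VM`, and `place` names, and the "most free RAM" rule, are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VM:
    name: str
    cpu_cores: int
    ram_gb: int


@dataclass
class Host:
    name: str
    cpu_cores: int
    ram_gb: int
    vms: list = field(default_factory=list)

    def free_cpu(self) -> int:
        return self.cpu_cores - sum(vm.cpu_cores for vm in self.vms)

    def free_ram(self) -> int:
        return self.ram_gb - sum(vm.ram_gb for vm in self.vms)


def place(vm: VM, hosts: list) -> Optional[Host]:
    """Start vm on the fitting host with the most free RAM (illustrative rule)."""
    candidates = [h for h in hosts
                  if h.free_cpu() >= vm.cpu_cores and h.free_ram() >= vm.ram_gb]
    if not candidates:
        return None  # no physical server in the cluster has the resources
    target = max(candidates, key=lambda h: h.free_ram())
    target.vms.append(vm)
    return target


# Hypothetical two-server cluster
hosts = [Host("host-a", cpu_cores=8, ram_gb=32),
         Host("host-b", cpu_cores=16, ram_gb=64)]
first = place(VM("web-1", cpu_cores=4, ram_gb=16), hosts)
second = place(VM("db-1", cpu_cores=8, ram_gb=48), hosts)
third = place(VM("big-1", cpu_cores=16, ram_gb=64), hosts)
```

A real scheduler would also rebalance running machines as load changes; this sketch only covers the initial allocation when a virtual machine is started.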
Most virtualization stacks incorporate features to drive high availability of applications. These features do not completely replace clustering technologies, such as Oracle Coherence and Microsoft Cluster Server. They are designed to address failures that occur at the hardware layer and to facilitate high availability for applications that do not have these capabilities themselves.
High availability for these applications is typically delivered using multiple nodes in an active-passive setup. This works by sending any request for the service to the primary-active node. If the primary-active node is unresponsive, a secondary node is promoted to primary-active and the request is sent there. A problem still exists: any in-memory data held by the original primary-active node is no longer available.
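The following is a minimal sketch of the active-passive pattern and the in-memory data problem it leaves behind. `Node` and `ActivePassivePair` are hypothetical names, and a real cluster would use heartbeats and a network transport rather than a Python exception.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.sessions = {}  # in-memory state, never shared with the other node

    def handle(self, user, item):
        if not self.alive:
            raise ConnectionError(f"{self.name} is unresponsive")
        self.sessions.setdefault(user, []).append(item)
        return list(self.sessions[user])


class ActivePassivePair:
    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def request(self, user, item):
        try:
            return self.primary.handle(user, item)
        except ConnectionError:
            # Promote the secondary to primary-active and resend the request.
            self.primary, self.secondary = self.secondary, self.primary
            return self.primary.handle(user, item)


pair = ActivePassivePair(Node("node-1"), Node("node-2"))
before = pair.request("alice", "book")   # served by node-1
pair.primary.alive = False               # simulate a failure of node-1
after = pair.request("alice", "pen")     # served by node-2 after promotion
```

After the failover the service still answers, but `after` contains only the new item: the earlier session entry lived in the memory of the failed node and is gone.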
This problem is solved with live migration capabilities, such as VMware's VMotion and Citrix XenServer's XenMotion. These allow the migration of a running virtual machine from one underlying physical machine to another without loss of data, whether on-disk or in-memory, and without loss of network connectivity to clients. This is possible due to the replication of the virtual machine's entire memory state and precise execution state from one physical machine to another. The configuration and overall state of the virtual machine are stored on shared storage.
If the physical server containing the running virtual machine encounters an outage, the virtualization stack detects the outage and activates the replicated virtual machine on the second physical server. The migration of a virtual machine preserves the core states of a running machine: the precise execution state, the network identity, and the active network connections. This enables the migration with zero downtime and no negative impact to users.
Combining the features of resource scheduling, live migration of virtual machines, and replicated enterprise storage systems provides business continuity capabilities. Virtual machines can be replicated from one virtual cluster to another almost regardless of distance and independent of the underlying hardware.
SaaS : Powered by Virtualization
Despite the high-profile nature of Software-as-a-Service (SaaS), the market has yet to fully mature. There are a number of factors contributing to its current state. The most prevalent is the fact that infrastructure is a disproportionately large percentage of startup costs, with the exception of development. Yet revenue is earned over a multi-year period. Hence, the first customer is very costly to spin up, but each additional customer should have a much smaller cost. SaaS deployments also demand a level of consistency while remaining flexible enough to support customer-level modifications.
The typical SaaS customer does not care about infrastructure costs and concerns. The typical SaaS sale more than likely fits on the corporate American Express while an internal corporate software purchase requires multiple C-Level executive approvals. Therefore SaaS vendors must stay within reasonable price points that run counter to quickly offsetting initial infrastructure costs. Moving beyond just the purchase of hardware equipment, infrastructure also includes the long-term ongoing costs of deployment, support, management, and maintenance of delivering the application.
Moving beyond infrastructure to deployment, a SaaS application platform should be centered on reproducibility. Every instance of the SaaS-delivered application needs to be nearly identical to every other. Differences need to be minimized in order to deliver consistent behavior of application instances per customer and to give support an identical base from which to troubleshoot. No support engineer wants to discover that a problem was caused by a missing Apache module for a single customer instance. No customer wants to know that each instance of the application ordered has a chance of problems because the SaaS company cannot reproduce the exact same steps for every order. As a final addition to complexity, the entire process needs to be automated for consistency and cost reasons.
(There are notable exceptions, such as the application data, deployment instance data, and the potential parameters for scalability; however, these are merely meta and user data for the application. If one were to re-provision the application for a different customer, this is the data that would mostly be cleared out to start with a fresh slate.)
So why is such consistency a problem? It is important to understand the complexity of deployment for today's applications - SaaS or traditional. Even in the simplest of web applications, it is no longer the application's responsibility to manage the underlying data store layer. This is delegated to a database, typically an RDBMS such as MySQL, PostgreSQL, Oracle, or SQL Server. Combined with typical web stacks such as Java, Rails, etc., this leads to a multi-tiered architecture demanding scalable deployment. For example, a Rails application may require Apache, a Mongrel cluster, memcached, and MySQL.
The initial installation and wiring together of the application infrastructure is not the sole deployment problem to address. Application components often require different resources. It is often beneficial to dedicate specific resources to various components to ensure their availability and consistent behavior and to prevent starvation. For example, a Mongrel cluster or Tomcat instance may be assigned a defined amount of CPU and RAM while the dependent database has separate requirements. This helps prevent one service from starving another.
Furthermore, the agile nature of applications allows for easy extension, often via plugins, macros, and mashups. An extension developed for a smaller instance may not scale to the levels of every organization. When a problem occurs within the application stack of a customer (or group of customers sharing the application infrastructure stack), the goal is to fence off the offending systems to minimize the potential impact on others. A SaaS customer does not want to hear that his problem was caused by resource constraints created by another customer.
There are a few ways to accomplish these requirements. In regard to infrastructure, virtualization technology can be used to address the hardware problem from a cloud computing standpoint, allowing incremental scale. At the same time it can prove useful in managing deployment, support, and maintenance. For deployment consistency on physical hardware, it is possible to use tools like Cfengine and Puppet. Resource constraints can be applied using operating-system-specific features, such as Solaris zones and Linux PAM via /etc/security/limits.conf. These are absolutely fine tools. Yet, virtualization provides a better way to tackle these problems and comes with numerous intrinsic benefits. Virtualization makes it possible to implement a core concept in computer science - separation of concerns.
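To make the operating-system-level approach concrete, here is a hedged sketch of per-process resource limits, the mechanism that entries in /etc/security/limits.conf ultimately configure. It uses Python's standard `resource` module on a Unix-like system; the cap of 64 open files and the `limit_open_files` helper are assumptions chosen for illustration, not values from this article.

```python
import resource
import subprocess
import sys

MAX_OPEN_FILES = 64  # hypothetical cap, chosen only for illustration


def limit_open_files():
    """Runs in the child just before exec: lower its soft open-file limit."""
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard == resource.RLIM_INFINITY:
        soft = MAX_OPEN_FILES
    else:
        soft = min(MAX_OPEN_FILES, hard)  # the soft limit may not exceed the hard limit
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


# The child process reports the soft limit it was started with.
child_code = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    preexec_fn=limit_open_files,
    capture_output=True,
    text=True,
    check=True,
)
child_soft_limit = int(result.stdout)
```

Limits of this kind constrain one process or login session at a time. A virtual machine, by contrast, bounds CPU, memory, and the whole software stack of a component together.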
Separation of concerns is the premise of breaking down an application into "distinct features that overlap in functionality as little as possible" (Wikipedia). With virtualization, this concept can be applied to the infrastructure. Separation can be applied on a per-application, per-customer, and/or per-cluster basis. Thus, it provides the capability to scale horizontally and vertically while still utilizing the underlying hardware to its full capacity. This is especially beneficial for single-tenant applications wishing to enter the SaaS market: they gain instant multi-tenancy on the underlying hardware with near-zero code change.
There are two common deployment models used on Contegix's SaaS platform. The differentiating factor is often how the application was developed - to support one single customer per deployment or to support multiple customers on a single deployment (single-tenancy vs. multi-tenancy).
In the single-customer SaaS delivery model, a limited number of virtual machines are utilized to deliver the customer application instance. Often, this is scoped to a mere one or two virtual machines. For example, one virtual machine may deliver the entire application from web tier through database tier, or these may be separated into two different machines, each scaled at its respective level. Using the virtualization capability of resource allocation, additional CPU cores can quickly be added to the PostgreSQL virtual machine. Dynamic Resource Allocation can quickly move the virtual machine to a physical machine with the resources needed.
The other common deployment model is to provide a higher degree of separation. The underlying infrastructure applications are separated into virtual machines, each scaled at its respective required level. For example, an application comprised of a PostgreSQL database, two Tomcat containers for different Java web applications, and a Mongrel cluster for a Rails web application may be configured with each component having separate virtual machines. Compared with the single-tenant example, this allows each component to be scaled individually, both in virtual machine resources and in number of instances. All of the web tiers are wrapped behind a proxy, nginx (or Apache), to provide a seamless transition with no customer-facing front-end changes. File-level data access across the multiple machines is delivered using network shares or block-level devices via clustered file systems. Once again, this model serves very well for large instances or multiple-customer applications.
Regardless of the deployment model, it is critical to separate the operating system and application installation from the application data. This leads to the question of how upgrades are processed and handled. The operating system and application installation should be considered volatile data, capable of being replaced at any time with a refreshed copy or new version. Upgrades are processed by exactly this method. A new version of these partitions, with the operating system, application(s), and configurations updated, is overlaid on the old. This further illustrates that while the deliverable is an application, the delivery mechanism is much greater.
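The separation between a volatile system image and persistent application data can be modeled in a few lines. This is only a conceptual sketch: `SystemImage`, `Instance`, `upgrade`, and the version and volume names are hypothetical and do not describe Contegix's actual tooling.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SystemImage:
    """Volatile layer: operating system, application(s), and configuration."""
    os_version: str
    app_version: str


@dataclass(frozen=True)
class Instance:
    customer: str
    image: SystemImage   # replaceable at any time
    data_volume: str     # application data, kept across upgrades


def upgrade(instance: Instance, new_image: SystemImage) -> Instance:
    """Swap in a refreshed system image; the data volume is left untouched."""
    return replace(instance, image=new_image)


old = Instance("acme", SystemImage("os-1", "app-1.0"), data_volume="vol-acme")
new = upgrade(old, SystemImage("os-1", "app-1.1"))
```

Because the image is treated as a replaceable whole, a rollback in this model is simply another call to `upgrade` with the previous image.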
There are intrinsic benefits to utilizing virtualization as the power for delivering applications. The aforementioned virtualization feature of moving virtual machines from one physical machine to another is a key benefit. With physical hardware solutions, such as PAM, this would require moving the user data and synchronizing the configuration files at the host operating system level. In addition, virtualization makes it possible to quickly build, develop, and test virtual machines as sandboxes and to run comparisons of them using standard tools. Imagine running the Unix command "diff -r" or an MD5 checksum across two different images.
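As a rough illustration of the checksum comparison, the sketch below hashes every file in two directory trees with MD5 and reports the paths that differ. It assumes the two images have already been mounted or unpacked as ordinary directories; `tree_checksums`, `compare_trees`, and the sample file names are hypothetical, and two temporary directories stand in for the images.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_checksums(root):
    """Map each file's path (relative to root) to the MD5 digest of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.md5(p.read_bytes()).hexdigest()
        for p in root.rglob("*") if p.is_file()
    }


def compare_trees(left, right):
    """Return relative paths that are missing from one tree or differ in content."""
    a, b = tree_checksums(left), tree_checksums(right)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))


# Two throwaway trees standing in for mounted virtual machine images.
with tempfile.TemporaryDirectory() as left, tempfile.TemporaryDirectory() as right:
    for root in (left, right):
        (Path(root) / "etc").mkdir()
        (Path(root) / "etc" / "app.conf").write_text("workers = 4\n")
    # One image carries an extra module configuration file.
    (Path(right) / "etc" / "extra_module.conf").write_text("LoadModule example\n")
    differences = compare_trees(left, right)
```

A check like this would surface the kind of single-instance difference, such as a missing Apache module, that makes support difficult.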
Where Virtualization Does Not Meet Needs
As with any new concept that becomes a buzzword, there is a tendency to see virtualization as the answer to all problems, new and old. The reality is that virtualization does not work for every system and application. The resources required are too high for some applications. This is most common in high I/O environments, such as large-scale database servers and intense UDP networks.
To put it simply, test, test, test.
About the author:
Matthew E. Porter is the CEO for Contegix LLC. Before starting Contegix, Matthew was a partner in Contegix's parent company, Metissian LLC. Matthew graduated from St. Louis University with a B.S. in Computer Science.