New-age Transactional Systems - Not Your Grandpa's OLTP
John Hugg discusses high volume transaction processing applications with high and low frequency profiles, and how VoltDB can be used for that purpose.
The content has been bookmarked!
There was an error bookmarking this content! Please retry.

Posted by Scott Delap on Jul 02, 2008
The IT industry makes heavy use of buzzwords and ever changing terms to define itself. Sometimes the latest nomenclature the industry uses is a particular technology such as x86 or a concept such as green computing. Terms rise and fall out of favor as the industry evolves. In recent years the term virtualization has become the industry’s newest buzzword. This raises the question … just what is virtualization? The first concept that comes to the mind of the average industry professional is running one or more guest operating systems on a host. However, digging a little deeper reveals this definition is too narrow. There are a large number of services, hardware, and software that can be “virtualized”. This article will take a look at these different types of virtualization along with the pros and cons of each.
Before discussing the different categories of virtualization in detail, it is useful to define the term in the abstract sense. Wikipedia uses the following definition: “In computing, virtualization is a broad term that refers to the abstraction of computer resources. Virtualization hides the physical characteristics of computing resources from their users, be they applications, or end users. This includes making a single physical resource (such as a server, an operating system, an application, or storage device) appear to function as multiple virtual resources; it can also include making multiple physical resources (such as storage devices or servers) appear as a single virtual resource...”
In layman’s terms virtualization is often:
The term is frequently used to convey one of these concepts in a variety of areas such as networking, storage, and hardware.
Virtualization is not a new concept. One of the early works in the field was a paper by Christopher Strachey entitled "Time Sharing in Large Fast Computers". IBM began exploring virtualization with its CP-40 and M44/44X research systems. These in turn lead to the commercial CP-67/CMS. The virtual machine concept kept users separated while simulating a full stand-alone computer for each.
In the 80’s and early 90’s the industry moved from leveraging singular mainframes to running collections of smaller and cheaper x86 servers. As a result the concept of virtualization become less prominent. That changed in 1999 with VMware’s introduction of VMware workstation. This was followed by VMware’s ESX Server, which runs on bare metal and does not require a host operating system.
Today the term virtualization is widely applied to a number of concepts including:
In most of these cases, either virtualizing one physical resource into many virtual resources or turning many physical resources into one virtual resource is occurring.
Server virtualization is the most active segment of the virtualization industry featuring established companies such as VMware, Microsoft, and Citrix. With server virtualization one physical machine is divided many virtual servers. At the core of such virtualization is the concept of a hypervisor (virtual machine monitor). A hypervisor is a thin software layer that intercepts operating system calls to hardware. Hypervisors typically provide a virtualized CPU and memory for the guests running on top of them. The term was first used in conjunction with the IBM CP-370.
Hypervisors are classified as one of two types:
Related to type 1 hypervisors is the concept of paravirtualization. Paravirtualization is a technique in which a software interface that is similar but not identical to the underlying hardware is presented. Operating systems must be ported to run on top of a paravirtualized hypervisor. Modified operating systems use the "hypercalls" supported by the paravirtualized hypervisor to interface directly with the hardware. The popular Xen project makes use of this type of virtualization. Starting with version 3.0 however Xen is also able to make use of the hardware assisted virtualization technologies of Intel (VT-x) and AMD (AMD-V). These extensions allow Xen to run unmodified operating systems such as Microsoft Windows.
Server virtualization has a large number of benefits for the companies making use of the technology. Among those frequently listed:
Correspondingly there are a number of potential downsides that must be considered:
Virtualization is not only a server domain technology. It is being put to a number of uses on the client side at both the desktop and application level. Such virtualization can be broken out into four categories:
Wikipedia defines application virtualization as follows:
Application virtualization is an umbrella term that describes software technologies that improve manageability and compatibility of legacy applications by encapsulating applications from the underlying operating system on which they are executed. A fully virtualized application is not installed in the traditional sense, although it is still executed as if it is. Application virtualization differs from operating system virtualization in that in the latter case, the whole operating system is virtualized rather than only specific applications.
With streamed and local application virtualization an application can be installed on demand as needed. If streaming is enabled then the portions of the application needed for startup are sent first optimizing startup time. Locally virtualized applications also frequently make use of virtual registries and file systems to maintain separation and cleanness from the user’s physical machine. Examples of local application virtualization solutions include Citrix Presentation Server and Microsoft SoftGrid. One could also include virtual appliances into this category such as those frequently distributed via VMware’s VMware Player.
Hosted application virtualization allows the user to access applications from their local computer that are physically running on a server somewhere else on the network. Technologies such as Microsoft’s RemoteApp allow for the user experience to be relatively seamless include the ability for the remote application to be a file handler for local file types.
Benefits of application virtualization include:
Disadvantages include:
Wikipedia defines desktop virtualization as:
Desktop virtualization (or Virtual Desktop Infrastructure) is a server-centric computing model that borrows from the traditional thin-client model but is designed to give administrators and end users the best of both worlds: the ability to host and centrally manage desktop virtual machines in the data center while giving end users a full PC desktop experience.
Hosted desktop virtualization is similar to hosted application virtualization, expanding the user experience to be the entire desktop. Commercial products include Microsoft’s Terminal Services, Citrix’s XenDesktop, and VMware’s VDI.
Benefits of desktop virtualization include most of those with application virtualization as well as:
Disadvantages of desktop virtualization are similar to server virtualization. There is also the added disadvantage that clients must have network connectivity to access their virtual desktops. This is problematic for offline work and also increases network demands at the office.
The final segment of client virtualization is local desktop virtualization. It could be said that this is where the recent resurgence of virtualization began with VMware’s introduction of VMware Workstation in the late 90’s. Today the market includes competitors such as Microsoft Virtual PC and Parallels Desktop. Local desktop virtualization has also played a key part in the increasing success of Apple’s move to Intel processors since products like VMware Fusion and Parallels allow easy access to Windows applications. Some the benefits of local desktop virtualization include:
Up to this point the types of virtualization covered have centered on applications or entire machines. These are not the only granularity levels that can be virtualized however. Other computing concepts also lend themselves to being software virtualized as well. Network virtualization is one such concept. Wikipedia defines network virtualization as:
In computing, network virtualization is the process of combining hardware and software network resources and network functionality into a single, software-based administrative entity, a virtual network. Network virtualization involves platform virtualization, often combined with resource virtualization. Network virtualization is categorized as either external, combining many networks, or parts of networks, into a virtual unit, or internal, providing network-like functionality to the software containers on a single system…
Using the internal definition of the term, desktop and server virtualization solutions provide networking access between both the host and guest as well as between many guests. On the server side virtual switches are gaining acceptance as a part of the virtualization stack. The external definition of network virtualization is probably the more used version of the term however. Virtual Private Networks (VPNs) have been a common component of the network administrators’ toolbox for years with most companies allowing VPN use. Virtual LANs (VLANs) are another commonly used network virtualization concept. With network advances such as 10 gigabit Ethernet, networks no long need to be structured purely along geographical lines. Companies with products in the space include Cisco and 3Leaf.
In general benefits of network virtualization include:
Similar to server virtualization, network virtualization can bring increased complexity, some performance overhead, and the need for administrators to have a larger skill set.
Another computing concept that is frequently virtualized is storage. Unlike the definitions we have seen up to this point that have been complex at times, Wikipedia defines storage virtualization simply as:
Storage virtualization refers to the process of abstracting logical storage from physical storage.
While RAID at the basic level provides this functionality, the term storage virtualization typically includes additional concepts such as data migration and caching. Storage virtualization is hard to define in a fixed manner due to the variety of ways that the functionality can be provided. Typically, it is provided as a feature of:
Each vendor has a different approach in this regard. Another primary way that storage virtualization is classified is whether it is in-band or out-of-band. In-band (often called symmetric) virtualization sits between the host and the storage device allowing caching. Out-of-band (often called asymmetric) virtualization makes use of special host based device drivers that first lookup the meta data (indicating where a file resides) and then allows the host to directly retrieve the file from the storage location. Caching at the virtualization level is not possible with this approach.
General benefits of storage virtualization include:
Some of the disadvantages include:
Enterprise application providers have also taken note of the benefits of virtualization an begun offering solutions that allow the virtualization of commonly used applications such as Apache as well as application fabric platforms that allow software to easily be developed with virtualization capabilities from the ground up.
Application infrastructure virtualization (sometimes referred to as application fabrics) unbundle an application from a physical OS and hardware. Application developers can then write to a virtualization layer. The fabric can then handle features such as deployment and scaling. In essence this process is the evolution of grid computing into a fabric form that provides virtualization level features. Companies such as Appistry and DataSynapse provides features including:
IBM has also embraced the virtualization concept at the application infrastructure level with the rebranding and continued of enhancement of Websphere XD as Websphere Virtual Enterprise. The product provides features such as service level management, performance monitoring, and fault tolerance. The software runs on a variety of Windows, Unix, and Linux based operating systems and works with popular application servers such as WebSphere, Apache, BEA, JBoss, and PHP application servers. This lets administrators deploy and move application servers at a virtualization layer level instead of at the physical machine level.
In summary it should now be apparent that virtualization is not just a server-based concept. The technique can be applied across a broad range of computing including the virtualization of:
The technology is evolving in a number of different ways but the central themes revolve around increased stability in existing areas and accelerating adoption by segments of the industry that have yet to embrace virtualization. The recent entry of Microsoft into the bare-metal hypervisor space with Hyper-V is a sign of the technology’s maturity in the industry.
Beyond these core elements the future of virtualization is still being written. A central dividing line is feature or product. For some companies such as RedHat and many of the storage vendors, virtualization is being pushed as a feature to complement their existing offerings. Other companies such as VMware have built entire businesses with virtualization as product. InfoQ will continue to cover the technology and companies involved as the space evolves.
If an IT manager was not hip to virtualization, I'd send her or him to this article before reading anything else. Very nice.
Really interesting and a basic stuff for one who wants to understand virtualization
Cloud computing is also an area of virtualization which is gaining a lot of popularity. Several key players such as Amazon, Google & Microsoft have entered into this field, which some experts predict to be the next major change or paradigm shift in the industry.
Afkham Azeez
Besides the software mentioned in the article, I would recommend everyone check into the open source VirtualBox application.
Thanks for the Introduction. Here is another one I like. Very simple and to the point
thetoptenme.wordpress.com/2008/07/17/virtualiza...
John Hugg discusses high volume transaction processing applications with high and low frequency profiles, and how VoltDB can be used for that purpose.
Kevlin Henney examines code samples to see what can be learned from them starting from the premise that one won’t write great code unless he knows how to read it.
Jason Ayers share the observations he made watching a team of developers collaborating in real time on the same code base, pushing XP, pair programming and continuous integration to their extremes.
Michael Snoyman presents Yesod, a web framework written in Haskell and containing a web server, templating, ORM, libraries (templating, gravatar, etc.).
Richard Kreuter and Kyle Banker on how to avoid classical RDBMS transactional systems by using compensation mechanisms, transactional messaging or transactional procedures.
Attila Szegedi talks about performance tuning Java and Scala programs at Twitter: how to approach GC problems, the importance of asynchronous I/O, when to use MySQL/Cassandra/Redis, and much more.
One category of risk that project teams need to ensure they address is business value failure – delivering a product that fails to provide value for the business investor.
InfoQ spoke to the authors of Software Systems Architecture on a couple of new topics, the System Context viewpoint and Agile, which have been added to the second edition.
5 comments
Watch Thread Reply