InfoQ

Database Virtualization - Is it worth it?

Posted by Jonathan Allen on Feb 12, 2008 12:13 PM

Community: .NET, Architecture
Topics: Data Warehousing, Virtualization
Tags: SQL Server 2005, SQL Server 2008

Hosting server applications inside VM images is all the rage today. The ability to quickly move a virtual server from one machine to another as needs change is a big win for IT departments. But can this be applied to heavyweight systems like SQL Server? Conor Cunningham says no.

According to Conor, SQL Server makes several assumptions about its environment. These include:

  1. All CPUs are equally powerful.
  2. All CPUs process instructions at about the same rate.
  3. A flush to disk should probably happen in a bounded amount of time.

The first issue comes into play with high-end SKUs that support parallel queries. When a query is executed, the work is evenly split among threads. But with both hyperthreading and virtualization, those threads do not run at a consistent speed.

"So now I have some threads that finish earlier than others. So they block until the slowest threads finish. Even worse, I don't think that the query re-allocates those threads for other queries until the whole query finishes. So, now you have some background as to why hyperthreading was not recommended for at least some SQL Server deployments."
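
The stall described above can be approximated with a small model. The sketch below is illustrative only: the function name, the row count, and the per-worker speeds are assumptions chosen for the example, not SQL Server internals or measurements. It splits the rows evenly, as the article describes, and reports how long each worker sits idle waiting for the slowest one.

```python
def parallel_query_time(total_rows, worker_speeds):
    """Model an evenly split parallel query (hypothetical helper).

    total_rows    -- rows the query must process
    worker_speeds -- rows per second each worker thread can handle
    Returns (elapsed_seconds, idle_seconds_per_worker).
    """
    # Work is divided evenly, regardless of how fast each worker is.
    rows_per_worker = total_rows / len(worker_speeds)
    finish_times = [rows_per_worker / speed for speed in worker_speeds]

    # The query is done only when the slowest worker finishes.
    elapsed = max(finish_times)

    # Faster workers block for the remainder of the query.
    idle = [elapsed - t for t in finish_times]
    return elapsed, idle


# Four equally fast workers (assumed speed: 100,000 rows/s each).
uniform = parallel_query_time(1_000_000, [100_000] * 4)

# Same workload, but one worker runs at half speed, for example
# because its virtual CPU is sharing a physical core.
skewed = parallel_query_time(1_000_000, [100_000, 100_000, 100_000, 50_000])

print("uniform:", uniform)
print("skewed: ", skewed)
```

In this model a single half-speed worker doubles the elapsed time of the whole query, and the other three workers spend half of it blocked.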

Later, Conor discusses memory and I/O:

"SQL Server assumes, at least in the main server SKUs, that it is the only significant memory consumer on the machine. It's a *server*. (SQLExpress has different assumptions, but it's no memory slouch either). Now, SQL Server will work in a memory constrained environment, but you often don't want to do that. You take that away from a lot of different things - the buffer pool, the compiled plan cache, memory to execute queries (for example, hash join grants). All of these things can add up if you aren't careful.

"I/O is the area where I have the least experience in virtualization. This is one of the reasons I asked people about production SQL Servers. Usually they did get some storage array, and this makes sense - it ramps the I/O bandwidth and usually isolates it from any other operations on the machine (your OS, your application you are developing on top of SQL Server, etc). I'm going to spend some more time on this, but I think the core idea is sound - as you start sharing your I/O bandwidth over several VMs, you are going to hit limits earlier with big IO consumers like SQL Server. The same basic logic applies - isolate your database traffic onto different storage paths, especially when building a system to scale. In a VM world, this can let you avoid the sharing penalties vs. the default config of everyone sharing the same hard drive."
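
One rough way to see whether the third assumption (a flush to disk completes in a bounded amount of time) holds on a given machine or VM is to time a series of flushes and look at the worst case. The following is a minimal sketch, not SQL Server's I/O path: the function name, sample count, and payload size are assumptions made for illustration, and it uses only the Python standard library.

```python
import os
import tempfile
import time


def measure_flush_latencies(samples=20, payload=b"x" * 8192):
    """Time os.fsync() after each small write (hypothetical helper).

    Returns a list of flush latencies in seconds.
    """
    latencies = []
    with tempfile.NamedTemporaryFile() as f:
        for _ in range(samples):
            f.write(payload)
            f.flush()  # push Python's buffer down to the OS
            start = time.perf_counter()
            os.fsync(f.fileno())  # ask the OS to flush to the device
            latencies.append(time.perf_counter() - start)
    return latencies


latencies = measure_flush_latencies()
median = sorted(latencies)[len(latencies) // 2]
print(f"median flush: {median * 1000:.2f} ms")
print(f"worst flush:  {max(latencies) * 1000:.2f} ms")
```

A large gap between the median and the worst flush, especially while other VMs on the same host are busy, would suggest the kind of shared-storage contention described above. Note that the temporary file may live on a different device than the database files, so the target directory matters when interpreting the numbers.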

All this isn't to say that SQL Server cannot run in a virtual image, merely that if performance is critical, then it probably isn't worth the cost.

3 comments

  1. Seems obvious

    Feb 12, 2008 4:15 PM by Evan K

    I recently virtualized SQL Server on Xen HVM and it worked well. Right now the biggest issue is the maturity of the open source paravirtualized IO drivers in Windows (they are making rapid improvements). Enterprise Xen has some proprietary ones.

    Nobody said it would be free; of course when you have more scheduling contention (or gosh, less RAM allocated) there is some performance degradation.

  2. It's not a question of virtualization or not

    Feb 12, 2008 5:23 PM by Alexandre de Pellegrin

    It's always a question of quality of service. The first question is: what do I want? The fastest database engine, the most secure database engine, or something that can be maintained easily? Never forget to ask yourself what you want to do with your database and how large it should be. Of course, for a large production database, I suppose a VM will not be as optimized as a native system. But in many cases, a VM could be considered a good solution. So I'm not sure there is a debate around "databases inside VM or not".

  3. No more debate than anything else

    Feb 13, 2008 1:33 PM by Jim Leonardo

    Any service that uses a lot of CPU and a lot of memory much of the time isn't a terribly good candidate for virtualization. If your DB is used by a lightly used app, then it's as much a candidate for virtualization as anything.

    But I would also ask "why do this?". Most first-rate RDBMSs can happily run multiple databases under one server instance, so burdening the hardware with running multiple OS instances doesn't seem needed most of the time. Without the extra OS instances, you get that memory and CPU time back for the RDBMS. I suppose if you were really worried about things stepping on one another, this is ok, but I think that's rare. I suppose there could be more to worry about regarding ACID properties of your transactions in a virtualization scenario if you can't be sure that a disk write in a VM really is written to disk (I have no idea about that though).

    Of course, in all of this I'm thinking of production. For development, there are many good reasons to virtualize a DB server.
