Database Virtualization - Is it worth it?

Hosting server applications inside VM images is all the rage today. The ability to quickly move a virtual server from one machine to another as needs change is a big win for IT departments. But can this be applied to heavyweight systems like SQL Server? Conor Cunningham says no.

According to Conor, SQL Server makes several assumptions about its environment. These include:

  1. All CPUs are equally powerful.
  2. All CPUs process instructions at about the same rate.
  3. A flush to disk should probably happen in a bounded amount of time.
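The third assumption can be probed roughly on any host. The following is a minimal sketch, not how SQL Server itself flushes data: it times a handful of write-then-`fsync` calls against a temporary file. The function name, sample count, and payload size are illustrative assumptions; on a virtualized host with shared storage, the spread between the fastest and slowest sample is the number to watch.

```python
import os
import tempfile
import time


def measure_flush_latencies(samples=5, payload_bytes=64 * 1024):
    """Time several write+fsync cycles and return the latencies in seconds.

    Hypothetical helper for illustration only; sample count and payload
    size are arbitrary choices, not values from the article.
    """
    latencies = []
    payload = b"x" * payload_bytes
    with tempfile.NamedTemporaryFile() as f:
        for _ in range(samples):
            f.write(payload)
            start = time.perf_counter()
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to push it to the device
            latencies.append(time.perf_counter() - start)
    return latencies


latencies = measure_flush_latencies()
print(f"min flush: {min(latencies) * 1000:.2f} ms")
print(f"max flush: {max(latencies) * 1000:.2f} ms")
```

A wide gap between the minimum and maximum suggests that flush time is not well bounded on that machine, which is the situation the third assumption rules out.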

The first issue comes into play with high-end SKUs that support parallel queries. When a query is executed, the work is split evenly among threads. But with both hyperthreading and virtualization, those threads do not run at a consistent speed. As Conor puts it:

So now I have some threads that finish earlier than others. So they block until the slowest threads finish. Even worse, I don't think that the query re-allocates those threads for other queries until the whole query finishes. So, now you have some background as to why hyperthreading was not recommended for at least some SQL Server deployments.
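The effect Conor describes can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not SQL Server's actual scheduler: rows are divided evenly among workers, each worker has a fixed speed in rows per second, and the query completes only when the slowest worker finishes. The function name and all numbers are hypothetical.

```python
def parallel_query_time(total_rows, worker_speeds):
    """Return (elapsed, per_worker_times) for an evenly split query.

    Toy model: every worker gets the same number of rows, and the query
    is done only when the slowest worker is done.
    """
    chunk = total_rows / len(worker_speeds)
    finish_times = [chunk / speed for speed in worker_speeds]
    return max(finish_times), finish_times


# Four equally fast workers (rows per second are made-up numbers).
uniform, _ = parallel_query_time(1_000_000, [1000, 1000, 1000, 1000])

# Same workload, but one worker runs at half speed, as might happen when
# a virtual CPU is descheduled or shares a core with a hyperthread sibling.
skewed, times = parallel_query_time(1_000_000, [1000, 1000, 1000, 500])

# Worker-seconds spent blocked, waiting for the straggler.
idle = sum(skewed - t for t in times)

print(f"uniform workers: {uniform:.0f} s")
print(f"one slow worker: {skewed:.0f} s")
print(f"idle worker-seconds: {idle:.0f}")
```

In this model a single worker at half speed doubles the elapsed time of the whole query, while the three fast workers sit blocked for the second half of the run.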

Later Conor discusses memory and I/O:

SQL Server assumes, at least in the main server SKUs, that it is the only significant memory consumer on the machine. It's a *server*. (SQLExpress has different assumptions, but it's no memory slouch either). Now, SQL Server will work in a memory constrained environment, but you often don't want to do that. You take that away from a lot of different things - the buffer pool, the compiled plan cache, memory to execute queries (for example, hash join grants). All of these things can add up if you aren't careful.

I/O is the area where I have the least experience in virtualization. This is one of the reasons I asked people about production SQL Servers. Usually they did get some storage array, and this makes sense - it ramps the I/O bandwidth and usually isolates it from any other operations on the machine (your OS, your application you are developing on top of SQL Server, etc). I'm going to spend some more time on this, but I think the core idea is sound - as you start sharing your I/O bandwidth over several VMs, you are going to hit limits earlier with big IO consumers like SQL Server. The same basic logic applies - isolate your database traffic onto different storage paths, especially when building a system to scale. In a VM world, this can let you avoid the sharing penalties vs. the default config of everyone sharing the same hard drive.
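The sharing argument can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes, as a deliberate simplification, that a storage path's bandwidth is divided equally among the VMs using it; real hypervisors and arrays behave less tidily. The function names and the figures (a 400 MB/s path, a database needing 150 MB/s) are hypothetical.

```python
def fair_share_mb_s(path_mb_s, vm_count):
    """Bandwidth each VM gets if the path is shared equally (an assumption)."""
    return path_mb_s / vm_count


def is_io_bound(demand_mb_s, path_mb_s, vm_count):
    """True when the database wants more bandwidth than its share."""
    return demand_mb_s > fair_share_mb_s(path_mb_s, vm_count)


PATH_MB_S = 400        # hypothetical storage path bandwidth
DB_DEMAND_MB_S = 150   # hypothetical SQL Server I/O demand

for vms in (1, 2, 4):
    share = fair_share_mb_s(PATH_MB_S, vms)
    bound = is_io_bound(DB_DEMAND_MB_S, PATH_MB_S, vms)
    print(f"{vms} VM(s) on the path: share {share:.0f} MB/s, I/O bound: {bound}")
```

With these numbers the database is comfortable alone or with one neighbor, but becomes I/O bound once four VMs share the path. Giving the database its own storage path corresponds to the `vm_count=1` case.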

None of this means that SQL Server cannot run in a virtual image, merely that if performance is critical, then it probably isn't worth the cost.
