Introducing SQL Server 2014's New Clustered Columnstore Indexes

by Jonathan Allen on Sep 26, 2013

In SQL Server 2012, developers had the option to create columnstore indexes. These indexes had the potential to offer 10x performance improvements and 7x compression over traditional tables, but came with heavy restrictions. The most significant was that they put the underlying table into read-only mode.

A new storage engine in SQL Server 2014 overcomes that limitation. Known as a Clustered Columnstore Index, it provides highly efficient column-ordered storage while still allowing the table to operate normally when it comes to DML operations (e.g. INSERT, UPDATE, DELETE).

Just like a normal clustered index, a clustered columnstore index defines how the data is physically stored on disk. A columnstore-backed table is first organized into units known as rowgroups. Each rowgroup holds from 102,400 to 1,048,576 rows. Once a rowgroup is identified, it is broken up into column segments, one per column, which are then compressed and inserted into the actual columnstore.
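As a rough illustration of that layout, the following Python sketch splits a list of rows into rowgroups and then pivots each rowgroup into per-column segments. It is a toy model, not SQL Server's storage code: the function names are hypothetical, compression is left out entirely, and the demo uses a tiny rowgroup limit so the result is easy to read.

```python
from typing import Dict, List

MAX_ROWGROUP_ROWS = 1_048_576  # upper bound quoted in the article


def to_rowgroups(rows: List[dict], max_rows: int = MAX_ROWGROUP_ROWS) -> List[List[dict]]:
    """Split rows into consecutive rowgroups of at most max_rows rows (hypothetical helper)."""
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]


def to_column_segments(rowgroup: List[dict]) -> Dict[str, list]:
    """Pivot one rowgroup into one segment (a list of values) per column."""
    if not rowgroup:
        return {}
    return {col: [row[col] for row in rowgroup] for col in rowgroup[0]}


# Five sample rows; the limit of 2 rows per rowgroup is for illustration only.
rows = [{"id": i, "region": "EU" if i % 2 else "US"} for i in range(5)]
groups = to_rowgroups(rows, max_rows=2)
segments = [to_column_segments(g) for g in groups]
```

Each entry in `segments` holds all values of one column side by side, which is the property that makes column-oriented data compress well and scan quickly.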

When dealing with small amounts of data, small being defined as fewer than roughly a hundred thousand rows, the data is staged in a section known as the deltastore. Once the deltastore reaches the minimum rowgroup size, it can be drained and its data processed as a new rowgroup. The MSDN documentation illustrates this process with a diagram.
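The staging behavior can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the class and method names are hypothetical, the deltastore is drained synchronously as soon as it reaches the threshold (purely for simplicity), and the demo uses a threshold of 3 rows instead of 102,400.

```python
MIN_ROWGROUP_ROWS = 102_400  # minimum rowgroup size quoted in the article


class ColumnstoreTable:
    """Toy model: small inserts are staged, full batches become rowgroups."""

    def __init__(self, min_rows: int = MIN_ROWGROUP_ROWS):
        self.min_rows = min_rows
        self.deltastore = []   # staged rows, still row-oriented
        self.rowgroups = []    # each entry maps column name -> list of values

    def insert(self, row: dict) -> None:
        """Stage a row; drain the deltastore once it reaches the minimum size."""
        self.deltastore.append(row)
        if len(self.deltastore) >= self.min_rows:
            self._drain()

    def _drain(self) -> None:
        # Close the current deltastore and start a fresh one for new inserts.
        closed, self.deltastore = self.deltastore, []
        # Convert the closed rows into column segments (no compression here).
        columns = {col: [r[col] for r in closed] for col in closed[0]}
        self.rowgroups.append(columns)


table = ColumnstoreTable(min_rows=3)  # tiny threshold for illustration only
for i in range(7):
    table.insert({"id": i})
```

After the seven inserts, the model holds two column-oriented rowgroups and one row still waiting in the deltastore.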

A deltastore will be closed while it is being converted. This, however, is not a blocking operation for the table as a whole. An additional deltastore can be created for a given table when the current deltastores are unavailable due to locking. And if the table is partitioned, then each partition gets its own set of deltastores.

A note on terminology: Microsoft is now using “rowstore” to refer to traditional tables that are arranged by rows and columns. The deltastore is actually a type of rowstore.

Unlike the previous version of columnstore indexes, the clustered version must include all columns in the table. This is because there is no other heap or clustered index to fall back on for the rest of the row. In fact, clustered columnstore indexes cannot be combined with other types of indexes at all.
