
Whitepaper Released: Sharding with SQL Azure

by James Vastbinder on Dec 14, 2010

Yesterday Microsoft released a new whitepaper, written by Michael Heydt and Michael Thomassy, that provides guidance on sharding with SQL Azure. Because SQL Azure currently limits each instance to 50GB, scaling beyond that size requires horizontal partitioning across instances to achieve application scale-out. The whitepaper aims to show how to architect an application that requires elasticity and fluidity of resources at the data layer over time.
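The 50GB limit translates directly into a minimum shard count. The sketch below is only back-of-the-envelope arithmetic, not something taken from the whitepaper; the function name and the `headroom` parameter (a fraction of each database kept free for growth) are assumptions made for illustration.

```python
import math

SHARD_LIMIT_GB = 50  # per-instance limit cited in the article


def shards_needed(total_gb: float, headroom: float = 0.0) -> int:
    """Return the minimum number of databases that keeps each under the limit.

    headroom is a hypothetical fraction of every database left unused so
    that a shard can grow before it has to be split.
    """
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    usable_gb = SHARD_LIMIT_GB * (1.0 - headroom)
    # Round up: a partially filled database still counts as a whole shard.
    return max(1, math.ceil(total_gb / usable_gb))
```

Under these assumptions, a 320GB data set needs `shards_needed(320)`, that is 7 databases, and more once any headroom is reserved.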

The whitepaper provides:

  • basic concepts in horizontal partitioning / sharding
  • an overview of patterns and best practices
  • challenges which may present themselves
  • high-level design of an ADO.NET sharding library
  • an introduction to SQL Azure Federations

Horizontal partitioning splits one or more tables by row, usually within the same database instance. The advantage is a reduced index size which, in theory, provides faster data retrieval. Sharding tackles the same problem by splitting the table across multiple instances of the database, which typically reside on separate hardware and therefore require some form of notification and replication to keep the tables synchronized.

In the Microsoft sharding pattern, a “sharding key”, which is the primary key of one of the data entities, is used to map data to specific shards. Related data entities are grouped into a set based on the shared sharding key, and this set is referred to as an atomic unit. All records in an atomic unit are stored in the same shard. Additionally, rebalancing shards should be an offline process, because keys are remapped as the physical infrastructure is modified.
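A minimal sketch of this mapping follows. It is not the whitepaper's library: the customer and order entities, the function names, and the hash-then-modulo placement are all assumptions chosen for illustration. Every row is routed by the customer's key rather than by its own primary key, so a customer and its orders form one atomic unit on one shard. The last helper lists the keys whose shard changes when the shard count changes, which hints at why rebalancing moves data.

```python
import hashlib
from collections import defaultdict


def shard_for(shard_key, shard_count):
    """Map a sharding key to a shard index with a stable hash.

    hashlib is used instead of the built-in hash() so the mapping is the
    same in every process and on every run.
    """
    digest = hashlib.sha256(str(shard_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def place_rows(rows, shard_count):
    """Group rows by shard, routing each row by its customer_id."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row["customer_id"], shard_count)].append(row)
    return dict(shards)


def moved_keys(keys, old_count, new_count):
    """Return the keys that land on a different shard after resizing."""
    return [k for k in keys
            if shard_for(k, old_count) != shard_for(k, new_count)]


# Hypothetical data: two customers and their orders.
rows = [
    {"table": "customer", "customer_id": 42, "name": "Contoso"},
    {"table": "order", "order_id": 1001, "customer_id": 42},
    {"table": "order", "order_id": 1002, "customer_id": 42},
    {"table": "customer", "customer_id": 7, "name": "Fabrikam"},
    {"table": "order", "order_id": 1003, "customer_id": 7},
]
placement = place_rows(rows, shard_count=4)

# Fraction of 1,000 keys that would move when going from 4 to 5 shards.
keys = range(1000)
moved_fraction = len(moved_keys(keys, 4, 5)) / len(keys)
```

With plain modulo placement most keys change shards when a shard is added, so a scheme like this one would need a bulk data move. Range-based or lookup-table placement would move fewer keys, at the cost of maintaining the map.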

Microsoft plans to release SQL Azure Federations in 2011, which will support sharding at the database level. Until then, all sharding capabilities must be implemented at the application level using ADO.NET. This is in contrast to current “NoSQL” alternatives such as MongoDB, CouchDB, and SimpleDB, which already support sharding.
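To give a feel for what application-level sharding involves, here is a small sketch. It is written in Python rather than ADO.NET, and in-memory SQLite databases stand in for separate SQL Azure databases. The class, table, and column names and the modulo routing are assumptions for illustration, not the design described in the whitepaper. The application picks the shard for each write, answers single-customer queries from one shard, and merges the results of queries that span shards.

```python
import sqlite3


class ShardedOrders:
    """Application-level sharding over several independent databases."""

    def __init__(self, shard_count):
        # Each in-memory SQLite database stands in for one SQL Azure database.
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for conn in self.shards:
            conn.execute(
                "CREATE TABLE orders ("
                "order_id INTEGER PRIMARY KEY, "
                "customer_id INTEGER NOT NULL, "
                "total REAL NOT NULL)"
            )

    def _shard(self, customer_id):
        # Assumed routing rule: modulo on the sharding key.
        return self.shards[customer_id % len(self.shards)]

    def add_order(self, order_id, customer_id, total):
        conn = self._shard(customer_id)
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders VALUES (?, ?, ?)",
                (order_id, customer_id, total),
            )

    def orders_for(self, customer_id):
        """Single-shard query: the sharding key says where to look."""
        cur = self._shard(customer_id).execute(
            "SELECT order_id, total FROM orders "
            "WHERE customer_id = ? ORDER BY order_id",
            (customer_id,),
        )
        return cur.fetchall()

    def total_revenue(self):
        """Fan-out query: ask every shard, then merge in the application."""
        return sum(
            conn.execute(
                "SELECT COALESCE(SUM(total), 0) FROM orders"
            ).fetchone()[0]
            for conn in self.shards
        )


store = ShardedOrders(shard_count=3)
store.add_order(1, customer_id=42, total=10.0)
store.add_order(2, customer_id=42, total=20.5)
store.add_order(3, customer_id=7, total=5.25)
customer_orders = store.orders_for(42)
revenue = store.total_revenue()
```

Note that `order_id` is unique only within a shard here; in a design like this the application would also have to guarantee globally unique identifiers and handle any query that joins data across shards.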

InfoQ.com and all content copyright © 2006-2014 C4Media Inc.