InfoQ

News

New SQL Server Data Type: HierarchyId

Posted by Jonathan Allen on Nov 29, 2007 06:22 AM

Community
.NET
Topics
Data Access
Tags
SQL Server 2008

Tree-like hierarchies has always a problem in relational databases. Microsoft's first attempt at addressing this was in SQL Server 2005 with the implementation of Common Table Expressions.

While CTEs work over existing schema, Microsoft has sought a way to treat hierarchies as first-class concepts. To this effect, they have introduced the HierarchId data type in SQL Server 2008.

In traditional hierarchies, a row merely stores a reference to its parent node. This allows a row to know its relative position in the hierarch, but not its absolute position. Changing the parent is an atomic update that does not affect any child rows.

A HierarchyId field stores the rows exact position in the row in the hierarchy. An example provided by Denny Cherry shows values of 0x, 0x58, and 0x5AC0. These have string representations of /, /1/, and /1/1/ respectively. This leads to questions about consistency and performance, especially when changing the parent of a row. Ravi S.Maniam recommends that databases be designed such that reparenting occurs infrequently.

The HierarchyId field comes with a whole host of functions. GetAncestor and GetDescendant can be used to walk the tree. ToString and Parse methods are used for converting between the binary and string representations of the field. Strangely, there are also methods for supporting BinaryReaders and BinaryWriters.

Going back to the GetDescendant, it is a rather curious method. It does not actually return child rows, but rather it returns locations of potential child nodes. To insert a new row into the tree, it must be called the get the slot of the last child for the given parent, then a second time to get an empty slot just after that slot.

So far there aren't any good examples of actually retrieving a tree using T-SQL. In fact, everything about it has a imperative rather than set-based feel to it.

Data Type vs special functionality by Alex Popescu Posted Nov 29, 2007 4:30 PM
  1. Back to top

    Data Type vs special functionality

    Nov 29, 2007 4:30 PM by Alex Popescu

    I am not a DB guy, and I have to confess that this sounds pretty weird. I can see how it addresses part of the problem, but I cannot imagine very good solutions for solving the reparenting problem (well, I may be imagining a couple, but as I said I am not into DBs too much). Personally, I find Oracle approach (introducing new SQL constructs to deal with hierarchies) a more scalable approach. ./alex -- .w( the_mindstorm )p. Alexandru Popescu Senior Software Eng. InfoQ Techlead/Co-founder

Educational Content

Bindings, Platforms, and Innovation

This presentation focuses on the Internet and separating myth from fact, history from the future, and the mundane from the imaginative. Bob Frankston presents a vision of what could and should be.

Orchestrating Long Running Activities with JBoss / JBPM

This article explores the use of JBoss and jBPM to implement design solutions that effectively address the issue of orchestrating long running activities.

Neo4j - The Benefits of Graph Databases

This presentation covers the use of graph databases as an optimal solution for data that is difficult to fit in static tables, rapidly evolving data or data that has a lot of optional attributes.

Realistic about Risk: Software development with Real Options

This session introduces Real Options and shows how it can help in running your project. Real Options is a decision-making process that can be used to manage risk.

Communication Flexibility Using Bindings

This article discusses the use of bindings on services and references (including the instance of non-configured bindings) as the means to implement SCA communications in a Web and SOA environment.

Writing DSLs in Groovy

After a short introduction to DSLs, Scott Davis plays with the keyboard showing how to approach the creation of a DSL by typing working snippets of Groovy code that get executed.

Scaling Agile with C/ALM (Collaborative Application Lifecycle Management)

IBM Rational and InfoQ present, Scaling Agile with C/ALM, an eBook showing organizations how to become “finely tuned software delivery machines” by enabling team integration and scaling.

Concurrent Programming with Microsoft F#

Amanda Laucher presents a real life enterprise application written in F#. She shows actual code snippets, explaining design decisions and suggesting how to use some of the F# constructs.