BT

Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ

Topics

Choose your language

InfoQ Homepage News SQL Server 2016: Temporal Tables

SQL Server 2016: Temporal Tables

Bookmarks

The term “temporal data” refers to records that are versioned in the database. For any given logical record, there is a current version and zero or more prior versions. Both the current and any previous versions are represented by physical rows in the database, though not necessarily in the same table.

Data integrity is hard when working with temporal tables. Each time a row is updated, there needs to be a way to ensure the current versions of the row in copied into the prior version table. This can be done with triggers or stored procedures, but both have their problems.

Querying temporal data is likewise challenging. Though you can easily get the current version of a logical record, retrieving the version of a particular data requires a complex and often error prone query. This often leads developers to invest in databases that are specialized for this type of workload.

SQL Server 2016 is offering another alternative, the new Temporal Table object. Superficially, a Temporal Table looks just like a normal table. It supports most column types, normal indexes, columnstore indexes, foreign keys, etc. CRUD style operations work as expected using normal SQL or an ORM. In fact, you can convert most normal tables into Temporal Tables without changing the stored procedures and applications that use said table.

Implementation wise, a Temporal Table is really two tables. One table contains the current values while another handles the historic versions of the data. The tables are linked so that any UPDATE or DELETE operation in the normal table automatically creates a corresponding history row. (INSERT operations don’t create historic records.)

Accessing Historic Data

You can query a history table directly, but since it doesn’t contain the current values you wouldn’t normally touch it. Instead, you should always query the base table using one of the following operations.

  • Point in time: AS OF <date_time>
  • Exclusive bounds: FROM <start_date_time> TO <end_date_time>
  • Inclusive lower bound, exclusive upper bound: BETWEEN <start_date_time> AND <end_date_time>
  • Inclusive bounds: CONTAINED IN (<start_date_time> , <end_date_time>)

For example, if you want to know which value was active for customer 27 on the first of the year you could write:

… FROM Customer FOR SYSTEM_TIME AS OF '2015-1-1' WHERE CustomerID = 27

If instead you want to see every version of the users records for that day you could write,

… FROM Customer FOR SYSTEM_TIME BETWEEN '2015-1-1' AND '2015-1-2'WHERE CustomerID = 27

Design Rules

  • Temporal tables need to have a SysStartTime and SysEndTime column, both being a non-nullable DateTime2. These columns can be named anything and are managed by SQL Server; users are not allowed to insert or update values into these columns.
  • The column type FILESTREAM is not supported, as it stores data outside of the database.
  • For table Foo, the history table will be named “FooHistory” by default. This can be overridden.
  • The history table cannot be modified directly, you can only add to it by updating or deleting data from the current table.
  • INSTEAD OF triggers are not supported and AFTER triggers are only allowed on the current table.

Indexing must be manually enabled. Microsoft has some recommendations on this,

An optimal indexing strategy will include a clustered columns store index and / or a B-tree rowstore index on the current table and a clustered columnstore index on the history table for optimal storage size and performance. If you create / use your own history table, we strongly recommend that you create such an index that consists of the primary keys from the current table plus the period columns to speed up temporal querying as well as the queries that are part of the data consistency check. If your history table is a rowstore, we recommend a clustered rowstore index. The default history table has a clustered rowstore index created for you. At a minimum, a non-clustered rowstore index is recommended.

Schema Modifications

You cannot change the schema of a Temporal Table. You can, however, convert the Temporal Table into two normal tables using SET (SYSTEM_VERSIONING = OFF) in an ALTER TABLE statement.

Once you do that, you can modify the two tables and then turn them back into a Temporal Table with SET (SYSTEM_VERSIONING = ON). Note that this statement needs to include the name of the history table and the two sys time columns.

Correction: An earlier version of this article incorrectly stated that the FOR SYSTEM_TIME expression was part of the WHERE clause when in fact it is part of the FROM clause.

Rate this Article

Adoption
Style

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Community comments

  • It's about time

    by peter lin,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    a lot of people have been building bi-temporal databases on RDBMS for over three decades now. The thing is, building bi-temporal db on a RDBMS is never going to scale correctly for complex data models like property/auto insurance or financial applications. The temporal entity is an object graph that is 3 levels deep. Some times it can be 8 or 9 deep, which means saving 1 version of an auto policy produces query explosion.

    Having done this several times, I know first hand it's only practical for small databases with less than 500,000 rows. Anything more than that, querying 1 or more versions of a complex object can take minutes. It goes without saying, queries that take more than 1 minute have a material impact on response time and scalability.

  • Re: It's about time

    by Mac Noodle,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    ... Or healthcare or, well, pretty much any industry. Some data is not that complex, but most is.

  • Nice post!

    by Gustavo Maia,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    Hello Jonathan.

    This is a great reading. Still there are 2 little things that are not quiet correct:

    1. In "For table Foo, the history table will be named “FooHistory” by default. This can be overridden." the default history table name is actually "MSSQL_TemporalHistoryFor_<foo object id>";

    2. In "Note that this statement needs to include the name of the history table and the two sys time columns.", the part of the 2 sys columns is not right. They are only needed when defining the table PERIOD.

    Thanks for sharing this!

  • Re: It's about time

    by weq qew,

    Your message is awaiting moderation. Thank you for participating in the discussion.

    Schema versioning is the hard part and is the missing storying from the current offering.

    We used to temporalise datasources from engineering/logistics/HR systems in the Defence space. Merging in 100gb of data a day with a around 8% data volatility. Sure it chewed through platters like they were going out of fashion when rebuilding indexes, but we had sub 10sec queries across millions of records WITH support for any kind of schema change u can think of. This was ontop of a single SQL server.

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

BT