Query Optimizer Improvements in SQL Server 2016

While the last two versions of SQL Server focused on improving performance by offering new features, SQL Server 2016 looks inward, improving existing functionality.

Cardinality Estimator

At the core of any query optimizer is the cardinality estimator. It looks at the statistics on the tables being queried, and at the operations being performed, to estimate the number of rows for each step in the query’s execution plan. As experienced DBAs know, incorrect cardinality estimates can be devastating to the database’s performance. Potential issues include:

  • Choosing the wrong index
  • Choosing the wrong join operator (e.g. nested loops, hash, merge)
  • Allocating too much memory, which can block other queries
  • Allocating too little memory, causing the excess data to spill to tempdb
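
A quick way to see these numbers for yourself is to capture the actual execution plan, which reports both the estimated and the actual row count for every operator. A minimal sketch, assuming a hypothetical dbo.Orders table:

-- Capture the actual execution plan, which includes estimated
-- and actual row counts for each operator.
SET STATISTICS XML ON;

SELECT CustomerKey, COUNT(*) AS OrderCount
FROM dbo.Orders                 -- hypothetical table
WHERE OrderDate >= '20160101'
GROUP BY CustomerKey;

SET STATISTICS XML OFF;

Large gaps between the estimated and actual counts in the resulting plan are the usual symptom of the problems listed above.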

Given how important this is, you may be surprised to learn that SQL Server 2012 used essentially the same cardinality estimator that was introduced in 1998 with SQL Server 7. It was only 2 years ago that we saw the “first major redesign of the SQL Server Query Optimizer cardinality estimation process”. You can learn more about that version in depth from the whitepaper titled Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator.

SQL Server 2016 builds upon that work with the promise of even more accurate estimates when you set the compatibility level to 130. However, regressions are possible so you should thoroughly test existing databases before changing the compatibility level in production.
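
Switching a database to the new level is a one-line change (the database name here is a placeholder):

-- Opt the database into the SQL Server 2016 optimizer and feature set.
ALTER DATABASE [MyDatabase] SET COMPATIBILITY_LEVEL = 130;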

Compatibility Levels

If you are unfamiliar with the term, compatibility levels in SQL Server perform two crucial roles. First, they determine which database features will be available. By setting the database to a lower compatibility level, you lose access to newer features. This usually isn’t a problem during upgrades, as the database was designed with the older feature set in mind.

The other thing compatibility levels control is which query optimizer and cardinality estimator you get. In order to reduce the chance of regressions in finely tuned databases, you can force SQL Server to use the query optimizer from an older version by selecting a lower compatibility level.
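
You can check what level each database is currently running at with a simple catalog query:

-- 130 = SQL Server 2016, 120 = 2014, 110 = 2012, and so on.
SELECT name, compatibility_level
FROM sys.databases;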

In some cases you do have finer-grained control. For example, SQL Server 2016 will allow you to use compatibility level 130, giving you access to all of the new features, while still using the old cardinality estimator. This can be done with the command below.

-- Applies only to the current database; requires SQL Server 2016.
ALTER DATABASE SCOPED CONFIGURATION
SET LEGACY_CARDINALITY_ESTIMATION = ON;

With the exception of SQL Server 2008 R2, which kept its predecessor’s level of 100, each version’s compatibility level number is 10 greater than the previous version’s. That puts SQL Server 2000 at 80 and SQL Server 2016 at 130. Each version of SQL Server supports at least the compatibility levels of the two previous versions.

Multi-threaded Inserts

Prior to SQL Server 2016, Insert-Select statements could only be multi-threaded during the select phase; the actual insert operation was serialized. Now the insert operation can also be “multi-threaded or can have a parallel plan”.
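
In practice, getting a parallel insert typically requires a TABLOCK hint on the target table along with compatibility level 130. A minimal sketch with hypothetical table names:

-- With TABLOCK, SQL Server 2016 can parallelize the insert itself,
-- not just the select that feeds it.
INSERT INTO dbo.SalesArchive WITH (TABLOCK)
SELECT *
FROM dbo.Sales
WHERE SaleDate < '20150101';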

Memory-Optimized Tables

Memory-optimized tables have also gained the ability to be multi-threaded.
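
For reference, a minimal sketch of a memory-optimized table (this assumes the database already has a memory-optimized filegroup; all names are hypothetical):

-- Queries against this table are now eligible for parallel plans.
CREATE TABLE dbo.SessionState (
    SessionId INT NOT NULL PRIMARY KEY NONCLUSTERED,
    Payload   NVARCHAR(4000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);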

Statistics

SQL Server 2016 brings two changes to statistics. The first is that statistics on large tables are updated more frequently.

In the past, the threshold for the number of changed rows that triggers an automatic update of statistics was 20% of the table, which was inappropriate for large tables. Starting with SQL Server 2016 (compatibility level 130), this threshold is related to the number of rows in the table: the higher the row count, the lower the threshold for triggering an update of the statistics. Note that this behavior was available under Trace Flag 2371 in previous releases.

For example, if a table had 1 billion rows, under the old behavior 200 million rows would have to change before the auto-stats update kicked in. In SQL Server 2016, it takes only 1 million changed rows; the dynamic threshold is approximately the square root of 1,000 times the row count, and √(1,000 × 1,000,000,000) = 1,000,000.
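
If you want to see how close a statistics object is to its next automatic update, the row-modification counter is exposed through a dynamic management function. A sketch against a hypothetical dbo.BigTable:

-- Comparing rows to modification_counter shows how close each
-- statistics object is to triggering an automatic update.
SELECT s.name, sp.rows, sp.modification_counter, sp.last_updated
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.BigTable');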

Continuing the theme of parallelization, statistics are now “sampled by a multi-threaded process” when using compatibility level 130.

Foreign Key Constraints

One of the major selling points for relational databases is the ability to associate a table with other tables while using foreign key constraints to ensure data integrity. But that has some overhead, so in SQL Server 2014 and earlier, you were limited to 253 foreign key constraints per table.

This may sound like a large number, but once you start adding audit columns such as “CreatedByKey” you can quickly hit that limit in a large database. To alleviate this issue, Microsoft has increased the limit to 10,000 incoming foreign key constraints. This means you can have thousands of tables all reference your one user table. But there are caveats.

First, this does not apply to outgoing foreign key constraints. Nor does it apply to self-referencing tables. Both are still limited to 253 foreign keys.

Furthermore, the table being referenced cannot be modified using MERGE operations; only DELETE and UPDATE operations are allowed. (Technically SELECT and INSERT are also allowed, but the documentation doesn’t mention them because they aren’t affected by incoming foreign key constraints.)
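
To make the incoming/outgoing distinction concrete, here is a sketch of one of the potentially thousands of tables referencing a single dbo.Users table (all names are hypothetical):

-- This constraint counts against dbo.Users' incoming limit (now 10,000)
-- and against dbo.Orders' outgoing limit (still 253).
CREATE TABLE dbo.Orders (
    OrderKey     INT IDENTITY PRIMARY KEY,
    CreatedByKey INT NOT NULL
        CONSTRAINT FK_Orders_Users_CreatedBy
        REFERENCES dbo.Users (UserKey)
);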

Side note: the word “relational” in “relational database management system” doesn’t actually refer to relationships between tables. Rather, it comes from the mathematical notion of a relation, referring to how each value in a row is related to all of the other values in the same row. A pivot table, where each cell is an independent sum or average, is an example of a non-relational table.

Joins and Foreign Key Constraints

As mentioned before, foreign key constraints are not without a cost. If you modify a row that is potentially referenced by a foreign key constraint, checks need to be performed to ensure the constraint isn’t violated.

In SQL Server 2014, this check would be conducted by joining to each table that references the table in question. As you can imagine, this can quickly become quite expensive. To address this, SQL Server has introduced a new Referential Integrity Operator.

The new query execution operator does the referential integrity checks in place, by comparing the modified row to the rows in the referencing tables, to verify that the modification will not break the referential integrity. This results in much lower compilation times for such plans and comparable execution times.
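
Under compatibility level 130, the plan for a statement like the following (hypothetical names again) shows the new referential integrity operator in place of one join per referencing table:

-- With thousands of incoming foreign keys, this now compiles to one
-- in-place referential integrity check rather than thousands of joins.
DELETE FROM dbo.Users
WHERE UserKey = 42;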
