Balanced Data Distributor: Improve SSIS Performance with Parallelism
The Balanced Data Distributor (BDD) is a new multithreaded data flow transform tool for SQL Server Integration Services (SSIS). It’s intended to improve performance in multi-core and multi-processor server environments by distributing data to multiple outputs.
The Balanced Data Distributor relies on parallelism to speed up data transformations, so it won’t help on single-processor configurations. (In fact, it could degrade performance versus a straight insert using a Script Component in SSIS.) Microsoft recommends using this transform only in specific circumstances. An appropriate scenario meets all of the following criteria:
- There is a large volume of data to be moved.
- The data can be read quickly (from a flat file, for example), but there is a potential bottleneck in the transformation process or the destination.
- The order of the source data does not need to be maintained (BDD distributes buffers of rows roughly evenly across its outputs, so rows can reach the destinations out of their original order).
- The destinations should all be uniform, or of the same type.
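To make the distribution idea concrete, here is a minimal conceptual sketch in Python. It is not how SSIS implements the transform: the round-robin policy, the function names (`make_buffers`, `distribute`, `load_output`, `run`), and the doubling "transformation" are all assumptions chosen for illustration. It shows whole buffers being handed to uniform outputs that run on separate threads, and why row order is not preserved.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def make_buffers(rows, buffer_size):
    """Group an iterable of rows into lists of at most buffer_size rows."""
    it = iter(rows)
    while True:
        buf = list(islice(it, buffer_size))
        if not buf:
            return
        yield buf


def distribute(buffers, n_outputs):
    """Assign whole buffers to outputs in round-robin order (assumed policy)."""
    outputs = [[] for _ in range(n_outputs)]
    for i, buf in enumerate(buffers):
        outputs[i % n_outputs].append(buf)
    return outputs


def load_output(buffers):
    """Hypothetical stand-in for a slow destination; returns the rows written."""
    written = []
    for buf in buffers:
        written.extend(row * 2 for row in buf)  # placeholder transformation
    return written


def run(rows, buffer_size=3, n_outputs=2):
    """Distribute buffers, then process each output on its own thread."""
    outputs = distribute(make_buffers(rows, buffer_size), n_outputs)
    with ThreadPoolExecutor(max_workers=n_outputs) as pool:
        return list(pool.map(load_output, outputs))


results = run(range(10))
for n, rows_written in enumerate(results):
    print(f"output {n}: {rows_written}")
```

Each output receives every second buffer, so no single output sees the rows in source order. Python threads are used here only to show the pattern; a real speedup depends on the downstream work actually running in parallel, which is the point of the multi-core requirement above.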
On his blog, Boyan Penev provides an introductory performance comparison of the BDD versus an insert using a Script Component. He saw a performance increase of 35% to 45% using a local SQL Server instance. The SQL Server Performance Team provides more information on best practices for using the Balanced Data Distributor in parallel environments.
The Balanced Data Distributor transform is currently only available for SSIS 2008.
Tyler Akidau Jan 24, 2015