Implement Parallel Execution in SSIS

SQL Server Integration Services (SSIS) allows parallel execution in two different ways. These are controlled by two properties as outlined below.

The first one is MaxConcurrentExecutables, a property of the package. It defines how many tasks (executables) can run simultaneously. It defaults to -1 which is translated to the number of processors plus 2. Please note that if your box has hyperthreading turned on, it is the logical processor rather than the physically present processor that is counted.

Example:

Suppose we have a package with 3 Data Flow Tasks. Each task has 10 flows in the form of “OLE DB Source -> SQL Server Destination”.

Set MaxConcurrentExecutables to 3, then all 3 Data Flow Tasks will run simultaneously.

Now whether all 10 flows in each individual Data Flow Task get started concurrently is a different story. This is controlled by the second property: EngineThreads.

The EngineThreads is a property of the Data Flow Task that defines how many work threads the scheduler will create and run in parallel. Its default value is 5.

Example:

Again let’s use the above example.

If we set EngineThreads to 10 on all 3 Data Flow Tasks, then all the 30 flows will start off at once.

One thing we want to be clear about EngineThreads is that it governs both source threads (for source components) and work threads (for transformation and destination components). Source threads and work threads are both engine threads created by the Data Flow’s scheduler. So in the above example, a value of 10 for Engine Threads means up to 10 source threads and 10 work threads.

Multi tasking is a double-edge sword. In SSIS, we don’t affinitize the threads that we create to any of the processors. So if the number of threads exceeds the number of available processors, you might end up hurting throughput due to an excessive amount of context switches. Be cautious!

- Runying Mao