Performance Testing the Microsoft OLE DB Provider for DB2 (Part 2)

Technorati Tags: Data Providers,DB2,Host Integration Server,Microsoft OLE DB Provider for DB2,OLEDB,Performance,SSIS,SQL Server

In my previous entry, I covered testing SQL queries using the Microsoft OLE DB Provider for DB2 using a linked server, but many customers really need to test performance over SSIS, when moving data from DB2 to SQL Server.

 

Initially I did my test ‘the hard way’, creating multiple packages with different parameters for the RowSet Cache Size setting (which greatly increases performance), then running them, and looking at the final numbers.

 

This was crude, but not automated.

 

So, I wrote a C# application which recorded the start and stop time of the package execution, and ran this for each package.

 

watch.Start(); // start watch

pkgResults = pkg.Execute(null, null, eventListener, null, null);

watch.Stop();

 

This was better, but the only problem was still having to creating multiple packages. Next, I looked at creating one package, but having multiple configuration files. Again, workable, but if you wanted to test 10 settings, that would mean 10 configuration files.

 

So I did some research (and asking other Engineers more experienced with SSIS) that it is possible to change a package on the fly (so to speak), and execute it. So I began work on that.

 

First I had to decide on what needed to be changed. I knew I wanted to try out various settings for the “Rowset Cache Size”, and I also wanted to test a different type query than in the previous test, but also limit the number of returned rows. The table I decided to use contains 8.3 million rows, and the query I decided to use against this table was this: “select * from schema.table where amount BETWEEN 5.00 AND 10.00”. This particular query would return 3,581,029 rows from the table.

 

Once I decided on what needed changing, I created a new Integration Services Project in Visual Studio (or BIDS). To make things easier (and simpler) I ran the SSIS Import Export Wizard to create the standard package I wanted to use. Before making any changes to the package, I ran the package to ensure it functioned properly. (Note: When creating the package, I set it to drop and recreate the destination tables in SQL Server)

 

Once I was sure that the package ran correctly, I had to decide on how to modify the things. After doing a bit of research, I decided to use a variable in the package to hold the query string, and to modify the actual connect string from within my visual studio application.

 

On the control flow I created a variable called “varQueryString”, with a value type of string, and set its initial value to the query above. Then on the source data flow I changed the data access mode to “SQL command from variable” and set the Variable name to point to my varQueryString.  I then tested the package again to ensure everything was correct.

 

At this point things were ready to start making creating a control application that would automate multiple tests.

 

I used Visual Studio 2008 and created a new C# console application. I then added a reference to Microsoft.SQLServer.ManagedDTS. Also, I had decided I wanted each ‘run’ to be clean, by stopping and starting the SQL and SSIS services, and was going to be logging all my results to SQL Server.

 

So, in my C# file, I setup the following using directives.

 

using System;

using System.Text;

using System.Diagnostics;

using Microsoft.SqlServer.Dts.Runtime;

using System.Data.SqlClient;

using System.ServiceProcess;

using System.Threading;

using System.Data;

 

System.Threading I added after I found the application was trying to start the packages before the SQL and SSIS services were fully running, which would cause issues. My final application sleeps in a few areas to ensure the 2 services are fully operational. This you may need to modify on your particular test machines (and may not be necessary in all cases).

 

The application itself contains 2 global variables:

 

const string pkgFullPath = @"C:\projects\SQLPKG\PerfTest.dtsx";

static int TestCounter = 0; // counter to hold total test runs

 

The 1st contains the path information to the package, the 2nd a counter which is used to hold the # of packages that have been executed.

 

The main procedure is simple:

 

static void Main(string[] args)

{

       CreateSQLPerfTable();

       for (int x = 0; x < 2; x++)

       {

              RunSSISTests();

       }

}

 

The first call to CreateSQLPerfTable () creates a table in SQL Server to hold the test results. The “for” loop will run the test twice (in this case), which would allow an average of the test runs to be gathered. The procedure “RunSSISTests()” begins by setting up some local variables, then sets up an event listener to catch any errors that may be generated when the package is run:

 

MyEventListener eventListener = new MyEventListener();

 

Next the procedure creates a timer for timing the package execution time, and loads the package in.

 

Stopwatch watch = new Stopwatch();

app = new Application();

pkg = app.LoadPackage(pkgFullPath, null);

 

After this, we need to access the source connection object in the package, which is handled as such:

// need to access the source connection object in the package

Connections myConns = pkg.Connections;

ConnectionManager connMgr = myConns["SourceConnectionOLEDB"];

 

Note: if your Source Connection is named differently, you will need to adjust the above appropriately.

 

Next, an outer loop is started, which will run 4 times, in each test modifying the package variable varQueryString. We begin by querying for 10,000 rows, increasing each run until the entire data is pulled in.

 

for (int iQuery = 1; iQuery < 5; iQuery++)

{

       switch (iQuery)

       {

              // default query string for variable varQueryString

              // select * from schema.table where amount BETWEEN 5.00 AND 10.00

              case 1: // 10,000 rows

                     QueryCommand = "select * from schema.table where amount BETWEEN 5.00 AND 10.00 FETCH FIRST 10000 ROWS ONLY";

                     break;

              case 2: // 100,000 rows

                     QueryCommand = "select * from schema.table where amount BETWEEN 5.00 AND 10.00 FETCH FIRST 100000 ROWS ONLY";

                     break;

              case 3: // 1,000,000 rows

                     QueryCommand = "select * from schema.table where amount BETWEEN 5.00 AND 10.00 FETCH FIRST 1000000 ROWS ONLY";

                     break;

              case 4: // all rows (this table has ~8.3 million rows, where clause pulls in 3581029)

                     QueryCommand = "select * from schema.table where amount BETWEEN 5.00 AND 10.00 ";

                     break;

       }

       pkg.Variables["varQueryString"].Value = QueryCommand;

 

After the package variable has been modified, the connection string needs to be changed. To get the initial value, from VS I opened up the package, went to the “SourceConnectionOLEDB” properties, and copied out the “ConnectionString”. I then cut the “Rowset Cache Size = 0” value from this, and added “Rowset Cache Size=” to the end of the string. Then, in my next ‘inner’ loop, I append the value of the rowset values I want to test. In my case, I want to test 8 separate values to see which one would give me the best performance.

 

      for (int i = 1; i < 9; i++)

       {

              // setup connection string

              switch (i)

              {

                     case 1: // rowset cache size = 0

                           rowset = 0;

                           break;

                     case 2: // rowset cache size = 30

                           rowset = 30;

                           break;

                     case 3: // rowset cache size = 50

                           rowset = 50;

                           break;

                     case 4: // rowset cache size = 100

                           rowset = 100;

                           break;

                     case 5: // rowset cache size = 150

                           rowset = 150;

       break;

                     case 6: // rowset cache size = 200

                           rowset = 200;

                           break;

                     case 7: // rowset cache size = 250

                           rowset = 250;

                           break;

                     case 8: // rowset cache size = 300

                           rowset = 300;

                           break;

              }

              connMgr.ConnectionString =

                     "Data Source=DB2400;User ID=USER;Initial Catalog=DB2400;Provider=DB2OLEDB;Persist Security Info=True;Network Address=DB2400;Package Collection=SCHEMA;Default Schema=SCHEMA;Default Qualifier=SCHEMA;DBMS Platform=DB2/AS400;Rowset Cache Size="

                     + rowset.ToString() + ";";

 

Initially, at this point, I thought everything was good. But I discovered it was not. Anyone that is familiar with editing SSIS packages knows that after opening the Source or Destination objects that in many cases saved passwords are ‘lost’. Well, this apparently happens when using code also. When I changed the ConnectionString, the password was ‘lost’, so I had to go back into the code, and hardcode the password. This could be set in a configuration file, if you wish, but for my testing purposes, I did not care if the password is hard coded.

 

              // now we need to setup the password, as this get's lost for some reason

              DtsProperties connProperties = connMgr.Properties;

              DtsProperty myconnProp = connProperties["Password"];

              myconnProp.SetValue(connMgr, "password");

 

Once the password is set, the SQL and SSIS services are stopped and then restarted. The package is ran, incrementing the testcounter value. Results are sent to the SQL Server, and the next iteration is run:

 

                  TestCounter++;

                     Console.WriteLine("Starting test #{0}", TestCounter);

                     watch.Start(); // start watch

                     pkgResults = pkg.Execute(null, null, eventListener, null, null);

                     watch.Stop();

                     Console.WriteLine("Finished Test #{0}, Query: {1}, Rowset: {2}, Duration: {3}", TestCounter, QueryCommand, rowset, watch.Elapsed.TotalMilliseconds);

                     UpdateSQLPerfTable(TestCounter, QueryCommand, rowset, (int)watch.Elapsed.TotalMilliseconds);

                     watch.Reset();

                     pkg.Dispose();

 

After the tests are complete, the following SQL Script can be used to create a pivot chart which shows the output of the test runs, averaging the duration times to get a better idea of which setting may be best in your environment:

 

-- SSISPerf Pivot report

-- ***********************

-- Builds a Pivot report

USE [DB2TestDatabases]

go

declare @comstr nvarchar(4000) -- command string

declare @success int -- variable for executing SQL commands

declare @commands nvarchar(255) --

declare @tempstr nvarchar(500)

declare command_cursor CURSOR FOR

       select distinct QueryCommand

       from [dbo].SSISPerfTrials

set @tempstr = ''

OPEN command_cursor

FETCH NEXT FROM command_cursor into @commands

WHILE @@FETCH_STATUS = 0

BEGIN

       set @tempstr = @tempstr + '[' + @commands + ']'

FETCH NEXT FROM command_cursor into @commands

       if @@FETCH_STATUS = 0

       begin

              set @tempstr = @tempstr + ', '

       end

END

CLOSE command_cursor

DEALLOCATE command_cursor

set @comstr = 'select rowset, ' + @tempstr + ' from (select rowset, QueryCommand, durationms '

set @comstr = @comstr + 'from dbo.SSISPerfTrials) AS rf PIVOT ( avg(durationms) FOR QueryCommand IN ('

set @comstr = @comstr + @tempstr + ')) AS p'

exec @success=sp_executesql @comstr

 

In my next entry, I will demonstrate a C# test application that reads data from DB2, which also records the performance data to SQL Server.

 

Below is the full C# code for this test application. It contains some additional statements not discussed above, mainly around creating and updating the SQL tables, and the stopping and starting of the services.

 

using System;

using System.Text;

using System.Diagnostics;

using Microsoft.SqlServer.Dts.Runtime;

using System.Data.SqlClient;

using System.ServiceProcess;

using System.Threading;

using System.Data;

namespace VS2008SSISPackageRun

{

       class MyEventListener : DefaultEvents

       {

              public override bool OnError(DtsObject source, int errorCode, string subComponent,

                     string description, string helpFile, int helpContext, string idofInterfaceWithError)

              {

                     // Add application-specific diagnostics here.

                     Console.WriteLine("Error in {0}/{1} : {2}", source, subComponent, description);

                     return false;

              }

       }

       class Program

       {

              // globals

              const string pkgFullPath = @"C:\projects\SSIS_EMIS_IMPORT\SSIS_EMIS_IMPORT\TableVariables.dtsx";

              static int TestCounter = 0; // counter to hold total test runs

              static void Main(string[] args)

              {

                     CreateSQLPerfTable();

                     for (int x = 0; x < 2; x++)

                     {

                           RunSSISTests();

                     }

              }

              static void CreateSQLPerfTable()

              {

                     SqlConnection mySqlConnection = new SqlConnection("Persist Security Info=True;Integrated Security=SSPI;database=DB2TestDatabases;server=(local);pooling=false");

                     SqlCommand mySqlCommand = mySqlConnection.CreateCommand();

                     mySqlConnection.Open();

                     // drop perf table

                     Console.WriteLine("Dropping SSISPerfTrials table");

                     mySqlCommand.CommandText =

                           "IF EXISTS (SELECT * FROM sys.objects " +

                           "WHERE object_id = OBJECT_ID(N'[dbo].[SSISPerfTrials]') " +

                           "AND type in (N'U')) " +

                           "DROP TABLE [dbo].[SSISPerfTrials]";

                     int result = mySqlCommand.ExecuteNonQuery();

                     Console.WriteLine("mySqlCommand.ExecuteNonQuery() = " + result);

                     // create new table

                     mySqlCommand.CommandText =

                           "CREATE TABLE dbo.SSISPerfTrials(" +

                           "TestNumber int NULL," +

                           "QueryCommand Nvarchar(2000) NULL," +

                           "rowset int NULL," +

                           "durationMs int NULL" +

                           ")";

                     Console.WriteLine("Creating dbo.SSISPerfTrials table");

                     result = mySqlCommand.ExecuteNonQuery();

                     Console.WriteLine("mySqlCommand.ExecuteNonQuery() = " + result);

                     mySqlConnection.Close();

              }

              static void UpdateSQLPerfTable(int testno, string QueryCommand, int rowset, int durationMS)

              {

                     // added pooling=false to the connection string to resolve an error condition

                     // may not be needed when SSIS takes a while, but on 'fast' imports appears to be needed

                     SqlConnection mySqlConnection = new SqlConnection("Persist Security Info=True;Integrated Security=SSPI;database=DB2TestDatabases;server=(local);pooling=false");

                     SqlCommand mySqlCommand = mySqlConnection.CreateCommand();

                     mySqlConnection.Open();

                     Console.WriteLine("Connection state is " +

                           mySqlConnection.State.ToString());

                     Console.WriteLine("inserting data into SQL");

                     mySqlCommand.CommandText = @"insert into dbo.SSISPerfTrials " +

                           "(TestNumber,QueryCommand,rowset,durationMS) values (" +

                           testno.ToString() + ",'" +

              QueryCommand + "'," +

                           rowset.ToString() + "," +

                           durationMS.ToString() + ")";

                     int result = mySqlCommand.ExecuteNonQuery();

                     Console.WriteLine("mySqlCommand.ExecuteNonQuery() = " + result);

                     mySqlConnection.Close();

              }

              static void RunSSISTests()

              {

                     bool bgood = false;

                     Package pkg;

                     Application app;

                     DTSExecResult pkgResults;

                     int rowset = 0;

                     string QueryCommand = "";

                     MyEventListener eventListener = new MyEventListener();

                     Stopwatch watch = new Stopwatch();

              app = new Application();

                     pkg = app.LoadPackage(pkgFullPath, null);

                     // need to access the source connection object in the package

                     Connections myConns = pkg.Connections;

                     ConnectionManager connMgr = myConns["SourceConnectionOLEDB"];

                     for (int iQuery = 1; iQuery < 5; iQuery++)

                     {

                           switch (iQuery)

                           {

                                  // default query string for variable varQueryString

                                  // select * from schema.table where amount BETWEEN 5.00 AND 10.00

                                  case 1: // 10,000 rows

                                         QueryCommand = "select * from schema.table where amount BETWEEN 5.00 AND 10.00 FETCH FIRST 10000 ROWS ONLY";

                                         break;

                                  case 2: // 100,000 rows

                                         QueryCommand = "select * from schema.table where amount BETWEEN 5.00 AND 10.00 FETCH FIRST 100000 ROWS ONLY";

                                  break;

                                  case 3: // 1,000,000 rows

                                         QueryCommand = "select * from schema.table where amount BETWEEN 5.00 AND 10.00 FETCH FIRST 1000000 ROWS ONLY";

                                         break;

                                  case 4: // all rows (this table has ~8.3 million rows, where clause pulls in 3581029)

                                         QueryCommand = "select * from schema.table where amount BETWEEN 5.00 AND 10.00 ";

                                         break;

                           }

                           pkg.Variables["varQueryString"].Value = QueryCommand;

                           for (int i = 1; i < 9; i++)

                           {

                                  // setup connection string

                                  switch (i)

                                  {

                                         case 1: // rowset cache size = 0

                                                rowset = 0;

                                                break;

                                         case 2: // rowset cache size = 30

                                                rowset = 30;

                                                break;

                                         case 3: // rowset cache size = 50

                                                rowset = 50;

                                                break;

                                         case 4: // rowset cache size = 100

                                                rowset = 100;

                                                break;

                                         case 5: // rowset cache size = 150

                                                rowset = 150;

                                                break;

                                         case 6: // rowset cache size = 200

                                                rowset = 200;

                                                break;

                                         case 7: // rowset cache size = 250

                           rowset = 250;

                                                break;

                                         case 8: // rowset cache size = 300

                                                rowset = 300;

                                                break;

                                  }

                                  connMgr.ConnectionString =

                                         "Data Source=DB2400;User ID=USER;Initial Catalog=DB2400;Provider=DB2OLEDB;Persist Security Info=True;Network Address=DB2400;Package Collection=SCHEMA;Default Schema=SCHEMA;Default Qualifier=SCHEMA;DBMS Platform=DB2/AS400;Rowset Cache Size="

                                         + rowset.ToString() + ";";

                                  // now we need to setup the password, as this get's lost for some reason

                                  DtsProperties connProperties = connMgr.Properties;

                                  DtsProperty myconnProp = connProperties["Password"];

                                  myconnProp.SetValue(connMgr, "password");

                                  // now stop and start the sql servers so things are 'clean'

                                  bgood = StopSql();

                                  if (bgood)

                                  {

                                         // hard coded timing delay to allow services

                                         // to clean up and stop properly

                                         Thread.Sleep(15000);

                                         bgood = StartSQL();

                                  }

                                  if (bgood)

                                  {

                                         // wait an additional 15 seconds just in case

                                         // the services have not started completly

                                         Thread.Sleep(15000);

                                         //Console.WriteLine("Starting Test {0}", pkgFullPath);

                                         TestCounter++;

                                         Console.WriteLine("Starting test #{0}", TestCounter);

                                         watch.Start(); // start watch

                                         pkgResults = pkg.Execute(null, null, eventListener, null, null);

                                         watch.Stop();

                                         Console.WriteLine("Finished Test #{0}, Query: {1}, Rowset: {2}, Duration: {3}", TestCounter, QueryCommand, rowset, watch.Elapsed.TotalMilliseconds);

                                         UpdateSQLPerfTable(TestCounter, QueryCommand, rowset, (int)watch.Elapsed.TotalMilliseconds);

                                         watch.Reset();

                                         pkg.Dispose();

                           }

                                  else

                                  {

                                         break; // exit for loop, services stop/start failed

                                  }

                                  //Console.WriteLine(pkgResults.ToString());

                           } // for loop

                     }

                     pkg.Dispose();

              }

              static bool StopSql()

              {

                     bool bgood = true;

                     ServiceController sqlsvc = new ServiceController("MSSQLSERVER");

                     ServiceController dtssvc = new ServiceController("MsDtsServer");

                     try

                     {

                           TimeSpan timeout = TimeSpan.FromMilliseconds(15000);

                           if (sqlsvc.Status.Equals(ServiceControllerStatus.Running))

                           {

              Console.WriteLine("Stopping MSSQLSERVER");

                                  sqlsvc.Stop();

                                  sqlsvc.WaitForStatus(ServiceControllerStatus.Stopped, timeout);

                           }

                           if (dtssvc.Status.Equals(ServiceControllerStatus.Running))

                           {

                                  Console.WriteLine("Stopping MsDtsServer");

                                  dtssvc.Stop();

                                  dtssvc.WaitForStatus(ServiceControllerStatus.Stopped, timeout);

                           }

                     }

                     catch (Exception err)

                     {

                           Console.WriteLine("An error occurred while trying to stop service: " + err.Message);

                           bgood = false;

                     }

                     return bgood;

              } // end StopSql()

              static bool StartSQL()

              {

                     bool bgood = true;

                     ServiceController sqlsvc = new ServiceController("MSSQLSERVER");

                     ServiceController dtssvc = new ServiceController("MsDtsServer");

                     try

                     {

                           TimeSpan timeout = TimeSpan.FromMilliseconds(15000);

                           Console.WriteLine("Starting MSSQLSERVER");

                           sqlsvc.Start();

                           sqlsvc.WaitForStatus(ServiceControllerStatus.Running, timeout);

                           Thread.Sleep(15000);

                           Console.WriteLine("Starting MsDtsServer");

                           dtssvc.Start();

                           dtssvc.WaitForStatus(ServiceControllerStatus.Running, timeout);

                     }

                     catch (Exception err)

                     {

                           Console.WriteLine("An error occurred while trying to start service: " + err.Message);

                           bgood = false;

                     }

                     return bgood;

              } // end StartSQL()

       }

}