Integration Services and Extended Events – A structured approach to optimizing database architectures.


No matter what product or software you design, it's crucial to understand architecture trade-offs. This blog post describes a structured approach to collecting database performance data so you can make informed design decisions. Traditionally it is very difficult to find an optimal architecture for a particular workload and to test different configurations in a consistent manner. This is where Integration Services comes to the rescue: it turns out that collecting performance data is pretty much the same as performing traditional ETL work.

Let's say you are looking for the optimal index and table structure strategy to maximize insert speed.

In such a case you can create an SSIS package that restores a number of different architectures and executes a set of predefined workloads against them. At the same time you collect query execution logs and parse them into SQL tables for further investigation.
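For the restore step, an Execute SQL Task can simply restore the backup that corresponds to the architecture variant under test. A minimal sketch, assuming one backup file per variant (the database name and path below are placeholders):

-- Restore the architecture variant under test; name and path are placeholders.
RESTORE DATABASE [LoadTestDB]
FROM DISK = N'C:\Temp\Backups\LoadTestDB_VariantA.bak'
WITH REPLACE, RECOVERY;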

Here is an example of such a package:

[Image: example SSIS package control flow]

Before you actually start the workload, you restore the database configuration and start your performance trace.

Here is an example trace definition; it is created once for the entire SQL Server instance. (I would recommend using the New Session wizard in SQL Server Management Studio to create your own Extended Events trace definition.)

CREATE EVENT SESSION [LoadTest] ON SERVER
ADD EVENT sqlserver.rpc_completed(SET collect_statement=(1)
    ACTION(sqlserver.transaction_id)
    WHERE ([package0].[equal_uint64]([sqlserver].[database_id],(12))))
ADD TARGET package0.etw_classic_sync_target(SET default_etw_session_logfile_path=N'C:\Temp\ETWNeu.etl',default_etw_session_logfile_size_mb=(1)),
ADD TARGET package0.event_file(SET filename=N'C:\Temp\Demo.xel'),
ADD TARGET package0.histogram(SET filtering_event_name=N'sqlserver.rpc_completed',slots=(10),source=N'duration',source_type=(0)),
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=1 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
GO
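To double-check that the session definition exists on the instance and whether it is currently running, you can query the catalog and the Extended Events DMVs (a quick sanity check, not part of the SSIS package itself):

-- A defined session appears in sys.server_event_sessions;
-- a running session additionally appears in sys.dm_xe_sessions.
SELECT s.name,
       CASE WHEN r.name IS NULL THEN 'stopped' ELSE 'running' END AS [state]
FROM sys.server_event_sessions AS s
LEFT JOIN sys.dm_xe_sessions AS r ON r.name = s.name
WHERE s.name = N'LoadTest';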

 

After the database is restored, and before you start the workload, you start the trace session:

SSIS SQL Task:

ALTER EVENT SESSION [LoadTest] ON SERVER
STATE = START;
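While the session is running you can optionally verify that events are actually arriving by peeking at the ring_buffer target defined above:

-- Sanity check: returns the ring_buffer contents as XML while the session runs.
SELECT CAST(t.target_data AS XML) AS ring_buffer_data
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
    ON t.event_session_address = s.address
WHERE s.name = N'LoadTest'
  AND t.target_name = N'ring_buffer';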

Then you execute the workload, and when it is finished you stop the trace and flush the data to the log file:

SSIS SQL Task:

ALTER EVENT SESSION [LoadTest] ON SERVER
STATE = STOP;

WAITFOR DELAY '00:00:10';

The next step is to parse the log file into a SQL table. Here is an example:

SELECT ROW_NUMBER() OVER (ORDER BY n.value('(@timestamp)[1]','datetime2')) AS EventID,
    n.value('(@name)[1]','varchar(50)') AS event_name,
    n.value('(@package)[1]','varchar(50)') AS package_name,
    n.value('(@id)[1]','int') AS id,
    n.value('(@version)[1]','int') AS version,
    DATEADD(hh,
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP),
            n.value('(@timestamp)[1]','datetime2')) AS [timestamp],
    n.value('(data[@name="error"]/value)[1]','int') AS error,
    n.value('(data[@name="severity"]/value)[1]','int') AS severity,
    n.value('(data[@name="duration"]/value)[1]','int') AS duration,
    n.value('(data[@name="cpu_time"]/value)[1]','int') AS cpu_time,
    n.value('(data[@name="statement"]/value)[1]','varchar(max)') AS statement,
    n.value('(data[@name="message"]/value)[1]','varchar(max)') AS message
FROM
(SELECT
    CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\Users\lukast\Desktop\TestExtended\SQLBatches_0_130725393063440000.xel',null,null,null)
) AS tab
CROSS APPLY event_data.nodes('event') AS q(n)
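The query above only returns the rows; to actually land them in a SQL table for later analysis you can materialize the result set with SELECT … INTO. A minimal sketch reusing the same parsing pattern; the target table name dbo.TraceResults is just an assumption:

-- Materialize the parsed events; dbo.TraceResults is a hypothetical target table.
SELECT n.value('(@timestamp)[1]','datetime2') AS [timestamp],
    n.value('(data[@name="duration"]/value)[1]','int') AS duration,
    n.value('(data[@name="statement"]/value)[1]','varchar(max)') AS statement
INTO dbo.TraceResults
FROM
(SELECT CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\Users\lukast\Desktop\TestExtended\SQLBatches_0_130725393063440000.xel',null,null,null)
) AS tab
CROSS APPLY event_data.nodes('event') AS q(n);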

 

You can also use a wildcard in the file name passed to fn_xe_file_target_read_file to return the log data from all the files in a directory.
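For example, the following reads every rollover file the session produced (assuming the event_file target path from the trace definition above):

-- The wildcard matches all rollover files of the Demo event_file target.
SELECT CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\Temp\Demo*.xel',null,null,null);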

In the end you can use Excel Power Pivot to figure out which architecture works best for your most important workload:

[Image: Power Pivot comparison of the tested architectures]
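If you prefer to stay in T-SQL, the same kind of roll-up can be computed directly. This sketch assumes the parsed events were materialized into the hypothetical dbo.TraceResults table from above; note that Extended Events reports duration in microseconds:

-- Per-statement roll-up; dbo.TraceResults is the hypothetical table from the
-- SELECT ... INTO step. Durations are microseconds.
SELECT statement,
       COUNT(*)        AS executions,
       AVG(duration)   AS avg_duration_us,
       STDEV(duration) AS stdev_duration_us
FROM dbo.TraceResults
GROUP BY statement
ORDER BY avg_duration_us DESC;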

In this real-world example we realized that migrating to In-Memory tables resulted in a much better average execution time and much more predictable insert performance (a significantly lower standard deviation).

