Reading event data 101: What’s up with the XML?

If you explored extended events to any degree you’ve already discovered that when you read the targets you get your event data back as XML. First things first, lets clear up a popular misconception; event data is not stored as XML in the targets, it’s stored as binary. The XML is generated from the binary data when you read it either from the DMV or using the file reader function. We actually return the XML data as a varchar so you need to either CAST or CONVERT it into XML to display it as such and then you’re ready to go, right? We’ll you’re only ready to go if you want to read the raw XML, let’s see how this all works The first thing we need is some data, and to get that you’ll need to configure and start an event session. For this post I’m going to track two events, module_end and sp_statement_completed, and I’m going to write to both a ring_buffer and the asynchronous_file_target so that we can examine some differences between the two targets.

Aside: Stay on Target

Extended events offers a number of different targets to hold your event data. You can find the details of these different targets in the SQL Server Extended Events Targets BOL topic. The ring buffer and file targets are the ones that collect the “raw data” of your event session so that you can see each individual event recorded by the session. The other targets do additional analysis of the event data as they collect, but I’ll leave a discussion of those to another time.

Here is the session configuration that is used for this post:

CREATE EVENT SESSION target_reading ON SERVER
ADD EVENT sqlserver.module_end (ACTION (sqlserver.sql_text)),
ADD EVENT sqlserver.sp_statement_completed (ACTION (sqlserver.sql_text))
ADD TARGET package0.ring_buffer,
ADD TARGET package0.asynchronous_file_target
    (SET filename=N'C:\Temp\target_reading.xel')
WITH (max_dispatch_latency = 1 SECONDS)
GO

-- Start the session
ALTER EVENT SESSION target_reading ON SERVER
STATE = start
GO

Since it would be nice to have some actual data to look at, we need to run a few lines of T-SQL to generate some data:

-- Test run
USE AdventureWorks2008
GO

EXEC dbo.uspGetBillOfMaterials 316, '1/1/2006'
GO

USE master
GO
-- End test run

You can get the AdventureWorks2008 database from Codeplex if you don’t have it already.

Lets take a quick look at the data that you get from this little experiment:

-- Ring buffer
SELECT CAST(t.target_data AS XML) xml_data, *
FROM sys.dm_xe_session_targets t INNER JOIN sys.dm_xe_sessions s
    ON t.event_session_address = s.address
WHERE s.name = 'target_reading' AND t.target_name = 'ring_buffer'

-- File target
SELECT CAST(event_data AS XML) xml_event_data, *
FROM sys.fn_xe_file_target_read_file('C:\Temp\target_reading*.xel', 'C:\Temp\target_reading*.xem', NULL, NULL)

The most interesting thing to notice at this point is that you get slightly different XML from the formatted output of these two targets; the ring buffer gives you a single block on XML with all the events while the file target gives one row per event collected. You can see that we get data back from the two targets in different ways:

  • The ring buffer is read with a DMV (dm_xe_session_targets) – this same DMV is used to produce output from all in-memory targets.
  • The file is read with a function (fn_xe_file_target_read_file) – this is only used to read extended events files.

You can click on the XML field that we created with the CAST statement and open the XML for viewing, but I’m guessing this is not exactly what you’re looking for. So how do you get something more “friendly” looking?

Follow the XPath

The cold, hard truth is that you’re going to have to parse that XML. As you can tell from the title of this section, I’m going to do it by writing some XPath directly in my queries but there are others, maybe even better, ways to do this. If you’re interested in another way to accomplish this I suggest you read Adam Mechanic’s post Extended Events Code Generator v1.0 that describes a way to use a managed code parser to do this. (Adam doesn’t provide his code, but he does give you a download of the assembly.)

Aside: What’s all the Static about my Dynamic XML?

Outputting XML isn’t always the most popular thing to do in SQL Server, but it offers some distinct advantages for Extended Events, specifically, flexibility. If you’ve worked with SQL Trace you are aware that it has a static schema with a set number of columns. This has presented a couple challenges for both implementation and usage of the system. First off, there are a fixed number of columns available for use, what happens in you need one more column that the system supports? (Yeah, you go an add another column, but now you have a column that is only used by one Event Class – this start getting messy.) Second is that fields have a fixed name once they are created. This is fine for columns with names such as SPID or DatabaseID that will be generally useful for most event classes but other columns are not generally useful and they end up getting “re-tasked” based on data type and you end up with columns named ObjectID that contain SQL endpoint addresses. You also end up with columns that have descriptive names such as IntegerData and IntegerData2 that leave you guessing what they contain beyond an integer.

In Extended Events we opted for a dynamic schema where each event has it’s own schema with as many or as few columns as required for the event. Additionally, the columns names are descriptive since they don’t have to be “shared” with other events. It turns out the most practical way to produce this dynamic schema is to output it in XML.

So back to the XPath…you can use the Value property to pull data out of a node, so the process is that we get a list of nodes from our event XML and then extract the specific nodes that we’re interested in seeing. The first thing that will be useful in our efforts is a list of all the columns that are available to us, which you can get with this query:

SELECT * FROM sys.dm_xe_object_columns
WHERE column_type = 'data' AND object_name IN ('module_end', 'sp_statement_completed')
ORDER BY name

You see some duplicates in this list since some columns names are shared by both events but when you see columns with identical names in two events, it typically means they contain the same data. (e.g. The database_id column will always contain a Database ID.) The process for pulling data out of the XML is similar for both targets:

  1. Get a set of Nodes that represent the events returned from the session.
  2. Iterate over those Nodes and pull out specific “column” data that is interesting.

For this example I’ve made the adjustment of pulling the data from the ring buffer into a temporary table, and then processing it from there. This is not required, I do it so that I can persist the ring buffer data after I stop the session. (The ring buffer is an in memory target so the data goes away once you stop the session.) This isn’t necessarily a “normal” use case for the ring buffer, I’m using it here just to provide an example of how to parse XML data from this target. So here is the real point (finally) of this post:

-- Store the ring_buffer data to a temp table so we can stop the session
SELECT CAST(t.target_data AS XML) xml_data, *
INTO #ring_buffer_data
FROM sys.dm_xe_session_targets t INNER JOIN sys.dm_xe_sessions s
    ON t.event_session_address = s.address
WHERE s.name = 'target_reading' AND t.target_name = 'ring_buffer'

-- Stop the session
ALTER EVENT SESSION target_reading ON SERVER
STATE = stop

-- Parse XML from ring_buffer to show column-wise output
SELECT
    event_xml.value('(./@name)', 'varchar(1000)') as event_name,
    event_xml.value('(./data[@name="source_database_id"]/value)[1]', 'int') as source_database_id,
    event_xml.value('(./data[@name="object_name"]/value)[1]', 'sysname') as object_name,
    event_xml.value('(./data[@name="object_id"]/value)[1]', 'int') as object_id,
    event_xml.value('(./data[@name="object_type"]/value)[1]', 'varchar(25)') as object_type,
    event_xml.value('(./data[@name="duration"]/value)[1]', 'bigint') as duration,
    event_xml.value('(./data[@name="cpu"]/value)[1]', 'bigint') as cpu,
    event_xml.value('(./data[@name="row_count"]/value)[1]', 'int') as row_count,
    event_xml.value('(./data[@name="reads"]/value)[1]', 'bigint') as reads,
    event_xml.value('(./data[@name="writes"]/value)[1]', 'bigint') as writes,
    event_xml.value('(./action[@name="sql_text"]/value)[1]', 'varchar(4000)') as sql_text
FROM #ring_buffer_data
    CROSS APPLY xml_data.nodes('//event') n (event_xml)

And now for the file target:

-- Parse XML from the file to show column-wise output
SELECT
    event_xml.value('(./@name)', 'varchar(1000)') as event_name,
    event_xml.value('(./data[@name="source_database_id"]/value)[1]', 'int') as source_database_id,
    event_xml.value('(./data[@name="object_name"]/value)[1]', 'sysname') as object_name,
    event_xml.value('(./data[@name="object_id"]/value)[1]', 'int') as object_id,
    event_xml.value('(./data[@name="object_type"]/value)[1]', 'varchar(25)') as object_type,
    event_xml.value('(./data[@name="duration"]/value)[1]', 'bigint') as duration,
    event_xml.value('(./data[@name="cpu"]/value)[1]', 'bigint') as cpu,
    event_xml.value('(./data[@name="row_count"]/value)[1]', 'int') as row_count,
    event_xml.value('(./data[@name="reads"]/value)[1]', 'bigint') as reads,
    event_xml.value('(./data[@name="writes"]/value)[1]', 'bigint') as writes,
    event_xml.value('(./action[@name="sql_text"]/value)[1]', 'varchar(4000)') as sql_text
FROM    (SELECT CAST(event_data AS XML) xml_event_data
             FROM sys.fn_xe_file_target_read_file('C:\Temp\target_reading*.xel', 'C:\Temp\target_reading*.xem', NULL, NULL)) AS event_table
    CROSS APPLY xml_event_data.nodes('//event') n (event_xml)

The interesting thing to notice here is the use of the “data” versus “action” in my XPath; what’s up with that? If you read the Books Online topics about extended events you’ll learn that data can come from two different places: each event has it’s unique set of columns  but you can also retrieve “global” data, which we call Actions. We’ll post more details about this in the future, but for now it’s enough to know that the columns that are part of the event are returned in the <data> node and the “global” data is returned in the <action> node. You can see from the CREATE EVENT SESSION at the beginning of this post that I added one Action to each event, the sql_text action, that returns the t-sql command that was run from the query window to cause these events to fire.

The output from either target should be the same, or at least very similar:

event_name

source_database_id

object_name

object_id

object_type

duration

cpu

row_count

reads

writes

sql_text

sp_statement_completed

5

NULL

39671189

8272

0

0

NULL

0

0

EXEC dbo.uspGetBillOfMaterials 316, '1/1/2006'

sp_statement_completed

5

NULL

39671189

8272

0

0

NULL

16

0

EXEC dbo.uspGetBillOfMaterials 316, '1/1/2006'

module_end

5

uspGetBillOfMaterials

39671189

P

1000

NULL

1

NULL

NULL

EXEC dbo.uspGetBillOfMaterials 316, '1/1/2006'

You may get additional events recorded as well, depending on what your server is doing at the time, but these are the events generated by the test code above. (In my case, AdventureWorks2008 is DatabaseID = 5 so I can filter based on that.) So what does this mean?

  • You can see that there are to occurrences of sp_statement_completed; this is because the stored procedure being called has two separate statements being executed within it. Just like SQL Trace, you will see sp_statement_completed for each t-sql statement that is executed within a stored procedured.
  • You can see that module_end occurs once for the actual execution of the stored procedure. You can even see that stored procedure being executed in the object_name field.

What’s up with the sql_text action? Good question; I’m betting you were thinking that the sql_text action associated with the sp_statement_completed would return the text of the statement from within the stored procedure that was just completed – clearly it doesn’t. As I wrote, we’ll dive into more details on Actions in a future post, so for now we’ll just accept that this is the way it’s supposed to work. (Note: if you want to get the actual statement from within the stored procedure you can use the tsql_stack action and then extract the statement from dm_exec_sql_text. I’ll leave that for a future post as well.)

Just to finish things off, you can remove the event session by dropping it:

-- Drop the session
DROP EVENT SESSION target_reading ON SERVER

These basics should get you started reading the data from your extended events sessions and exploring the system. Have fun!

- Mike