It’s been awhile since I’ve had a technically focused blog post, so I’m rectifying that today. Lately I’ve been doing a lot of work with Event Tracing for Windows (henceforth called “ETW” for brevity’s sake.)
For the spelunker who wants to see the OS in operation, or for the developer trying to pin down exactly what happened, ETW provides you a ton of useful information. Unfortunately, working with ETW isn’t as simple as it could be. There are quite a few concepts to wrap your head around, and it’s quite easy to get lost in the weeds. What I’m doing here is providing just a high level tour of the essential concepts. If you want to play with ETW yourself, you’ll obviously need to refer to the MSDN documentation.
At heart, ETW is a high performance logging mechanism that usable from both user and kernel mode. There are APIs for producing events as well as consuming events. The OS and various components such as IIS define quite a number of useful event categories, and you can create your own custom events.
The ETW API has been around for a number of years, so it’s implemented in unmanaged code. As I write this, there are no shipping managed libraries to make working with it simpler. Internally, various groups here are looking at the best way to make ETW a first class citizen in the managed world.
In my mind, ETW subdivides into three parts:
- Producing events
- Consuming events
- Discovering the format of events
For this blog entry we’ll look at the discovery portion, since this is the fastest way of seeing for yourself what sort of information ETW can provide to you. If you’re still interested, you can read future blog entries on producing and consuming events.
Let’s start with a very brief overview of events. ETW events are logged by providers which are registered with the system. Each provider has a descriptive name and is uniquely identified to the system by a GUID. Some typical provider names are:
A given provider emits one or more events. Each event has its own GUID and a descriptive name. Typical event names include:
So far, so good. But here’s where it gets interesting. In addition to a GUID, each event has an integer EventType field. A given event usually has multiple EventType fields. Think of each unique EventType field as a derivation from a base class. For instance, AspNetTraceEvent has several dozen different EventType values, including:
AspNetStart = 1
AspNetStop = 2
AspNetAppDomainEnter = 7
AspPageInit = 21
Put another way, an events GUID indicates the general category of the event (e.g., an HTTP Request), while the EventType field gives you exactly what the event represents (e.g., an app domain being entered, or a page being initialized.)
If you were to turn on the HttpEtwTrace provider and then examine the logged events, you’d see potentially hundreds of HttpRequest events for a single request for a web page. Only by examining the EventType field would you be able to infer exactly what the event represents.
Each distinct event (as identified by a GUID and EventField value) has a binary data format which can be interpreted. All events, no matter which provider they came from, begin with a standard header which includes fields like a process ID, a thread ID, and a timestamp. After the header, different events are free to put whatever additional data they’d like. For an ASP.NET event, this might include the URL being requested or a connection ID.
I admit that this is somewhat confusing at first. To make matters even more complex, different providers categorize their events in different ways. It’s a spectrum, really. On one end you might have a single event GUID with many EventTypes, on the other end you could have a different event GUID for every action you log, and effectively ignore the EventType field.
As we’ll soon see, event providers cover all parts of the spectrum. The HttpRequest provider uses only one GUID for all its events and multiple EventFields. The MSNT_SystemTrace has 18 different event GUIDs, with each GUID having roughly four EventType values. Finally, SQL Server has hundreds of event GUIDs, with each GUID using only a single EventType.
Another interesting piece of the ETW story is how you can programmatically discover the layout of the fields which follow the standard event header. That is, you can query the system for the names and data types of the optional fields in an event. To do so requires you to descend a bit into WMI land.
WMI is the acronym for Windows Management Instrumentation. WMI has fairly extensive capabilities, but of interest here is that WMI has an object hierarchy that represents many different aspects of the system. One particular branch of the WMI object hierarchy contains information about ETW events.
Nothing requires an event provider to describe its event in the WMI object hierarchy. If you’re writing an event provider and don’t care if anybody else can interpret your events, there’s no need to describe them in the WMI data. The event consumption API will hand you a pointer to the event, and it’s up to you to know how the data fields are encoded.
Let’s see how we can learn what providers and events are registered in the WMI hierarchy. The easiest way to do this is with CIM Studio, which is part of the WMI SDK. From this point forward, I’ll assume that you’ve downloaded and installed the WMI SDK.
First, start up “WMI CIM Studio”, which is hosted inside Internet Explorer. If you’re running a newer OS such as XP SP2, you may get the warning that IE has blocked active content. If so, allow IE to show the content. You should then get a dialog box prompting you to “connect to a namespace”. The default is “root\CIMV2”. You’ll need to change that to “root\wmi”, then hit “OK”.
Another dialog should appear, entitled “WMI CIM Studio Login”. I’m able to just hit “OK’, and shortly thereafter get the screen shown here:
The left side of the window has a treeview hierarchy, while the right side shows the properties of the currently selected treeview node. In the treeview, locate the top level object named “EventTrace”, and expand it:
Depending on which OS version you’re running, and which software you have installed, you’ll see any number of sub-nodes. In the screenshot above, the sub-nodes are:
Each of these sub-nodes corresponds to an event provider. With one of the providers selected, right-click in the right-hand properties pane, and select “Object Qualifiers…”
You should get a dialog like this:
The Guid field value is the event provider’s GUID that was registered with ETW. The Description field provides more info about the provider. In this case, the Description field indicates that this provider is the “Windows Kernel Trace”. You can now cancel out of that dialog.
Next, expand one of the providers. For our example, expand the MSNT_SystemTrace node. You should see something like this:
Each of these sub-nodes (e.g., EventTraceEvent, PageFault, UdpIp, etc…) is an event with a GUID associated with it. Highlight one of them (in this case, Image), right-click in the properties pane, and again select “Object Qualifiers…”. You should see something like this:
Cancel out the Object Qualifiers dialog, and then expand the “Image” node. It has one sub-node, named Image_Load. Looking at the right hand properties pane you should see something like this:
Notice at the top that there are fields named FileName, ImageBase, ImageSize and ProcessId. These are fields that will be represented in the event’s optional data that appears after the standard header.
Right click in the properties pane again, select “Object Qualifiers…”, and you should see this:
The crucial field here is the “EventType” field. In this instance, it’s 10. Thus, when you see a raw ETW event blob with the GUID specified by the Image object, and an EventType of 10, you’ll know that it has the four fields (FileName, etc..) listed above.
Another key point: In this case, the Image object had only one child. However, other events could have multiple children. For instance, if you look at the HttpRequest event from the HttpEtwTrace provider, you’d see this:
There are seven children of the HttpRequest event. If you were to select each one of them, and view the Object Qualifiers, you’d see that they all have different EventType values. For instance, HttpReceiveRequest has an EventType of 1.
To further complicate matters, one of these children might not specify an EventType directly. This is the case when the object is describing multiple events with the same data format, but with different EventTypes. For instance, select HttpSendComplete, and view it’s object qualifiers:
Notice that the EventType and EventTypeName values are arrays. Parallel arrays, to be more precise. Clicking on the “Array” button for the EventTypeName value, you should see this:
There are five separate values (end, CacheAndSend, etc…). Each of these names matches up with the entries in the EventType array:
What’s the deal here? Essentially, these arrays allow for a more compact encoding of events that share the same layout.
Whew! I’ve covered a lot of ground here, and not even in that much detail. However, you should now know enough to start exploring the ETW hierarchy that’s on your system to see what sort of tracing goodies are available to you. In future blog posts, I’ll talk more about creating ETW traces and consuming the resultant data.