File IO Captures


File IO captures help you identify inefficiencies in your title’s disk IO patterns. Captures also include an analysis that helps you create a more efficient package layout.

Initiating a File IO Capture

To start a capture, click the File IO Trace button on your device connection tab.

file_io_capture_start

When the capture starts running, you’ll see a dialog that lets you either stop or cancel the capture. File IO captures can run for relatively long periods of time. For example, it’s common to use File IO captures to profile a level load or a playthrough of a significant part of your game.

memory_stop

File IO captures are ETW-based. When you press the Stop button, the collection of ETW data will stop and PIX will open the capture.

File IO captures open to a tab called a landing page. This initial page provides a textual description of the rest of the tabs in the capture.

The first tab you’ll likely want to look at after the landing page is the Capture Summary tab.

The Capture Summary Tab

The capture summary provides some basic statistics about the files that were accessed during the capture. These statistics include the number of files, the amount of data that was accessed, the average throughput, and the disk utilization.

Full callstacks are also provided for the Top 10 redundant reads. The Top 10 list is ordered by cost, which is defined as the number of bytes read multiplied by the read count. Each entry in the callstack includes a hyperlink to the source file for that function (if PIX could find the PDB). Clicking the hyperlink opens the source file in the default editor registered for the file extension. To open the source file in Visual Studio, right-click the hyperlink and choose “Open In VS”.
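
To make the cost ordering concrete, here is a minimal Python sketch that ranks repeated reads the same way. It is not how PIX computes the list internally. It assumes a hypothetical list of (file name, offset, size) tuples taken from a capture, treats a read as redundant when the same file, offset, and size appear more than once, and takes "bytes read" to mean the size of a single read.

```python
from collections import defaultdict


def top_redundant_reads(reads, limit=10):
    """Rank repeated reads by cost = bytes read * read count.

    `reads` is a hypothetical list of (file_name, offset, size) tuples.
    """
    counts = defaultdict(int)
    for file_name, offset, size in reads:
        counts[(file_name, offset, size)] += 1

    # Keep only reads that happened more than once (assumed meaning of "redundant").
    redundant = [
        (size * count, file_name, offset, size, count)
        for (file_name, offset, size), count in counts.items()
        if count > 1
    ]
    # Highest cost first, then truncate to the requested number of entries.
    redundant.sort(key=lambda entry: entry[0], reverse=True)
    return redundant[:limit]
```

Under this definition, a large block read twice can outrank a small block read many times, so both the size and the repeat count matter when deciding what to fix first.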

file_io_summary

The File IO Tab

The File IO tab provides data on all file accesses made by your title as the capture ran. The events list has one row for every file access, ordered by the time the access was initiated. By default, the events list includes columns that describe details such as the size, offset, start time and duration of the access. Click the Counters button to change the set of columns that are displayed.

The events list also contains a Notes column that the profiler uses to point out potential inefficiencies in your file access patterns. The most common non-blank value you’ll see in the Notes column is “Reverse Seek”, which indicates that the offset of the current access is less than the offset of the previous access. In other words, you’ve read backwards on the disk.
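
The Reverse Seek rule can be sketched in a few lines of Python, which may help when checking an access log outside the tool. This is an illustration, not the profiler's actual implementation. The input format is hypothetical, and comparing each access only against the previous access to the same file is an assumption.

```python
def find_reverse_seeks(accesses):
    """Return indices of accesses whose offset is lower than the previous one.

    `accesses` is a hypothetical list of (file_name, offset) tuples in the
    order the accesses were initiated. Comparison is per file (an assumption).
    """
    last_offset = {}
    flagged = []
    for index, (file_name, offset) in enumerate(accesses):
        previous = last_offset.get(file_name)
        # A strictly smaller offset means the read moved backwards in the file.
        if previous is not None and offset < previous:
            flagged.append(index)
        last_offset[file_name] = offset
    return flagged
```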

Selecting a row in the events list populates the Callstack view with the function name, source file and line number describing the location in your title that initiated the file access.

The bottom view in this tab is a timeline that shows the set of outstanding file IO operations at any point in time. The timeline includes a graph of IO throughput and disk utilization. The timeline and the events list are synchronized. If you select an IO event in the list, the corresponding section of the timeline will be highlighted, and vice versa.

The timeline uses colors to show stages of an IO operation. Orange bars indicate the time when an IO operation is queued and purple bars indicate the time in which an operation is actually being serviced.
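
As a rough illustration of what "outstanding operations at any point in time" means, the following Python sketch computes the peak number of overlapping operations. It assumes a hypothetical list of (start time, duration) pairs like the values in the default columns, and it does not distinguish queued time from service time.

```python
def peak_outstanding(operations):
    """Return the maximum number of IO operations in flight at once.

    `operations` is a hypothetical list of (start_time, duration) pairs.
    """
    events = []
    for start, duration in operations:
        events.append((start, 1))               # operation begins
        events.append((start + duration, -1))   # operation completes

    # Sort by time; at equal timestamps completions (-1) sort before starts (+1),
    # so back-to-back operations are not counted as overlapping.
    events.sort()

    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```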

file_io_timeline

The Usage Tab

The Usage tab shows your accesses grouped by file. Presenting the data in this way makes it easy for you to see which files were read (and not read) during the capture. When compared with the files in a given section of your package layout, the data on the Usage tab can be used to adjust the files contained in that section. For example, you can use the File IO profiler to determine which files you read during startup, then compare that set with the files contained in your launch section. If there are files contained in your launch section that don’t appear in the capture, consider removing those files from the section in order to minimize the amount of data that must be installed or downloaded before your title can begin running.
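
The launch-section comparison described above amounts to a set difference. Here is a minimal Python sketch, assuming you have two hypothetical lists of paths: the files in your launch section and the files that appear in a startup capture. Treating paths as case-insensitive and normalizing slashes are assumptions about how the two lists were produced.

```python
def normalize(path):
    """Normalize separators and case so the two path lists can be compared."""
    return path.replace("/", "\\").lower()


def unused_launch_files(launch_section_files, captured_files):
    """Return launch-section files that were never read during the capture."""
    accessed = {normalize(path) for path in captured_files}
    return sorted(
        path for path in launch_section_files
        if normalize(path) not in accessed
    )
```

Files returned by this function are candidates for moving out of the launch section, not automatic removals, since a single capture may not exercise every startup path.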

The Usage tab also shows data about the accesses within each file. This data includes:

  • The percentage of each file that was read. This percentage is displayed in the % Accessed column. Large files for which only a small percentage is read are candidates for breaking into smaller files in order to reduce chunk size, if appropriate.
  • The number of accesses for each offset within the file. This data can be seen using a combination of the File Offset and Access Count columns. A high access count indicates that you’re reading the same data from the file multiple times. More efficient File IO performance can be obtained by optimizing your title to read data at a particular offset only once and caching the data in memory if possible.
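
The two metrics in the list above can be approximated from raw read records. The Python sketch below assumes a hypothetical list of (offset, size) reads for a single file. It counts each byte once when computing the percentage, and it is not necessarily how the % Accessed and Access Count columns are calculated.

```python
from collections import Counter


def percent_accessed(file_size, reads):
    """Return the percentage of unique bytes in the file touched by `reads`.

    `reads` is a hypothetical list of (offset, size) pairs for one file.
    """
    if file_size <= 0:
        return 0.0
    covered = 0
    span_start = span_end = None
    for offset, size in sorted(reads):
        start, end = offset, min(offset + size, file_size)
        if end <= start:
            continue  # empty read or read entirely past the end of the file
        if span_end is None or start > span_end:
            # Gap before this read: close the previous merged span.
            if span_end is not None:
                covered += span_end - span_start
            span_start, span_end = start, end
        else:
            # Overlapping or adjacent read: extend the current span.
            span_end = max(span_end, end)
    if span_end is not None:
        covered += span_end - span_start
    return 100.0 * covered / file_size


def access_counts(reads):
    """Count how many reads started at each offset."""
    return Counter(offset for offset, _ in reads)
```

A low percentage on a large file points at a candidate for splitting, while any offset with a count above one points at data worth caching in memory.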

file_io_usage
