LOB Performance in SqlDataReader.

I've been answering a few questions here and there about performance for pulling LOB data out of a SqlDataReader and figured a discussion of what's going on behind the scenes may be useful. This analysis is based primarily on the number of copies and movements of data, as well as a bit of code path inspection. Since this discussion is a bit in depth, here's the quick summary of my conclusions regarding performance for LOBs on TDS connection readers, for those who'd rather not worry too much about the details.

Non-SequentialAccess least data movement: GetString/GetValue for character types, GetSqlBinary for binary types, GetSqlXml for xml.  The entire LOB is in memory once (twice if you want a byte array instead of SqlBinary).

SequentialAccess least data movement:  GetChars() for character types, GetBytes() for binary types.  ExecuteXmlReader() for xml.  The entire LOB is never forced to be in memory all at once (although you can still cause full instantiation in your own code).
 

On to the details!  There are several different methods for pulling LOB data out, depending on the types:

  • GetChars(), GetString(), GetSqlChars() for character data.
  • GetBytes(), GetSqlBinary(), GetSqlBytes() for binary data.
  • GetSqlXml() for Xml data.
  • GetValue(), GetSqlValue() for all LOB types.

In addition using CommandBehavior.SequentialAccess can have a significant performance impact on LOBs and there are  different performance implications to using the context connection vs. a standard TDS connection.  I'll start with the normal (non-SequentialAccess) TDS case today.

First, packet data is read into a buffer, generally 8k in size (copy #1 of the data). This copy is in chunks, so does not  From there, the LOB data is assembled into it's prefered form (string for character data, SqlBinary for binary data and a simple caching of the binary data for Xml) and stored in a row buffer.  Note that this is copy #2, and all columns in the row are so copied during the Read() method.

From here, each call to obtain the data is a pretty straight-forward operation.

GetString() simply returns the string previously stored, so is very lightweight.

GetChars() converts the string to a character array (copy #3), then copies the portion of the value requested into the target buffer (copy #4).  Note that if you call GetChars() with many small chunks, the first copy (#3) is cached between calls, so you only get the last copy on subsequent calls.

GetSqlChars() converts the string to character array (copy #3), wraps the result in a SqlChars object and returns it.

GetSqlBinary(), GetBytes() and GetSqlBytes() parallel GetString(), GetChars() and GetSqlChars() respectively, only using SqlBinary in place of string and byte arrays instead of character arrays.

GetSqlXml() wraps the cached binary in a stream wrapper, then wraps the stream into a SqlXml object.  If you call GetReader() on the SqlXml instance, it instantiates an XmlReader directly on the wrapped stream and delivers nodes from there (copy #3, more or less).  XmlReader processing isn't a simple copy of data but it should be of similar magnitude, although since it produces individual Nodes, the full LOB shouldn't need to be in memory at one time (assuming you are dropping node references once you move on to the next one).

GetValue() maps to the same process as GetString() for character types and GetBytes() for binary types (creating the byte array for you in the latter case).  For xml, it is the equivalent to calling GetSqlXml(), followed by SqlXml.Value on the result.  The SqlXml.Value property converts the value to a string via an XmlTextWriter, which ends up adding another data copy (#4), this one with the entire LOB in memory.

GetSqlValue() maps to GetString() for characters, GetSqlBinary() for binary and GetSqlXml() for xml.

You might note that I haven't talked about GetSqlString(). This method in all cases just wraps a SqlString object around the results of GetString().  It adds a small amount of overhead, but nothing significant to most scenarios. 

So how does SequentialAccess affect things?  With this option, the Read() method only processes the TDS buffer (copy #1) until the first data column is reached.  At that point, control returns to the caller until one of the Get methods is called.  The reader will try to fetch values directly from the TDS buffer, but isn't always able to do so. Most of the Get methods have the same code paths for SequentialAccess as they did for non-SequentialAccess, with the step of storing values in the row buffer are delayed until the Get method itself, rather than occuring in the Read() method.  The exceptions are GetChars() and GetBytes().

GetChars() will copy the data from the TDS buffer to the target array.  Net result is only 2 copies, both of which are chunked (small chunks copied then the buffer re-used), so the full value need never be held in memory at one time.  Remember that you must get characters from the underlying value sequentially (the TDS buffer is overrwritten and the old data lost when a new packet is read).  So the field offset you pass in must be after the last character read in prior GetChars() calls on the column.

GetBytes() behaves the same way, except for targetting a byte array instead of a character array.

My last note today is about xml data.  Using getters, there is no way to avoid having a full copy of the data in memory at one time. However, if you use ExecuteXmlReader, the command will instantiate the reader over the incoming TDS buffers, rather than caching them.  The result is three data copies (TDS buffer, byte array, xml nodes), but all are chunked.

In the end, SequentialAccess doesn't really buy you less data movement for most common scenarios, but can avoid having entire LOB in memory at one time.  Don't bother using GetChars() and GetBytes()  in the non-SequentialAccess case, unless it simplifies your own code -- it won't save any time (and will actually cost you a bit) in that case.

In a later blog I'll follow up with some details for readers on context connections.