SYSK 258: dr[columnIndex] or dr[“ColumnName”]? The cost of field name lookup…

Are you still debating whether the performance benefits of dr[columnIndex] outweigh the benefits of readability and maintainability of using dr[“ColumnName”] instead? Then, read on…

When using SqlDataReader class, the difference between getting data by index or column name is the cost of mapping column name to column index (the GetOrdinal function call). Omitting the code relating to getting SQL statistics and validation checking, the GetOrdinal function is quite simple: if GetOrdinal() is called for the first time, then the metadata about the retrieved data set (e.g. column names, identity columns, a flag indicating whether a column is an expression and whether it is updatable, etc.) is parsed out. Following that, a map between column names and their ordinals is created and stored as a hashtable for fast lookups thereafter.

My tests on a Toshiba Tecra M5, dual 2 Ghz processor, 2 Gb Ram on Vista RTM build show that:

  • The cost of doing the mapping described above for a resultset of 25 columns is approximately 0.0294 milliseconds (29.4 microseconds). Approximately, 0.00109 milliseconds taken up by creating the actual mapping and the rest of the time taken by parsing the metadata. Keep in mind that it only needs to be done once per resultset.
  • The cost of retrieval a data element by name is approximately 0.043 milliseconds compared to 0.027 milliseconds when retrieving by index. To put it in perspective, if you had to get all data items one by one from a recordset of 25 columns and 5,000 rows (125,000 data elements), the difference between using dr[“ColumnName”] vs. dr[columnIndex] would be about 38% additional time for GetOrdinal calls (or about 2 sec on my test hardware).