SYSK 258: dr[columnIndex] or dr[“ColumnName”]? The cost of field name lookup…


Are you still debating whether the performance benefits of dr[columnIndex] outweigh the benefits of readability and maintainability of using dr[“ColumnName”] instead?  Then, read on…


 


When using SqlDataReader class, the difference between getting data by index or column name is the cost of mapping column name to column index (the GetOrdinal function call).  Omitting the code relating to getting SQL statistics and validation checking, the GetOrdinal function is quite simple: if GetOrdinal() is called for the first time, then the metadata about the retrieved data set (e.g. column names, identity columns, a flag indicating whether a column is an expression and whether it is updatable, etc.) is parsed out.  Following that, a map between column names and their ordinals is created and stored as a hashtable for fast lookups thereafter.


 


My tests on a Toshiba Tecra M5, dual 2 Ghz processor, 2 Gb Ram on Vista RTM build show that:



  • The cost of doing the mapping described above for a resultset of 25 columns is approximately 0.0294 milliseconds (29.4 microseconds).  Approximately, 0.00109 milliseconds taken up by creating the actual mapping and the rest of the time taken by parsing the metadata.  Keep in mind that it only needs to be done once per resultset.

  • The cost of retrieval a data element by name is approximately 0.043 milliseconds compared to 0.027 milliseconds when retrieving by index. To put it in perspective, if you had to get all data items one by one from a recordset of 25 columns and 5,000 rows (125,000 data elements), the difference between using dr[“ColumnName”] vs. dr[columnIndex] would be about 38% additional time for GetOrdinal calls (or about 2 sec on my test hardware).

 

Comments (8)

  1. nativecpp says:

    Thanks for the info. In fact, I went to MSDN about GetOrdinal(http://msdn2.microsoft.com/en-us/library/system.data.sqlclient.sqldatareader.getordinal.aspx),

    the remarks states:

    "GetOrdinal performs a case-sensitive lookup first. If it fails, a second, case-insensitive search occurs.

    Does that means it has two entries in the hash table for each column ?

    Thanks

  2. irenake says:

    As I understand it, it first tries to get the index from name using Hashtable (i.e. case-sensitive lookup).  If it doesn’t find it, it loops through all field names, one by one, doing string case-insensitive comparison.

  3. nativecpp says:

    I see. Any reason why not tolower to key so that it needs *NOT* loop ??

  4. irenake says:

    Because there is no guarantee that it is in the lower case. E.g. if your key is KeY1 then it’ll be stored as such in the hashtable.  If you attemp to retrieve it as KEY1 or key1, it won’t find that entry…  The only way to get it is to loop and do something like: string.Compare("KeY1", "Key1", true) == 0

  5. nativecpp says:

    What I am saying is that the key (i.e. column name) to the hash is always, say, lowercase.

    Therefore, when col[‘KEY1’] is reference, BCL convert ‘KEY1’ to lowercase and therefore would be able to locate in the hash table.

  6. Hello! Great site! I’ve found a lot information here. I don’t know how to thank you. I hope you’ll be writing more and more.

    Thank you again. Bye.

  7. Sean says:

    the difference between using dr[“ColumnName”] vs. dr[columnIndex] would be about 38% additional time for GetOrdinal calls

    Don’t you mean that the GetOrdinial calls are 38% quicker ?

    The cost of retrieval a data element by name is approximately 0.043 milliseconds compared to 0.027 milliseconds when retrieving by index.

  8. irenake says:

    GetOrdinal is an additional method that is executed to convert column name to index…  

Skip to main content