I got a follow-up question about my old post on enumerating rows in an SSIS buffer, which suggested using the following code to process rows in a custom SSIS transform:
while (buffer.NextRow())
{
    // do something with the row
}
Here is the question:
buffer.NextRow() moves the pointer forward, so following your code to the letter will make you skip the first row.
Does it? Actually, no. It behaves like many other enumeration interfaces, e.g. COM’s IEnumVARIANT and .NET’s IEnumerator, so I’ll quote the documentation for IEnumerator::MoveNext:
After an enumerator is created or after the Reset method is called, an enumerator is positioned before the first element of the collection, and the first call to the method moves the enumerator over the first element of the collection.
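The quoted behavior is easy to observe with any .NET collection. Here is a minimal sketch that uses a plain List<int> rather than an SSIS buffer; the MoveNextDemo class and its Visit helper are names made up for this illustration:

```csharp
using System;
using System.Collections.Generic;

public static class MoveNextDemo
{
    // Walks a collection using only MoveNext()/Current and returns what it visited.
    public static List<int> Visit(IEnumerable<int> source)
    {
        var visited = new List<int>();
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            // A fresh enumerator sits before the first element, so the
            // first MoveNext() call lands on element 0 and skips nothing.
            while (e.MoveNext())
            {
                visited.Add(e.Current);
            }
        }
        return visited;
    }

    public static void Main()
    {
        // Prints "10, 20, 30": the first element is included.
        Console.WriteLine(string.Join(", ", Visit(new List<int> { 10, 20, 30 })));

        // Prints "0": on an empty list the very first MoveNext() returns false.
        Console.WriteLine(Visit(new List<int>()).Count);
    }
}
```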
Can you guess why it was designed this way? Let’s think about what would happen if the enumerator were initially positioned on the first element (the first row, in the SSIS case). How would we know whether the first element exists at all? The enumerator would have to provide another property, such as EndOfCollection, making everything more complicated. Note that, as explained in my linked post, buffer.EndOfRowset() is not such an indicator. It does not tell you that the enumerator has finished enumerating rows in the current buffer; it tells you that the current buffer is the very last buffer you will receive.
With the enumerator initially positioned before the first row, you call NextRow(), which immediately returns false if the buffer is empty and otherwise positions the enumerator on the first row. So the code is correct.
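To see the same logic outside SSIS, here is a rough sketch with a stand-in buffer. FakeBuffer, its Current property, and the Process helper are hypothetical names invented for this example; they are not part of the SSIS API and only mimic the "start before the first row" contract of NextRow():

```csharp
using System;

// Hypothetical stand-in for an SSIS buffer; NOT the real PipelineBuffer class.
public class FakeBuffer
{
    private readonly string[] rows;
    private int position = -1; // starts before the first row

    public FakeBuffer(params string[] rows)
    {
        this.rows = rows;
    }

    // Advances to the next row; returns false when no row is left.
    public bool NextRow()
    {
        if (position + 1 >= rows.Length)
        {
            return false;
        }
        position++;
        return true;
    }

    // The row the buffer is currently positioned on.
    public string Current
    {
        get { return rows[position]; }
    }
}

public static class Program
{
    // Same loop shape as in the post; returns how many rows were seen.
    public static int Process(FakeBuffer buffer)
    {
        int count = 0;
        while (buffer.NextRow())
        {
            Console.WriteLine(buffer.Current);
            count++;
        }
        return count;
    }

    public static void Main()
    {
        // Prints a, b, c and then 3: the first row is not skipped.
        Console.WriteLine(Process(new FakeBuffer("a", "b", "c")));

        // Prints 0: an empty buffer never enters the loop body.
        Console.WriteLine(Process(new FakeBuffer()));
    }
}
```

One loop covers both the empty and the non-empty case, with no separate "is there a first row?" check.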
A note regarding my original post: the SSIS team found that this change caused too many problems for users, so the final release of SSIS 2008 reverted to the SSIS 2005 behavior. Interestingly, this was not a simple undo of a code change, since the data flow engine has been substantially rewritten; new code was written to simulate the old behavior. Thus the "wrong" code will keep working in 2008. I still recommend changing it as described in my previous blog post. I think my loop just looks cleaner :).