Lookup multiple rows?

Can SSIS Lookup do what this user wants it to do?

I have a problem with a lookup output, I get this warning: The Lookup transformation encountered duplicate reference key values when caching reference data. I know what it is, but I don’t like to avoid this warning, I’d like to get all the rows (two in this case) that the lookup output provides me.

Unfortunately, no – the reason is that Lookup transform is synchronous, i.e. it does not add new rows or remove rows*, it just modifies the values – i.e. it can’t produce two output rows for one input row.

It would of course be possible to make an asynchronous Lookup, or provide an option, but the current Lookup is complex enough, that I think more options would kill it 🙂

If you need this functionality, you can use Merge Join transform.

*What happens with the rows that are redirected to “not found” output in SSIS 2008? They are not deleted from the buffer (synchronous transform can’t do it), they are just marked as belonging to the other path, and the components on the main path do not see them.

Comments (3)

  1. Rafael Salas says:

    Hi Michael,

    I think it would helpful to have the ability of telling the transformation which of the duplicate rows should be chosen (first, last, any or error out). I blogged about that request here:




  2. Frank says:

    So which duplicate record does it pick?  I know people are requesting the option to choose first, last, etc… but what is happening now in the 2005 version of SSIS?

  3. michen says:

    > So which duplicate record does it pick?

    I believe it picks up the first record Lookup sees. The order is of course determined by SQL statement – i.e. it is random unless you add ORDER BY. And when you add ORDER BY, you can easily change the sort order and make it the one you want (i.e. first, last, anything else based on your criteria – just sort the rows using appropriate SQL statement).