If everything is bug free in Development, Testing, User Acceptance Testing (UAT) how can it possibly go wrong in Production? The following post looks at how that can occur… Intermittently.
Ensure that the Server Names & SQL Instance names of your Test servers have the same or more characters than your Production Servers.
Recently I found an interesting bug. An EXECUTE @sql statement died with an error when the length of the TSQL commands in @SQL string was longer than ~500 chars.
The SQL code being executed included the Server Name as well as a variety of table names. The length of the name of the Production server SQL instance was 4 characters longer than any of the testing servers. As luck would have it, on the test servers, all the SQL Statements generated worked fine. But on the Production Server those extra 4 bytes when combined with the longest table name caused the statement to die with an error. The updates to the longest tables never happened, all the other tables were changed ok, so it was weeks before anyone noticed this “partial failure”.
I learnt 2 things from this:-
- Always put brackets around @sql when using EXECUTE statement, it is just more robust.
- Often in code you have a buffer of a finite size. Even if the Production server name is 1 byte longer than your Test server that might be enough to cause your command to lose the last character in the command & cause an error you failed to discover in UAT / Testing. Interesting huh?
In case you wondering if in this robust code, this error was detected & the batch immediately aborted, rolling back the batch & rising an alert. Nope. That small part of the batch failed but the rest pushed on without those rows. I can’t help wonder how many people are out there arguing with some customer service agent & are hearing “Well, thousands of our other customers got XYZ, it is unlikely that our computer just missed out on you”. 😉
I’m sure this bite you in all sorts of ways not just TSQL Dynamic Scripts.