Have you ever experienced this error in SQL Server 2005 and under the same circumstances, but using SQL Server 2000 instead, you never “suffered” this problem?
This issue was recently raised to my attention. And its resolution I found it of general interest, so I’ve just decided to publicly explain it in this forum. I’ll try to explain what has changed in SQL Server 2005 when it comes to column binding, and will also try to explain why the changes were introduced.
First of all, let’s prepare for the laboratory for our experiments:
- Install an instance of SQL Server 2005 Relational Engine.
- In that instance, create two user databases.
CREATE DATABASE DB1
CREATE DATABASE DB2
- On each database create the following tables.
CREATE TABLE DB1.dbo.test_1(name nvarchar(20), age tinyint)
CREATE TABLE DB2.dbo.test_1(name nvarchar(20), age tinyint)
CREATE TABLE DB1.dbo.test_2(name nvarchar(20), age tinyint)
CREATE TABLE DB2.dbo.test_2(name nvarchar(20), age tinyint)
- Insert a few sample rows on each table.
INSERT INTO DB1.dbo.test_1 VALUES (‘Maty’, 2)
INSERT INTO DB1.dbo.test_1 VALUES (‘Mikel’, 4)
INSERT INTO DB1.dbo.test_1 VALUES (‘Ana’, 6)
INSERT INTO DB1.dbo.test_1 VALUES (‘Diego’, 8)
INSERT INTO DB2.dbo.test_1 VALUES (‘Maty’, 2)
INSERT INTO DB2.dbo.test_1 VALUES (‘Mikel’, 4)
INSERT INTO DB2.dbo.test_1 VALUES (‘Ana’, 6)
INSERT INTO DB2.dbo.test_1 VALUES (‘Diego’, 8)
INSERT INTO DB1.dbo.test_2 VALUES (‘Maty’, 2)
INSERT INTO DB1.dbo.test_2 VALUES (‘Mikel’, 4)
INSERT INTO DB1.dbo.test_2 VALUES (‘Ana’, 6)
INSERT INTO DB1.dbo.test_2 VALUES (‘Diego’, 8)
INSERT INTO DB2.dbo.test_2 VALUES (‘Maty’, 2)
INSERT INTO DB2.dbo.test_2 VALUES (‘Mikel’, 4)
INSERT INTO DB2.dbo.test_2 VALUES (‘Ana’, 6)
INSERT INTO DB2.dbo.test_2 VALUES (‘Diego’, 8)
- Create another table test_1 but this time in the tempdb system database.
CREATE TABLE tempdb.dbo.test_1(name nvarchar(20), age tinyint)
- And also populate it with some sample rows:
INSERT INTO tempdb.dbo.test_1 VALUES (‘Maty’, 2)
INSERT INTO tempdb.dbo.test_1 VALUES (‘Mikel’, 4)
INSERT INTO tempdb.dbo.test_1 VALUES (‘Ana’, 6)
INSERT INTO tempdb.dbo.test_1 VALUES (‘Diego’, 8)
And these, the different queries he tried to run and the results of every execution:
- This query didn’t work. It failed with error 1013
UPDATE DB1.dbo.test_1 SET age = tempdb.dbo.test_1.age FROM tempdb.dbo.test_1 WHERE DB1.dbo.test_1.name = tempdb.dbo.test_1.name
- This one, which uses an alias for one of the tables used in the DML, works fine.
UPDATE DB1.dbo.test_1 SET age = tempdb.dbo.test_1.age FROM tempdb.dbo.test_1 JOIN DB1.dbo.test_1 t2 on t2.name = tempdb.dbo.test_1.name
- This works, using an alias for the table that is stored in the user database.
UPDATE DB1.dbo.test_1 SET age = tempdb.dbo.test_1.age FROM tempdb.dbo.test_1, DB1.dbo.test_1 as DB_X WHERE DB_X.name = tempdb.dbo.test_1.name
- The following also works.
UPDATE DB1.dbo.test_1 SET age = DB2.dbo.test_1.age FROM DB2.dbo.test_1, DB1.dbo.test_1 AS t1 WHERE t1.name = DB2.dbo.test_1.name
- With two tables which are objects of the same database also works, without need for an alias.
UPDATE DB1.dbo.test_1 SET age = DB1.dbo.test_2.age FROM DB1.dbo.test_1 ,DB1.dbo.test_2 WHERE DB1.dbo.test_1.name = DB1.dbo.test_2.name
So, I began by attaching a debugger to my test instance of SQL Server and enabling the debugger to stop the execution of the debuggee when an exception was thrown (sxe eh). I ran one of the queries that produced the exception, and since I had the private symbols for sqlservr.exe, from the stack I could get the name of the function from where we were throwing the 1013. (sqlservr!CNameSpace::CheckDuplicateTables). With that, I went to read the source code of that function and found that it took into account so many details and handled so many subtle variations (like behaving differently depending on the database compatibility level for the current context) that it wasn’t possible so many subtle differences would have been the result of an accident. It definitively, couldn’t be a regression.
Therefore, I decided it was a good idea to read through the functional specifications of SQL Server 2005’s algebrizer, especially the part where it describes how column binding should work. And, as already expected it was working the way it had been designed to work, as it had been written in the functional specs.
A bit of background about the column binding algorithm as defined in the SQL 92 Standard
SQL 92 describes column binding algorithm in 6.4 , “syntax rules”, and it shows subtle differences between prefixed and non-prefixed column names. Non-prefixed column names are searched first in the nearest scope and, if not found, in the next enclosing scope, etc. Until (1) the column name is found, or (2) the scopes are exhausted, or (3) an ambiguity is found (i.e. in the scope currently being searched, there are two or more columns with the name indicated – typically, they would come from different tables). (1) means the binding succeeded, (2) and (3) indicate a failure.
Note that scopes are usually created by nested subqueries.
If a prefix is specified (say “t.c”), the standard says that we should first search for the table indicated by the prefix (“t”). The search is performed in the same bottom-up manner described earlier. If the table is found, we proceed searching for the column (“c”) in that table. Only if both the table and the column are found, the binding is successful. Usually, ambiguity is impossible in this scenario, because FROM clause issues an error if the user specifies two tables with the same exposed name, and CREATE TABLE statement does not allow columns with duplicate names.
t1 has columns a, b
t2 has columns a, c
SELECT * FROM t1 as t WHERE EXISTS (SELECT * FROM t2 as t WHERE t.b = c)
t.b will not bind successfully. Table “t” will be found in the nearest scope and bound to t2. t2 does not have a column named “b”. We will not proceed to the next scope to search for a “better” t, because the table was found in the nearest scope.
Now some details about column binding in SQL Server 2000
SQL Server 2000’s notion of a “column prefix” is different from the standard. SQL Server 2000 allows “multi-part” prefix (such as dbo.t.c1). SQL Server 2000 also differs from the standard in the way prefix is matched with a table in the FROM clause.
Standard allows multi-part table names in the FROM clause (not in a column prefix). Each table, according to the standard, has an “exposed name” which is the last part of the multi-part table name (if there is no alias), or alias name when present. Column prefix matching with a table is performed as a textual match with the exposed name.
SQL Server 2000 treatment of the column prefix is different. A prefix “p” is matched with a table as follows:
- The closest scope is searched for a table with the exposed name matching “p”. In SQL Server 2000 case, the exposed name is an alias name if present, and a full (multi-part) name if there is no alias. So, for example, to match a prefix “dbo.t”, the table must be referenced as “dbo.t” in the FROM clause without an alias. “t”, “x as t”, “db.dbo.t” will not match.
Note: as a result of a comparison algorithm bug in SQL Server 2000, dot symbols themselves have no effect on matching, so “dbo.t” will successfully match with tables “dbot”, “d.b.o.t”, etc.
- If a table is not found in the closest scope, scopes are searched outwards, applying the same matching algorithm at each level.
- If a textual match is not found, the prefix is looked up in the catalogs as an object name and dbid/objid is obtained.
- The closest scope is searched again for a non-aliased table with the same dbid/objid.
- Scopes are searched outwards, this time looking for a non-aliased table with matching dbid/objid.
In many cases, the result of the above algorithm is actually the same as the standard behavior. For example, in the following statement:
SELECT t.x FROM dbo.t
prefix “t” will be successfully bound to the table dbo.t (provided that the current user is dbo), thus creating the illusion that we only look at the “exposed name” as the standard specifies. However, some differences are visible:
(1) SELECT t.x FROM db1.dbo.t
In this case, the binding will only be successful if “db1” is the current database.
(2) SELECT * FROM t WHERE EXISTS (SELECT * FROM dbo.t WHERE t.x = 5)
Here, t.x will bind to the outer t (textual match in step 2), which is in stark contrast with the standard that states that we should bind to the nearest scope based on the exposed name match.
Duplicate tables handling in SQL Server 2000
The standard specifies that the FROM clause cannot have two tables with the same exposed name. This ensures that the table binding for a prefixed column is never ambiguous. In fact, the no-duplicates condition is even stronger than is necessary just to ensure no ambiguity. Indeed, we can even allow duplicate exposed names as long as these names are never actually referenced (say, all columns from these tables are referenced in a non-prefixed form). However, the standard (and SQL Server 2000) decided to enforce this stronger condition (probably because it makes SQL cleaner and more maintainable). Since SQL Server 2000 definition of an “exposed name” is different from the standard, the definition of a “duplicate table” is correspondingly different. Essentially, a table is a duplicate of another if its full name (or an alias, if present), used as a column prefix in the above algorithm, would match the other table.
(3) SELECT t.x FROM db1.dbo.t, db2.dbo.t
The two tables in the FROM clause are not considered duplicates, because neither their full names nor dbid/objid pairs match. To what table the column “t.x” binds actually depends on the current database.
Changes introduced in SQL Server 2005
- In SQL Server 2005, we started the process of deprecating multi-part column prefixes. However, this is only the “announcement” stage, so we still have to support them in the product. Therefore there is no algorithm change.
- For single-part column prefixes, SQL Server 2005 adds the “standard matching”, i.e. a single-part prefix will match any table with the same exposed name (standard-style – alias or last name part). This fixes the example (2) above and makes matching more robust in example (1).
- Duplicate table detection algorithm has been changed correspondingly, so that any tables with the same exposed names will be considered duplicates. Example (3) above will return error.
- Dot-related comparison bug has been fixed.
- Under dbcmptlevel 80, full backward compatibility is preserved, including the dot-related comparison bug.
So this new function, only introduced in SQL Server 2005, which I mentioned earlier, is implemented as part of SQL Server 2005’s algebrizer and it checks that a new table added to a namespace is not a duplicate of some existing table. The definition of what exactly a “duplicate” is depends on the particular prefix-matching algorithm that we are using. For the standard (exposed-name-based) prefix matching, the definition of a duplicate is clear: it’s a table with the same exposed name. However, for the SQL Server 2000-style two-pass prefix matching, the definition of a duplicate is more complicated. The goal of the duplicate checking is to prevent ambiguities, so a strict definition that tables T and P are duplicates is:
There exists a column prefix X that matches both tables T and P.
This definition is not very practical because it does not specify how to find X, so we have to come up with a particular algorithm that would be equivalent to that definition. And this is what this method implements.
- The name of the new table is used as a “prefix” to look up any duplicate table.
- Failing that, and if the new table is not aliased, and it has more than a one-part name, and the compatibility level is 90, we strip off all parts except the last one and repeat lookup. This ensures that the ambiguity does not occur when we use the standard prefix matching.
- If still found nothing, and the new table is not aliased, and it is a local table or view, the lookup is repeated using the table’s dbid/objid to form an object-style prefix.
- If a match is found on the step 3 above, we have to filter out a “false match” between a special table “inserted”/”deleted” and the original table.
Hope you find it useful understanding how column binding works.