Query performance, scalar UDFs, and predicate pushdown

Recently I had to troubleshoot a query that performed much slower than expected. The solution to that seemed sufficiently interesting to warrant a write-up, and the root cause of the problem reinforced a well-known best practice of avoiding scalar user-defined functions in set-based queries. The original query looked like this:SELECT  ‘Literal1’,         u.Col2FROM dbo.Table1 AS uINNER JOIN dbo.Table2 AS l ON u.Col2 = l.Col2                WHERE   l.Col3 = ‘Literal2’        AND        NOT EXISTS  (                    SELECT  1                     FROM dbo.Table2 AS l2                     WHERE   u.Col2 = l2.Col2                            AND                            Col3 = ‘Literal1’                    )        AND        dbo.UDF1(u.Col1) IS NOT NULL; Table1 had about 3 million…

11

Reading database transaction log with fn_dump_dblog()

While the format of SQL Server transaction log is not publicly documented, there is a number of ways to view the contents of the log. This is sometimes necessary for troubleshooting and forensic purposes. One way is the DBCC LOG command. Another is fn_dblog() table-valued function. Both are undocumented, however you can easily find unofficial…

2

Transaction count during DML statements

Recently I was troubleshooting a blocking problem where a number of sessions were blocked, waiting to acquire a page latch. While blocking was occurring, there were several rows in the output from sysprocesses that looked like this (only relevant columns are shown): spid   status     blocked open_tran waitresource  cmd     lastwaittype —— ———- ——- ——— ————- ——-…

5

Disjoint subtyping in SQL

Disjoint subtyping is a scenario that is often encountered in data modeling. In one frequently used modeling approach, an entity of a certain type is represented by a database table, and each subtype of this entity is represented by another table. Subtyping is disjoint if an instance of a type corresponds to at most one…

0

Surrogate keys in distributed databases

In this post, the term “distributed database” refers to a set of SQL Server databases, each managed by a SQL server running on a separate computer. All databases have identical schemas, and data that originates in one database is replicated to all other databases, or nodes. A common example would be a system with a…

3