Managing Schema Changes (Part 1)

Question: What are best practices for managing schema changes in SQL Server 2005?  What are the fastest, least intrusive techniques for high availability?

 

Answer: A common requirement as business requirements evolve over time is managing schema changes.  While SQL Server 2005 supports the changing of schema (e.g. adding columns and constraints, changing column data types and properties, and dropping columns and constraints), in Managing Schema Changes (Part 1) we will focus on performance and concurrency issues that should be considered for large tables in typical scenarios.  In Managing Schema Changes (Part 2) we will cover additional best practice techniques to maximize performance and concurrency.

 

Let’s assume we have a Billion row table which is quite common these days.   Schema changes have to be completed in the fastest, least intrusive manner with high availability. 

Some schema changes include implicit reads and writes.  These operations, extending the duration of the schema change transaction, can reduce concurrency on large tables by taking shared locks for reads and exclusive locks for writes.  For example, adding a CHECK constraint requires a scan to enforce the CHECK constraint.   Implicit updates are performed for schema changes involving NOT NULLS, base data type changes or column length changes.  These implicit reads and writes can be seen either via Profiler statement level tracing or by using SET STATISTICS IO ON. 

 

To trace these extra operations in Profiler, select the following

  1. TSQL event, select SQL:StmtStarting and SQL:StmtCompleted.
  2. Stored Procedure event, select SP:StmtStarting and SP:StmtCompleted

 

Adding columns – alter table MyTable add NewColumn……

#

ColumnProperty

Column Constraints

Profiler Observations (StmtStarting & StmtCompleted)

Performance & Concurrency impact

1

NULL

N/A

alter table MyTable

add MySSN numeric(9,0) NULL

Fastest.

Best performance and concurrency.

Metadata only change.

2

NULL

DEFAULT

alter table MyTable

add MyDate2 datetime NULL

default getdate()

Fastest.

Best performance and concurrency

Metadata only change.

3

NULL

CHECK

alter table MyTable

add MyPct3 numeric(4,1) NULL constraint MyChk3

check (MyPct3 between 0 and 100)

Samples stats, then reads entire table to enforce check constraint.

Faster than NOT NULL, but does do reads for check constraint.

In large tables, the read activity for the check constraint could cause blocking for writes.

4

NULL

CHECK, DEFAULT

alter table MyTable

add MyInt4 int NULL default 0

constraint myChk4

check (MyInt4 between 0 and 100)

Samples stats, then reads entire table to enforce check constraint.

Faster than NOT NULL, but does do reads for check constraint.

In large tables, the read activity for the check constraint could cause block writes.

5

NULL

DEFAULT

WITH VALUES

alter table MyTable

add MyInt5 int NULL default 0

with values

Updates entire table with default value

UPDATE [Northwind].[dbo].[MyTable]

SET [MyInt5] = DEFAULT

Slower than NULL & DEFAULT (see #2) because the WITH VALUES clause does an update of all rows in existing table.

In large tables, the batch update could cause concurrency issues .

6

NULL