“Don’t Go Changin’…”

Another of the new features that shipped with SQL Server 2008 in the June CTP is called "Change Data Capture".

This feature is designed to be used in data warehousing / ETL scenarios where a read-only snapshot of data is maintained. The feature enables users to refresh these data warehouses by applying only the incremental changes from a previous refresh, removing the need to replace the entire snapshot of data.

It is one of a broader set of change tracking features that are coming in SQL Server 2008 that will include richer features for maintaining offline data in client applications and synchronizing changes back to the server as well as richer auditing capabilities.

Using Change Data Capture (CDC) is straightforward.

First, choose a database and table for which you want to capture changes.

Use the sys.sp_cdc_enable_db_change_data_capture system stored procedure to enable the feature for your database, and then call sys.sp_cdc_enable_table_change_data_capture to enable change capture for your source table. In my case I will track the "Contacts" table in the "TestDB" database.

USE TestDB;

GO

 

EXECUTE sys.sp_cdc_enable_db_change_data_capture;

 GO

 

 EXECUTE sys.sp_cdc_enable_table_change_data_capture 
     @source_schema = N'dbo'
   , @source_name = N'Contacts'
   , @role_name = N'cdc_Admin';
 GO

 

This will create a "Capture Instance" for your source table that includes some system maintained tables in a special "cdc" schema in your database as well as several functions you can call to retrieve the changes.

The simplest approach to querying the data is to keep the "LSN" of the last change you have applied to your copy of the data, and then pass it in to ask for all changes since then. There are CDC helper functions that make this easy. For example:

DECLARE @to_lsn binary(10)

SET @to_lsn = sys.fn_cdc_get_max_lsn();
--remember this for later calls

SELECT
*
FROM cdc.fn_cdc_get_all_changes_dbo_Contacts(@from_lsn, @to_lsn,
'all');

 

In my case the results are 2 records that show the before and after of a change to the contact's phone number.

__$start_lsn __$seqval __$operation __$update_mask ID FirstName LastName Phone EMail ZIP DOB

0x0000001B0000019D0004    0x0000001B0000019D0002    4     0x20     2    John     Doe     404-123-4567    johndoe@sample.com    30341    2003-01-02 00:00:00.000

0x0000001B000001CE0004    0x0000001B000001CE0002    4     0x08     2    John     Doe     404-123-7654    johndoe@sample.com    30341    2003-01-02 00:00:00.000

 

There are lots more aspects to this feature including controlling which columns are captured, how long history is kept, and the ability to select just the "net" changes that have happened to a table (instead of all the intermediary states). The books online included with the CTP are fairly complete, so check them out.