Microsoft SQL Server Database Snapshots and Synonyms

OVERVIEW:  One common complaint about database snapshots is that it is hard to get queries and reports to switch over and start using a new snapshot as soon as it is available. There are three basic ways to accomplish this:

1. Drop the old database snapshot and create a new one with the same name. This works well only if no users are using either snapshot.

2. Change the connection string. This works pretty well and has the advantage that long-running reports can continue to run while new reports and queries get directed to the new database snapshot. But it doesn't work well if you have many places to change the connection string or have to deploy a new application version.

3. Use synonyms for the base objects to point to a database snapshot. That's what this blog is all about.

Steps to take to implement synonyms with database snapshots:

1. Create a database snapshot.

2. Create synonyms in the main database that point to the snapshot objects.

3. Sometime in the future, create another snapshot.

4. Update the synonyms in a single DDL transaction (a minimal sketch follows). Warning: blocking can occur, so read the blocking section below.
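As a minimal sketch of step 4, assuming the SourceDB_<timestamp>_Snapshot naming used later in this post, the swap for a single synonym looks something like this (the stored procedure at the end generates these statements for every synonym):

BEGIN TRANSACTION;
-- Repoint the synonym at the new snapshot; the full script repeats
-- this drop/create pair for every synonym in the reporting database.
DROP SYNONYM dbo.TableName;
CREATE SYNONYM dbo.TableName
    FOR SourceDB_<new_timestamp>_Snapshot.dbo.TableName;
COMMIT TRANSACTION;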

Important Notes:

· You may get some application timeouts if the synonym updates are blocked for too long.

· READ_UNCOMMITTED could be a problem if you implemented a scenario where a query joins a table from one snapshot to a table in a different snapshot.

· You could combine synonyms with a scalable shared database. In that scenario, multiple servers attach to SAN snapshots and synonyms point to the current SAN snapshot.

 

Potential Blocking Scenario

The DDL transaction that updates all the synonyms to point to another snapshot updates system tables. This transaction will be blocked by any readers currently accessing the synonyms, which can cause problems, and it will block any new queries from resolving the synonyms while the transaction is running. The application can therefore experience timeouts if it is blocked for too long.
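One way to keep a blocked swap from stalling new queries indefinitely is to put a lock timeout on the swap batch itself, so it fails fast and can be retried rather than queueing everyone behind it. A sketch, with an arbitrary 5-second timeout and the same placeholder names as above:

-- If the swap cannot get its locks within 5 seconds, fail (error 1222),
-- roll back, and retry later instead of blocking new readers.
SET LOCK_TIMEOUT 5000;
BEGIN TRY
    BEGIN TRANSACTION;
    DROP SYNONYM dbo.TableName;
    CREATE SYNONYM dbo.TableName
        FOR SourceDB_<new_timestamp>_Snapshot.dbo.TableName;
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    RAISERROR( 'Synonym swap blocked too long; retry later.', 16, 1 );
END CATCH;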

 

Trying to work around the blocking by creating the reader sessions with the READ_UNCOMMITTED isolation level would probably work but is not recommended. I can imagine a scenario where a query accesses a table in the new snapshot and joins to a table in the old snapshot, which is the main reason the synonyms are updated in a transaction. So the blocking is actually a good thing, as long as it doesn't cause excessive application timeouts.

 

The DDL transaction runs quickly: on my laptop, I updated 300 synonyms in about 1 second. There are certain ERP applications out there with 10,000+ tables, so this synonym idea would be a bit more painful to implement for them.

 

One clarification: queries that are already running will continue to run. New queries are blocked only while resolving the synonyms; once they have resolved the synonym names and started reading the tables, there will be no blocking.
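If you want to watch this happen, the waiting lock requests are visible in the dynamic management views while the swap runs. A quick diagnostic query along these lines (SQL Server 2005 and later):

-- Sessions currently blocked, e.g. new queries waiting on the synonym
-- swap (or the swap waiting on readers still resolving synonyms).
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,                  -- milliseconds
       t.text AS current_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text( r.sql_handle ) AS t
WHERE r.blocking_session_id <> 0;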

 

Sample Code:

To use this technique, we need to look at the original, or source, database three different ways: 1) the actual source database, which we'll call SourceDB; 2) a snapshot of the source database, which we'll call SnapshotDB; and 3) the reporting database, which we'll call ReportDB. SnapshotDB represents a point-in-time version of SourceDB and is created using SQL similar to the following:

USE SourceDB
CREATE DATABASE SourceDB_<timestamp>_Snapshot ON
( NAME = SourceDB, FILENAME =
    'C:\SourceDB\SourceDB_<timestamp>_Snapshot.ss' )
AS SNAPSHOT OF SourceDB;
GO

 

The ReportDB contains no source data, only synonyms. The synonyms point to the relevant tables and views in SnapshotDB, and are created using SQL similar to the following:

USE ReportDB
CREATE SYNONYM dbo.TableName FOR
    SourceDB_<timestamp>_Snapshot.dbo.TableName

 

With this basic setup, all report-type queries can be executed against the ReportDB, using the same schema and object names defined in SourceDB, and the returned data will be datetime-consistent within the point-in-time source database snapshot, SnapshotDB. Additional objects, such as views, stored procedures, etc., can be created in ReportDB that refer to the synonyms, and will behave as expected.
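For example, a reporting view can be layered over the synonyms just as it would be over real tables; the view and column names here are made up for illustration:

USE ReportDB
GO
-- dbo.Orders is a synonym, so this view automatically follows each
-- synonym swap to the newest snapshot.
CREATE VIEW dbo.vw_DailyOrderTotals
AS
SELECT OrderDate,
       COUNT(*) AS OrderCount,
       SUM( TotalDue ) AS TotalDue
FROM dbo.Orders
GROUP BY OrderDate;
GO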

 

In most cases, you will want to periodically drop and re-create SnapshotDB to get a more current point-in-time version of SourceDB. However, to maintain 24/7 application access without requiring the application to use different connection strings, we cannot simply drop SnapshotDB: queries against the synonyms in ReportDB would immediately fail, and dropping any database requires that there are no open connections to it. We therefore must create a second, more current SnapshotDB, transactionally drop and re-create all the synonyms so they point to the new SnapshotDB, and then drop the old SnapshotDB. There may be queries still running against the first snapshot, so you may want to comment out the DROP DATABASE command and have a separate job do it at a later time. Beware that the more database snapshots you have active, the more of a performance hit you will notice; see the whitepaper at the end.
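If you do defer the drop to a separate job, that job can verify the old snapshot really is a snapshot and that nothing is still executing in it before dropping. A sketch, assuming the naming template described below (the snapshot name shown is illustrative):

DECLARE @oldSnapshot SYSNAME, @sql NVARCHAR( MAX );
SET @oldSnapshot = N'SourceDB_Rpt_20080101_010101_ss';

IF EXISTS ( SELECT 1 FROM sys.databases
            WHERE name = @oldSnapshot
              AND source_database_id IS NOT NULL )        -- confirm it is a snapshot
   AND NOT EXISTS ( SELECT 1 FROM sys.dm_exec_requests
                    WHERE database_id = DB_ID( @oldSnapshot ) )  -- nothing running in it
BEGIN
    SET @sql = N'DROP DATABASE ' + QUOTENAME( @oldSnapshot );
    EXEC ( @sql );
END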

 

The attached stored procedure, usp_ReportingSynoSnap, implements this logic using the appropriate metadata from SourceDB. It will create in ReportDB any schemas that exist in SourceDB, creating the corresponding synonyms in the correct schema. If access to SnapshotDB is only through synonyms in ReportDB, then there will effectively be no open connections to SnapshotDB, only to ReportDB. If those connections are active, executing queries against the synonyms, they will block the stored procedure from completing until the blocking queries finish resolving the synonym names; once they have, the stored procedure will drop the old SnapshotDB.
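The schema-creation step can be sketched roughly like this; it is a simplification of what the procedure does with its own metadata queries:

-- Simplified sketch: create any SourceDB user schemas missing from
-- ReportDB so the synonyms can be created in the matching schema.
USE ReportDB;
DECLARE @schemaName SYSNAME, @sql NVARCHAR( MAX );
DECLARE schemaCur CURSOR FOR
    SELECT s.name
    FROM SourceDB.sys.schemas AS s
    WHERE s.schema_id BETWEEN 5 AND 16383      -- skip built-in and role schemas
      AND NOT EXISTS ( SELECT 1 FROM sys.schemas AS r WHERE r.name = s.name );
OPEN schemaCur;
FETCH NEXT FROM schemaCur INTO @schemaName;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'CREATE SCHEMA ' + QUOTENAME( @schemaName );
    EXEC ( @sql );       -- CREATE SCHEMA must run in its own batch
    FETCH NEXT FROM schemaCur INTO @schemaName;
END
CLOSE schemaCur;
DEALLOCATE schemaCur;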

 

The stored procedure assumes ReportDB already exists and that it contains only synonyms pointing to SnapshotDB, as it always drops all synonyms to ensure there are no orphans. The template used for naming SnapshotDB is <SourceDB>_Rpt_YYYYMMDD_HHMMSS_ss, where YYYYMMDD_HHMMSS is replaced with the actual timestamp of when the snapshot was created. Creating a snapshot requires enumerating all the logical data files of SourceDB and declaring a corresponding snapshot physical file for each. The stored procedure creates each snapshot physical file in the same location as the source physical file, named identically except for a suffix of the form .Rpt_YYYYMMDD_HHMMSS_ss (e.g., AdventureWorks.mdf.Rpt_20080101_010101_ss), where YYYYMMDD_HHMMSS is identical to the timestamp within the SnapshotDB name. Lastly, the stored procedure drops the old SnapshotDB, determining its name by finding a snapshot database matching the new snapshot's naming pattern but with the highest (in sort order) YYYYMMDD_HHMMSS. To use the stored procedure, create it in either SourceDB or ReportDB and call it with “EXEC usp_ReportingSynoSnap ‘SourceDbName’, ‘ReportDbName’”; e.g., “EXEC usp_ReportingSynoSnap ‘AdventureWorks’, ‘AdventureWorks_Reporting’” (AdventureWorks_Reporting must exist before calling). If you need to keep the old snapshot around for a while, just remove or comment out the DROP DATABASE command near the end of the script.
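The file enumeration the procedure relies on can be reproduced with a query like the one below; each returned row becomes one NAME/FILENAME pair in the generated CREATE DATABASE ... AS SNAPSHOT statement:

-- Logical data files the snapshot must declare; log files are
-- excluded because a database snapshot has no log.
SELECT mf.name          AS logical_name,
       mf.physical_name AS source_physical_file
FROM sys.master_files AS mf
WHERE mf.database_id = DB_ID( N'SourceDB' )
  AND mf.type_desc = N'ROWS';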

 

The HHMMSS suffix in the example above may be misleading: creating the snapshot may take seconds or minutes depending on what is happening in the source database, and the data in the snapshot is not current as of either the start or the end of the CREATE DATABASE command. All of this is explained in the whitepaper links at the end.

 

It should be noted that Windows will report the snapshot file size as identical to the corresponding source file size. However, the snapshot file is actually a sparse file, a feature of NTFS, so the actual size consumed on disk will initially be zero and will grow in 64 KB increments as data is changed in SourceDB and SQL Server copies the original (before the changes are applied) SourceDB pages to the snapshot (note this only happens the first time data on a source page is changed). This has two consequences: first, there is normally a slight write-performance hit when using snapshots; second, if snapshots are not dropped and lots of changes happen in the source, you may eventually consume nearly twice the physical space actually needed for the source. There are additional performance consequences to be aware of when running snapshots; see the whitepapers below for more detail.
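To see what the sparse files actually consume on disk, rather than the logical size Windows Explorer reports, you can check size_on_disk_bytes in sys.dm_io_virtual_file_stats (the snapshot name is illustrative):

-- Actual bytes used on disk by the snapshot's sparse files.
SELECT DB_NAME( vfs.database_id )        AS snapshot_db,
       mf.physical_name,
       vfs.size_on_disk_bytes / 1048576  AS size_on_disk_mb
FROM sys.dm_io_virtual_file_stats( DB_ID( N'SourceDB_Rpt_20080101_010101_ss' ), NULL ) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id;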

 

Copy the following script and run it in a query window to create the stored procedure in one of your application databases.

 

SOURCE CODE

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[usp_ReportingSynoSnap]
    @sourceDbName NVARCHAR( MAX )
    ,@reportDbName NVARCHAR( MAX )
    ,@scriptOnly BIT = 0
AS
BEGIN

    DECLARE
        @sqlStmt NVARCHAR( MAX )
        ,@dtStamp NVARCHAR( MAX )
        ,@prevRptSsName NVARCHAR( MAX )

    -- Build the YYYYMMDD_HHMMSS timestamp from an ODBC canonical (style 120) date string
    SET @dtStamp = REPLACE( REPLACE( REPLACE( CONVERT( NVARCHAR, GETDATE(), 120 ), '-', '' ), ' ', '_' ), ':', '' )

    -- Find the previous (highest-sorting) report snapshot so it can be dropped later
    SELECT @prevRptSsName = MAX( ssdb.name )
        FROM sys.databases ssdb