Understanding data security in cloned databases created using DBCC CLONEDATABASE

DBCC CLONEDATABASE feature was first introduced in SQL Server with SQL Server 2014 SP2 and was later added to SQL Server 2016 with SP1. The primary design goal for DBCC CLONEDATABASE which the SQL Product team had in mind is to provide mechanism to create fast, minimally invasive and transaction ally consistent database clones, useful for query tuning. Database schema, statistics and query store are commonly required data for query tuning or troubleshooting sub optimal query plans and plan regressions. To make database cloning fast, minimally invasive and consistent, the copying of metadata objects is performed at the storage engine layer by taking a transient snapshot of the production database. Database cloning have proved to be significantly useful in reducing the troubleshooting time for dbas, developers and Microsoft CSS by extracting only the data required for troubleshooting from the production databases. In addition, cloning a database also help minimize the risk of providing access to production databases or sharing business data directly with developers or support teams. Although user tables & indexes data is not copied in the cloned database, user data is still available and exposed in cloned database via statistics and query store. As the primary scenario for dbcc clonedatabase is troubleshooting, the default database clone contains the copy of schema, statistics and query store data from source database. Query store data is contained only in SQL Server 2016 instances provided if query store was turned ON in source database prior to running DBCC CLONEDATABASE.

Note: To copy the latest runtime statistics as part of Query Store, you need to execute sp_query_store_flush_db to flush the runtime statistics to the query store before executing DBCC CLONEDATABASE.

--Default database clone with target database clone contains the schema, statistics and query store data copy from source database DBCC CLONEDATABASE (source_database_name, target_database_name)

Another scenario where cloned databases are useful is in source depot and schema compare of production database schema with dev schema. In this scenario, only copy of production database schema is desired in the database clone to compare it with that in dev environment. For some businesses, especially in healthcare, finance, data privacy is critical and no user data (including statistics and query store) can be shared with developers, vendors or support teams. For this scenario, the following syntax introduced in SQL Server 2016 SP1 can be used to allow users to create schema only database clones with no user data.

-- Creates Schema only database clone with no user data DBCC CLONEDATABASE (source_database_name, target_database_name) WITH NO_STATISTICS, NO_QUERYSTORE

There are also scenarios where DBAs are required to share the schema and statistics database clones with developers, vendors or support teams for troubleshooting but some tables or columns within the source database contains business sensitive data (for e.g. SSN or creditcard columns) which cannot be shared with anyone. Currently, DBCC CLONEDATABASE doesn't support selectively including or excluding objects from the source database in the cloned database. If your requirement falls in this category, you can use any of the following techniques described below to protect data in cloned databases before it is shared with anyone.

Drop statistics on tables or columns containing sensitive business data in database clone

I have uploaded a TSQL stored procedure script in our Tiger Github repository which can be used to drop all the statistics from the specified table or column on the table. You can download and run the script against the cloned database. The stored procedure needs to be executed for each table or column containing sensitive data whose stats you would like to purge. The script purges the user as well as index statistics including indexes on primary constraints, however if there are any foreign key references, it should be dropped manually.

If you would like to enhance or improvise the script, feel free to send a pull request on github for the benefit of SQL Community

-- Create a database clone with no query store
DBCC CLONEDATABASE('AdventureWorks2014','AdventureWorks2014_Clone') WITH NO_QUERYSTORE

-- set the cloned database in read write mode ALTER DATABASE AdventureWorks2014_Clone SET READ_WRITE

-- create the stored procedure usp_DropTableColStatistics in cloned database USE AdventureWorks2014_Clone
GO

create procedure usp_DropTableColStatistics -- copy script from here

-- Drops all the statistics on column CardNumber on table Sales.CreditCard exec usp_DropTableColStatistics 'Sales.Creditcard','CardNumber'

-- iterate again for other tables -- If no column name is specified and only table name is specified, all the statistics on that table is dropped -- Drop all the statistics on table Sales.CreditCard

exec usp_DropTableColStatistics 'Sales.Creditcard'

-- Backup database clone with compression BACKUP DATABASE AdventureWorks2014_Clone TO DISK = 'c:\backup\clonedb.bak' WITH COMPRESSION

-- DROP CLONED Database post backup DROP DATABASE DATABASE AdventureWorks2014_Clone

Note: The newly generated database generated from DBCC CLONEDATABASE isn't supported to be used as a production database and is primarily intended for troubleshooting and diagnostic purposes. We recommend detaching the cloned database after the database is created.

Column-level Encryption

If the columns in the source database is encrypted by using column level encryption, the statistics inside the source database are also encrypted which also ensures statistics copied in cloned database is encrypted. The following script validates that behavior

USE [AdventureWorks2014]
GO

CREATE MASTER KEY ENCRYPTION BY PASSWORD='$ecretP@ssw0rd';

CREATE SYMMETRIC KEY TestSymKey
WITH ALGORITHM = TRIPLE_DES ENCRYPTION BY PASSWORD = '$ecretP@ssw0rd';

OPEN SYMMETRIC KEY TestSymKey DECRYPTION BY PASSWORD = '$ecretP@ssw0rd';

-- added a column to encrypt CreditCardNumber ALTER TABLE [Sales].[CreditCard] ADD CreditCardNumber varbinary(max)

-- Updating the new column and encrypting it by symmetric key UPDATE [Sales].[CreditCard] SET CreditCardNumber = ENCRYPTBYKEY(KEY_GUID('TestSymKey'),CardNumber)

--creating statistics on encrypted columns CREATE STATISTICS encryptedcreditcardno ON [Sales].[CreditCard](CreditCardNumber)

-- Validate if statistics are encrypted DBCC SHOW_STATISTICS("Sales.CreditCard",encryptedcreditcardno)

-- Creating a database clone with no query store DBCC CLONEDATABASE('AdventureWorks2014','AdventureWorks2014_Clone') WITH NO_QUERYSTORE

USE [AdventureWorks2014_Clone]
GO
DBCC SHOW_STATISTICS("Sales.CreditCard",encryptedcreditcardno)

In addition to encrypted columns and statistics, if there are other encrypted objects like stored procedure, function etc in the source database, it will be copied in the database clone but the execution of the stored procedure will fail since encrypted objects is not supported in database clones.

Always Encrypted Columns

DBCC CLONEDATABASE currently doesn't support Always encrypted objects. Thus, if the columns in the source database is encrypted using Always Encrypted encryption, DBCC CLONEDATABASE will exclude those objects present in the source database.

Note: There is a known issue where if the source database contains always encrypted objects, running DBCC CLONEDATABASE against the database results into a AV causing the client session to terminate. We will be fixing the issue in upcoming CUs for SQL Server 2016. The fix for the issue will avoid AV while creating database clone. by excluding the metadata and data for always encrypted objects.

Transparent Data Encryption (TDE)

If you use TDE to encrypt data at rest on the source database, DBCC CLONEDATABASE supports cloning of the source database but the cloned database is not encrypted by TDE. Thus, the backup of the cloned database will be unencrypted. If it is desired to encrypt and protect cloned database backup, you can enable TDE on cloned database before it is backed up as shown below

USE master; GO

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<UseStrongPasswordHere>';
go

CREATE CERTIFICATE MyServerCert WITH SUBJECT = 'My DEK Certificate'; go

BACKUP CERTIFICATE MyServerCert TO FILE = 'MyServerCert' WITH PRIVATE KEY ( FILE = 'SQLPrivateKeyFile',
ENCRYPTION BY PASSWORD = '*rt@40(FL&dasl1'
); GO

-- Create a database clone with no query store

DBCC CLONEDATABASE('AdventureWorks2014','AdventureWorks2014_Clone') WITH NO_QUERYSTORE

-- set the cloned database in read write mode ALTER DATABASE AdventureWorks2014_Clone SET READ_WRITE

USE AdventureWorks2014_Clone; GO

CREATE DATABASE ENCRYPTION KEY WITH ALGORITHM = AES_128 ENCRYPTION BY SERVER CERTIFICATE MyServerCert; GO

ALTER DATABASE AdventureWorks2014_Clone SET ENCRYPTION ON; GO

BACKUP DATABASE AdventureWorks2014_Clone TO DISK = 'c:\backup\clonedb.bak' WITH STATS=5
GO

-- DROP DATABASE CLONE  DROP DATABASE DATABASE AdventureWorks2014_Clone

Hope the above article helps you understand data security and provide guidance on protecting user data with databases cloned with DBCC CLONEDATABASE.

 

Parikshit Savjani
Senior PM, SQL Server Tiger Team
Twitter | LinkedIn
Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam