Reviewed by: Rajesh Setlem, Mike Weiner, Xiaochen Wu
With thanks to Joey D’Antoni (blog) for asking a great question that prompted this article.
As any experienced DBA knows, supporting a very large database (VLDB) tends to be exponentially more complex than supporting smaller databases. The definition of VLDB has been shifting in recent years, but most DBAs would agree that complexities become significant starting with database sizes in the 3-5 TB range. Obvious examples of tasks that become more challenging with a VLDB include backup/restore, consistency checking, HA/DR, and partitioning.
While these challenges are inherently present in any VLDB, the platform used for hosting the database can sometimes make them easier. In this article, we will show how using Azure as the cloud platform for SQL Server can simplify some of these common challenges.
To be clear from the start, we will be talking about databases in the 3-30 TB range running on Azure IaaS VMs. At those sizes, the traditional database operation methods tend to be inefficient, or sometimes break down completely. For smaller databases, the approaches described in this article may be unnecessary (though they would still work).
A sample VLDB
In the examples in this article, we will be using a 12 TB database, appropriately named VLDB01. The database was created on a SQL Server VM in Azure by loading data from text files in Azure Blob Storage, using PolyBase. Its files are stored directly in Azure Blob Storage, in a Premium Storage account.
The ability to store database files directly in blob storage, rather than on disk, is what enables the scenarios described in this article. Starting with SQL Server 2014, in addition to using files stored on a local disk or on a UNC path, SQL Server also supports placing database files as page blobs in Azure Blob Storage, and accessing them over an http(s) endpoint. This lets SQL Server use certain features of Azure Blob Storage that are not necessarily available in the more traditional storage subsystems. One example of this is described in Instant Log Initialization for SQL Server in Azure. This article will cover additional scenarios, with emphasis on simplifying operational tasks for a VLDB.
Database file layout for the VLDB01 database is described by the output from the following query:
SELECT name,
       physical_name,
       SUM(CAST(FILEPROPERTY(name, 'SpaceUsed') AS bigint) * 8192.0 / 1024 / 1024 / 1024 / 1024) AS space_used_TB,
       SUM(CAST((size / 128.0) AS decimal(15,2)) / 1024 / 1024) AS file_size_TB
FROM sys.database_files
GROUP BY GROUPING SETS ((name, physical_name), ());
Each data file, other than the PRIMARY filegroup data file and the log file, uses a 2 TB Premium Storage page blob, the equivalent of a P40 disk. This provides the maximum per-blob storage throughput and IOPS available in Azure Premium Storage today. The number of data files, eight for this particular database, was chosen to allow the database to grow by another ~4 TB without any changes in the storage layout. Even if the data could fit into fewer data files, using more files lets us increase the overall storage throughput and IOPS for the database (subject to VM level limits; more on this in the Limitations section below).
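As a back-of-the-envelope illustration of why more files increase aggregate throughput, the sketch below multiplies assumed per-blob limits by the file count and caps the result at the VM network bandwidth. The per-blob figures (250 MB/s and 7,500 IOPS, matching the published P40 disk tier) and the GS5 network cap are assumptions drawn from Azure documentation of the time; check current limits before relying on them.

```python
# Rough aggregate storage throughput for the eight-file layout.
# Per-blob numbers are assumptions based on the P40 disk tier.
P40_MB_PER_SEC = 250      # assumed per-blob throughput, MB/s
P40_IOPS = 7500           # assumed per-blob IOPS
DATA_FILES = 8

raw_throughput = DATA_FILES * P40_MB_PER_SEC   # MB/s across all data files
raw_iops = DATA_FILES * P40_IOPS

# The VM network cap still applies, since blob IO travels over the network.
# A GS5 advertises 20,000 Mbps, i.e. 2,500 MB/s, shared with all other traffic.
GS5_NETWORK_MB_PER_SEC = 20000 / 8

effective_throughput = min(raw_throughput, GS5_NETWORK_MB_PER_SEC)
print(raw_throughput, raw_iops, effective_throughput)
```

Under these assumptions, eight files give a theoretical 2,000 MB/s and 60,000 IOPS, comfortably inside the GS5 network budget; with fewer, larger files the per-blob limits would cap throughput sooner.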
Backup and restore
Let’s see how long it would take to back up this 12 TB database using the traditional streaming backup to URL. Since each SQL Server backup blob cannot exceed ~195 GB (see Backing up a VLDB to Azure Blob Storage for details), we have to stripe the backup over multiple blobs. In this case, we are using the maximum allowed number of stripes, 64, to ensure we do not hit the 195 GB per-stripe limit. That said, for a 12 TB database, we could use fewer stripes, assuming a reasonably high backup compression ratio. Here is the abbreviated BACKUP DATABASE command (omitting most stripes), with parameters optimized for backup to URL as described in Backing up a VLDB to Azure Blob Storage:
BACKUP DATABASE VLDB01
TO URL = 'https://storageaccount.blob.core.windows.net/backup/vldb01_stripe1.bak',
   ...
   URL = 'https://storageaccount.blob.core.windows.net/backup/vldb01_stripe64.bak'
WITH COMPRESSION, CHECKSUM, MAXTRANSFERSIZE = 4194304, BLOCKSIZE = 65536, STATS = 1;
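The stripe arithmetic above can be sketched as follows. The helper below computes the minimum stripe count for a given database size against the ~195 GB per-blob limit; the compression ratio parameter is a hypothetical input, not a measured value.

```python
import math

MAX_STRIPE_GB = 195   # approximate per-blob limit for SQL Server backup blobs
MAX_STRIPES = 64      # maximum stripes per BACKUP statement

def min_stripes(db_size_tb, compression_ratio=1.0):
    """Stripes needed so no stripe exceeds ~195 GB (compression_ratio is assumed)."""
    backup_gb = db_size_tb * 1024 / compression_ratio
    return math.ceil(backup_gb / MAX_STRIPE_GB)

print(min_stripes(12))        # uncompressed 12 TB needs all 64 stripes
print(min_stripes(12, 2.0))   # with 2x compression, 32 stripes would suffice
```

This shows why 64 stripes is a safe default here: an uncompressed 12 TB backup sits right at the limit, while a 2x compression ratio would let us halve the stripe count.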
On a GS5 VM (32 cores, 448 GB RAM, 20000 Mbps network bandwidth), this took nearly 17 hours:
...
99 percent processed.
Processed 848 pages for database 'VLDB01', file 'VLDB01' on file 1.
Processed 206278432 pages for database 'VLDB01', file 'VLDB01_File2' on file 1.
Processed 206014904 pages for database 'VLDB01', file 'VLDB01_File1' on file 1.
Processed 205881752 pages for database 'VLDB01', file 'VLDB01_File3' on file 1.
Processed 206835160 pages for database 'VLDB01', file 'VLDB01_File4' on file 1.
Processed 206409496 pages for database 'VLDB01', file 'VLDB01_File5' on file 1.
Processed 206307512 pages for database 'VLDB01', file 'VLDB01_File6' on file 1.
Processed 206213568 pages for database 'VLDB01', file 'VLDB01_File7' on file 1.
Processed 205815696 pages for database 'VLDB01', file 'VLDB01_File8' on file 1.
100 percent processed.
Processed 2 pages for database 'VLDB01', file 'VLDB01_log' on file 1.
BACKUP DATABASE successfully processed 1649757370 pages in 60288.832 seconds (213.783 MB/sec).
While for some applications this might be acceptable (after all, this is a potentially infrequent full backup; more frequent differential and transaction log backups would take less time), a restore of this database would take at least as long as the full backup took, i.e. close to a day. There are few applications that can accept a downtime that long.
Let’s see what would happen if, instead of a streaming backup, we use a file-snapshot backup, available when database files are stored directly in Azure Blob Storage:
BACKUP DATABASE VLDB01
TO URL = 'https://standardstorageaccount.blob.core.windows.net/backup/VLDB01_snapshot1.bak'
WITH FILE_SNAPSHOT;
Processed 0 pages for database 'VLDB01', file 'VLDB01' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File1' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File2' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File3' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File4' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File5' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File6' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File7' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File8' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_log' on file 1.
BACKUP DATABASE successfully processed 0 pages in 0.151 seconds (0.000 MB/sec).
This backup completed in 16 seconds! This is possible because no data movement has actually occurred, unlike in the streaming backup case. Instead, a storage snapshot was created for each database file, with SQL Server ensuring backup consistency for the entire database by briefly freezing write IO on the database. The backup file referenced in the BACKUP DATABASE statement above (VLDB01_snapshot1.bak) stores pointers to these snapshots, and other backup metadata.
We should note that the database was idle during this backup. When taking the same backup on the same GS5 VM running queries with 60% CPU utilization, and reading ~1 GB/s from database files in blob storage, the backup took 1 minute 29 seconds. The backup took 1 minute 55 seconds when a bulk load into a table was running. But even though file-snapshot backups get slower on a busy instance, they are still orders of magnitude faster than streaming backups.
Let’s try restoring this backup on another SQL Server instance on a different VM:
RESTORE DATABASE VLDB01
FROM URL = 'https://standardstorageaccount.blob.core.windows.net/backup/VLDB01_snapshot1.bak'
WITH MOVE 'VLDB01' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01.mdf',
     MOVE 'VLDB01_File1' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File1.ndf',
     MOVE 'VLDB01_File2' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File2.ndf',
     MOVE 'VLDB01_File3' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File3.ndf',
     MOVE 'VLDB01_File4' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File4.ndf',
     MOVE 'VLDB01_File5' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File5.ndf',
     MOVE 'VLDB01_File6' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File6.ndf',
     MOVE 'VLDB01_File7' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File7.ndf',
     MOVE 'VLDB01_File8' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File8.ndf',
     MOVE 'VLDB01_log' TO 'https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_log.ldf',
     RECOVERY, REPLACE;
Processed 0 pages for database 'VLDB01', file 'VLDB01' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File1' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File2' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File3' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File4' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File5' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File6' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File7' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_File8' on file 1.
Processed 0 pages for database 'VLDB01', file 'VLDB01_log' on file 1.
RESTORE DATABASE successfully processed 0 pages in 1.530 seconds (0.000 MB/sec).
This restore completed in 8 seconds on an idle instance. For restore, concurrent load does not affect timing as much as it does for backup. On a busy instance with the same CPU and storage utilization as above, restore completed in 10 seconds. During restore, no data movement is required if the same storage account is used for the restored database, even though a new set of database files (blobs) is created. Azure Blob Storage ensures that the initial content of these new blobs is identical to the content of the backup snapshots.
You might notice that the output from the backup and restore commands above looks a little unusual, with the page and throughput statistics reported as zeroes. This is because SQL Server does not have them. The backup and restore are actually performed at the Azure Blob Storage level, with SQL Server calling the Blob Storage API to create storage snapshots of all database files as part of backup, or to create blobs from snapshots as part of restore. You may also notice that the backup and restore times in the output are much shorter than the actual duration. This is because they are reported for a specific phase (data copy) of the overall backup/restore process, which for file-snapshot backup/restore is very quick compared to the overall duration of the operation. See this blog by Nacho Alonso Portillo for details.
These examples show that for a VLDB with files stored directly in Azure Blob Storage, backup/restore scenarios can be greatly simplified by using file-snapshot backups, in much the same way as SAN storage snapshots, if used correctly, simplify these scenarios for large on-premises databases.
What about another task that is challenging with a VLDB, namely consistency checking (DBCC CHECKDB)? For a multi-TB database, the traditional on-premises solutions usually involve some combination of limited (PHYSICAL_ONLY) and/or partial (at the table/filegroup level) checks. This compromise is often required to complete the check within the available maintenance window, and without causing excessive resource contention between the check and the regular database workload. Sometimes, the check is done only on a replica (if using availability groups), or on a restored copy of the database. Neither is ideal: using an AG replica increases resource utilization on the replica, potentially affecting other workloads running on the replica and AG replication itself (not to mention that corruption could exist on some replicas but not on others). Checking a copy of the database restored on a separate instance requires a lengthy (hours, days) restore process, implying that corruption on the source database could potentially occur while this restore is in progress, and remain undetected until the next check.
How does this look in Azure? We have just shown that the entire backup/restore process for a 12 TB database with files in Azure blob storage completed in a couple of minutes, even on a busy instance. This provides an elegant solution for checking consistency of a VLDB in Azure: restore a file-snapshot backup on a separate VM, and run consistency check against the restored copy of the database. This VM can be stopped most of the time to save costs, and only started periodically to run consistency checks. The size of this VM can be different from the size of the VM hosting the database, to strike the right balance between the VM cost and the duration of the check, which is determined by available compute resources. Most importantly, the checks will be done on a “fresh” copy of the database, minimizing the likelihood of corruption happening on the source database while the copy is being created.
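To get a feel for the cost side of this pattern, the sketch below models what fraction of the month the consistency-check VM actually needs to run. The check duration and frequency are illustrative assumptions, not measurements; a deallocated Azure VM does not accrue compute charges.

```python
# Rough cost model for the periodic DBCC CHECKDB VM described above.
# check_hours and checks_per_month are assumed figures for illustration.
check_hours = 8           # assumed duration of one CHECKDB run on the copy
checks_per_month = 4      # assumed weekly consistency checks
hours_per_month = 730     # average hours in a month

running_fraction = check_hours * checks_per_month / hours_per_month
print(f"check VM runs {running_fraction:.1%} of the month")
```

Under these assumptions the VM runs less than 5% of the time, so even a large VM sized to finish the check quickly adds only a small fraction of its full-time compute cost.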
Reporting and analytics
If we take a step back, we will realize that with file-snapshot backups, we effectively have a way to create a copy of the database in near real-time, within minutes or even seconds. This opens a possibility to use such a copy (or multiple copies, in a scale-out fashion) for reporting and analytics, to offload the part of these workloads that can tolerate some data latency from the primary database. Other scenarios may be possible as well, e.g. quickly creating an “as-of” copy of the VLDB for research and audit purposes.
As promising as all this may look, before everyone starts deploying VLDBs with files stored directly in blob storage, we must describe the limitations.
1. Each Premium Storage account is limited to 35 TB, and all files for a database must reside in the same storage account. Using Standard storage with its 500 TB per-account limit, while supported, is usually not recommended for performance reasons. For the restore from file-snapshot backup to be fast, the files of the restored database must reside in the same storage account as the files of the source database. Restoring to a different storage account is possible, but requires a size-of-data operation in blob storage, which for a VLDB could take hours or days. This implies that for practical purposes, there is a limited number of database copies that can be created using this approach. For example, for a 12 TB database used in the examples above, we can create at most one other copy in the same storage account via file-snapshot backup/restore, and for databases larger than ~17 TB, no additional copies may be created. Attempting to restore a file-snapshot backup when there is not enough space in the storage account results in an error:
Msg 5120, Level 16, State 152, Line 127
Unable to open the physical file "https://premiumstorageaccount.blob.core.windows.net/mssqldata/VLDB01_File3.ndf". Operating system error 1450: "1450(Insufficient system resources exist to complete the requested service.)".
Msg 3013, Level 16, State 1, Line 127
RESTORE DATABASE is terminating abnormally.
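The copy-count arithmetic behind this limitation can be sketched as a simple helper. This is a simplified model that counts only database file sizes against the 35 TB per-account limit (snapshots do not count toward it, as noted below).

```python
# How many file-snapshot restores of a database fit into one Premium
# Storage account, alongside the source database (simplified model).
ACCOUNT_LIMIT_TB = 35

def max_extra_copies(db_size_tb):
    """Extra copies that fit in the account, counting file sizes only."""
    return int(ACCOUNT_LIMIT_TB // db_size_tb) - 1

print(max_extra_copies(12))   # one extra copy of the 12 TB database fits
print(max_extra_copies(18))   # none for databases larger than ~17.5 TB
```

For the 12 TB database in this article, the source plus one restored copy consume 24 TB, leaving too little room for a second copy; past roughly half the account limit, no in-account copies are possible at all.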
The mitigating factor here is that the snapshots taken for file-snapshot backups do not count towards the 35 TB limit. As mentioned in the Scalability and Performance section of the Premium Storage documentation, the snapshot capacity of a Premium Storage account is 10 TB. The extent to which this additional space is used depends on the volume of changes in the database between each pair of consecutive file-snapshot backups. This is analogous to the way space is consumed by the sparse files used in SQL Server database snapshots (not to be confused with the Azure Blob Storage snapshots used in file-snapshot backups).
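A rough way to reason about this snapshot budget: if each backup consumes approximately the volume of data changed since the previous backup, the 10 TB snapshot capacity bounds how many backups can be retained. The change volume below is an assumed figure for illustration; real snapshot consumption depends on write patterns.

```python
# Simplified model of snapshot space consumption per file-snapshot backup.
# changed_gb_per_backup is an assumed churn figure, not a measurement.
SNAPSHOT_CAPACITY_GB = 10 * 1024   # 10 TB snapshot capacity per account

def backups_retainable(changed_gb_per_backup):
    """Backups that fit in snapshot capacity, assuming each consumes ~one delta."""
    return SNAPSHOT_CAPACITY_GB // changed_gb_per_backup

print(backups_retainable(200))   # ~51 backups at ~200 GB of churn each
```

Under this model, a database with modest churn can retain weeks of daily file-snapshot backups within the snapshot capacity, while a heavily modified database would need snapshots cleaned up more aggressively.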
2. Since we create database files directly in a storage account, we cannot take advantage of the Managed Disks capabilities. In particular, for databases using availability groups, we cannot guarantee that the different storage accounts behind each availability replica are on different storage scale units. Using the same storage scale unit for more than one replica creates a single point of failure. While the likelihood of a storage scale unit failure is not high, there have been known cases where database availability was negatively affected by this, before Managed Disks became available to avoid the issue.
3. Because database files are accessed over http, disk IO becomes a part of network traffic for all intents and purposes. This means that unlike with Azure VM disks, there is no local SSD cache to potentially improve storage performance. Additionally, the same network bandwidth is shared between the VM network traffic, and the database storage traffic, which can affect network intensive applications. Further, the VM limits on disk throughput and IOPS that are published in Azure VM documentation do not apply to SQL Server database files stored directly in blob storage, though the network bandwidth limits are still applicable.
4. There are additional considerations and limitations, documented in File-Snapshot Backups for Database Files in Azure.
That said, even with these limitations, managing a SQL Server VLDB in Azure can be greatly simplified if files are stored directly in blob storage, allowing the use of techniques described in this article. We strongly recommend considering this option for your next project involving a SQL Server VLDB in Azure.