Connecting a Linked Server to Azure SQL Data Warehouse

With such a diversity of components to a data workloads, it is common for customers to use SQL Server linked servers to connect to their Azure SQL Data Warehouse. Setting this up and executing queries from SQL Server to SQL DW is pretty straight forward. I'm using SQL Server 2016 RC2 (get it here) for this sample but this has been tried on SQL Server 2012 and 2014 successfully. 

First, the limitations section:

  1. SQL Data Warehouse cannot be used to make an outgoing linked server connection

  2. SQL statements must be submitted using the linked server EXECUTE statement. Using the EXECUTE statement avoids using four-part names for objects, which is not supported by SQL DW. For example:

    Use this: 
    EXEC ( 'INSERT DemoDW.dbo.DimCustomers (CustomerId, Name) VALUES (4, ''Matt'')' ) AT [CloudDW];

    Don’t use this: 
    INSERT [CloudDw].DemoDW.dbo.DimCustomers (CustomerId, Name) VALUES (4, 'Matt')
     

  3. Other linked server functionality is not supported. For more information about using linked servers see Linking Servers on MSDN.

  4. The linked server provider must be run using the AllowInProcess option. The AllowInProcess option can be set in Management Studio by using the Properties dialog box for the provider.

AllowInProcess Setting

The first step is to ensure the AllowInProcess setting on the SQL Native Client library is set to true. This setting allows the provider to be instantiated as an in-process server. This is the default setting in SQL Server however you can enable it by running the following command:

 USE [master];
GO

EXEC master.dbo.sp_MSset_oledb_prop N'SQLNCLI11', N'AllowInProcess', 1;
GO

Creating the Linked Server

Next, we'll create the actual linked server. There are a set of options that are specific to the configuration to enable SQL Server to connect with SQL Data Warehouse. Below are the statements you'll execute to configure this correctly (again run from your SQL Server VM). Please note I have variables in place for the <server>, <database>, user (########) and password (########) that you will need to replace. You can also change the remote server name (I’ve used CLOUDDW) to a value of your choice. You can read about these options in the Linked Server Properties pages (here). 

 USE [master];
GO

EXEC master.dbo.sp_addlinkedserver @server = N'CLOUDDW', @srvproduct=N'SQLDW', @provider=N'SQLNCLI11', @datasrc=N'<server>.database.windows.net', @provstr=N'Server=<server>.database.windows.net;Database=<database>;Pooling=False', @catalog=N'<database>';
EXEC master.dbo.sp_addlinkedsrvlogin @rmtsrvname=N'CLOUDDW',@useself=N'False',@locallogin=NULL,@rmtuser=N'########',@rmtpassword='########';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'collation compatible', @optvalue=N'false';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'data access', @optvalue=N'true';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'dist', @optvalue=N'false';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'pub', @optvalue=N'false';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'rpc', @optvalue=N'true';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'rpc out', @optvalue=N'true';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'sub', @optvalue=N'false';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'connect timeout', @optvalue=N'0';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'collation name', @optvalue=NULL;
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'lazy schema validation', @optvalue=N'true';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'query timeout', @optvalue=N'0';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'use remote collation', @optvalue=N'true';
EXEC master.dbo.sp_serveroption @server=N'CLOUDDW', @optname=N'remote proc transaction promotion', @optvalue=N'false';
GO

Executing Queries

Now that you've established connectivity, you can simply run some sample statements to verify connectivity. I've copied a couple of different variants below for you to try.

Note: If you change the remote server name above (from CLOUDDW to something else), you would need to change the value in the statement below to match.

 EXEC ( 'INSERT DemoDW.dbo.DimCustomers (CustomerId, Name) VALUES (1, ''Matt'')' ) AT [CloudDW];

 EXEC ( 'UPDATE DemoDW.dbo.DimCustomers SET Name = ''Matt Usher'' WHERE CustomerId = 1' ) AT [CloudDW];

 EXEC ( 'SELECT * FROM DemoDW.dbo.DimCustomers' ) AT [CloudDW];

 EXEC ( 'DELETE DemoDW.dbo.DimCustomers WHERE CustomerId = 1' ) AT [CloudDW];

Note: You can also use the OpenQuery syntax when executing queries (Thanks Sid!).

 SELECT
 *
FROM OPENQUERY(CloudDW,'SELECT * from DemoDW.dbo.DimCustomers');

Next Steps

Visit the SQL Data Warehouse Overview to learn more about Microsoft's scale out relational data warehouse.