SQL Server has a few mechanisms to reach out to another server (even another server type) and query data from within a Transact-SQL statement. Among them are a set of stored credentials and information (called a Linked Server), a statement that uses a linked server called called OPENQUERY, another called OPENROWSET, and one called OPENDATASOURCE. This post isn’t about those particular functions or statements – hit the links for more if you’re new to those topics.
I’m actually more concerned about where I see these used than the particular method. In many cases, a Linked server isn’t another Relational Database Management System (RDMBS) like Oracle or DB2 (which is possible with a linked server), but another SQL Server. My concern is that linked servers are the new Data Transformation Services (DTS) from SQL Server 2000 – something that was designed for one purpose but which is being morphed into something much more.
In the case of DTS, most of us turned that feature into a full-fledged job system. What was designed as a simple data import and export system has been pressed into service doing logic, routing and timing. And of course we all know how painful it was to move off of a complex DTS system onto SQL Server Integration Services.
In the case of linked servers, what should be used as a method of running a simple query or two on another server where you have occasional connection or need a quick import of a small data set is morphing into a full federation strategy. In some cases I’ve seen a complex web of linked servers, and when credentials, names or anything else changes there are huge problems.
Now don’t get me wrong – linked servers and other forms of distributing queries is a fantastic set of tools that we have to move data around. I’m just saying that when you start having lots of workarounds and when things get really complicated, you might want to step back a little and ask if there’s a better way. Are you able to tolerate some latency? Perhaps you’re able to use Service Broker. Would you like to be platform-independent on the data source? Perhaps a middle-tier might make more sense, abstracting the queries there and sending them to the proper server. Designed properly, I’ve seen these systems scale further and be more resilient than loading up on linked servers.