Best Performer: Distributed query (Four-part) or OPENQUERY when executing linked server queries in SQL Server

I wanted to share this information to all those who are searching for better performer when it comes to distributed query vs openquery for running linked server queries.

Distributed Query : Linked server four part queries are also called distributed queries. Using distributed queries,  you can refer tables on different data sources/servers in a single query. Query optimizer creates an execution plan by looking at the query nomenclature and breaks it into remote and local queries. Local queries are executed locally and data for remote queries are collected from the remote servers, scrubbed locally, combined together and presented to end user as single record set. 

OpenQuery : Executes the specified pass-through query on the specified linked server. SQL Server sends pass-through queries as un-interpreted query strings to an OLE DB data source . That is, SQL won’t apply any kind of logic on the query and won’t try to estimate what that query would do, it would simply pass the specified query as it is to the target linked server. Open queries are useful when you are not referencing multiple servers in one query. It’s generally fast as SQL does not  break it into multiple operations and does not perform any local action on the output received.

 So which is faster Distributed Query or Open query and why?

The answer is, generally, OPENQUERY would be faster but distributed queries could be as fast too.

For example lets’ say I have linked server between two SQL instances SQL1 and SQL2. And I have to do select count(*) on emp table in test database on remote server SQL2.

Distributed query would be something like SELECT count(*) FROM [SQL2].[test].[dbo].[emp]

OPENQUERY would be  SELECT * from OPENQUERY([SQL2], 'SELECT count(*) FROM [test].[dbo].[emp]')

If you look at the execution plan by running SET STATISTICS PROFILE ON, you can see that for executing distributed query, SQL1 sends a request to SQL2 to send the statistics information for table emp in the database test. Please note that the user account running this distributed query must have certain permissions in the remote server as documented in to be able to gather data distribution statistics from the remote server else SQL Server might generate less efficient query plan and you will experience poor performance. UPDATE: This is going to be fixed in SQL Server 2012 so that if you have SELECT permission over columns of the index/statistics AND have SELECT permission over all columns of the FILTER (WHERE/HAVING) clause in the target server, you will be able to get the histogram. Important: The target or remote server has to have the SQL Server 2012 version with fix applied.

We have seen issues where too many connections running distributed queries end up in a SOSHOST_MUTEX wait while SQL Server is collecting data distribution statistics from the remote server. Also it has to be noted that a single query makes connection atleast two times to the remote server in case of distributed query, first connection to gather statistics and second connection to collect the actual data in the table.

Another disadvantage in case of distributed query is that though you have a WHERE clause in your query, you might notice that when the query is sent to retrieve the rows of a table in the remote server, SQL Server will just send a SELECT * FROM the remote table and then locally it filters out the necessary data after applying the predicates.

But in OPENQUERY, SQL Server sends the complete query to remote server SQL2 and resources of the SQL2 is spent in processing the query like parsing the SQL statements, generating a plan, filtering the rows as per predicates. Then the final resultset is sent to SQL1 which then just displayes what it received from SQL2.

Now, you tell me which one is better?

Further Reads:

Distributed Queries -


Guidelines for Using Distributed Queries -

How to pass a variable to a linked server query - (Community request for permissions required on the remote server)

Comments (6)
  1. Deepak says:

    Hi Sakthivel,

    i'm using Linked server four part queries , i want check whether the query process collected distribution statistics from the remote server or not ?. how to check this?.



  2. Deepak,

    Do you mean you want to check whether the source linked server account has permission to read distribution statistics from remote server?

  3. Sleeper says:

    Hi Sakthivel

    This articles about openquery really helps me develope my project.

    Thanks for your information.

  4. Deepak says:

    Hi Sakthivel,

    Thanks for your reply, im using sa login for linked server, what i want to know is some times my query is performing poorly (i.e, in case of  Linked server four part queries ).  

    Have often to join or query data from one or more server using a Linked Server, I need to know whats the difference

    between using an Open Query or an Exec at or four part Linked server also called Distributed query

    Which of the below three will give best performance and what are the prons and cons

    > Open Qyuery

    > Linked server four part queries are also called distributed queries

    > Exec () At

    Generally, the statements should be returning one or zero rows from the remote database instead of that SQL Server estimates that million plus rows will satisfy the query.As a result the remote query is not properly parameterized, too many rows are requested and it takes a few minutes to retrieve all data.

    so can you please suggest how to avoid the heavy remote copy and parameterization problem when we are using distributed queries.



  5. Lakshmi says:

    This article is good.  Thanks.

  6. Tim Curtin says:

    For best performance, built the complete query as dynamic SQL and execute the remote query with the dynamically built filter in the SQL.  You have to double-escape quotes if quotes are required in the filtered varchar value.

    Typically the hard SQL would look like 'select * from tbl where key=''varcharVal''.

    So, you have to double-escape them in the dynamic SQL. This performs wonderfully as the SQL engine passes the SQL to the remote engine for parsing, compilation and execution. I did this all day long between SQL and AS400 and had very acceptable performance on large tables (1M+ rows). Of course the SQL must be in the syntax of the remote db engine. SQL doesn't parse it locally, but only ships it to the remote server.



    SET @ID = SELECT CAST(ID AS VARCHAR(10)) FROM TABLE WHERE (some where clause).

    SET @SQL='SELECT Field1, Field2 FROM RemoteTable WHERE KEYField=''''' + @ID + '''''

    SET @ SQL= 'SELECT *

               FROM LocalTable l

                  INNER JOIN (SELECT * FROM OPENQUERY(RemoteTable,' + CHAR(9) + @SQL + CHAR(9) + ')) as a ON l.key=a.Field1'

    Print @SQL



    Should look like 'SELECT * FROM LocalTable INNER JOIN(SELECT * FROM OPENQUERY(RemoteTable,'SELECT Field1, Field2 FROM RemoteTable WHERE KEYField=''test''')) as a ON l.key=a.Field1'




    Caveat: I didn't test this example and just worked from memory. The number of quotes/parenthesis must be tested. So, view/debug the SQL as above. If it looks good, copy/paste it into another query window and execute it. If no error, then remark out the print/return statements and off you go. I'm pretty close though. 😉


Comments are closed.

Skip to main content