Troubleshooting LogReader Timeout executing sp_replcmds


Chris Skorlinski
Microsoft SQL Server Escalation Services

SQL Server Transaction LogReader Agent timeout executing sp_replcmds occurs most often when a large transactions has been written to the Published database transaction log, or there are a large number of un-replicated commands.  Either problem requires the LogReader to read more log entries than the timeout parameter allows causing the “query timeout expired” while executing sp_replcmds.

Use these commands below to explore the transaction log to see if your log file matches either of these patterns. 

–Troubleshooting LogReader Timeout executing sp_replcmds
–Find last distributed and "next to" distribute transactions.

USE <published database>

dbcc opentran

Replicated Transaction Information:
        Oldest distributed LSN     : (54:4986:1)
        Oldest nondistributed LSN : (55:4803:2)

–Count number of pending to replicated log entries per transaction
select  [Transaction ID], count([Transaction ID]) as ‘Count log entries’
from::fn_dblog(’55:4803:2′,null)   — Oldest non-distributed LSN : (55:4803:2)
where [Description] = ‘REPLICATE’
group by [Transaction ID]
GO

Transaction ID Count log entries
————– —————–

0000:0001a659  3
0000:0001a935  2
0000:0001a658  500001
0000:0003deff  3
0000:0001a937  3
0000:0001a939  3

Is the pattern of cmds/transaction what you expect for your application?

Are there a large number of logs records per given transaction which can account for the LogReader Timeout?

If a large number of cmds/transaction occur, increasing the query timeout for the LogReader  Agent Profile is recommended solution.  You can also set the Agent ReadBatchSize = 1 and VerboseHistory = 2 to get more feedback from the Agent. 

For a single transaction with 100K commnds, the LogReader history may show “waiting for response”.  You can execute sp_who2 and look for ProgramName = <log reader job> and see if the CPU, and IO counters continue to climb.  If these counters are increasing, the LogReader is still processing the transaction log. 

Remember to set Agent back to default setting once it has processed the large batch.

Replication Log Reader Agent
http://msdn.microsoft.com/en-us/library/ms146878.aspx

Using “Verbose history agent profile.” while troubleshooting Replication
http://blogs.msdn.com/b/repltalk/archive/2010/07/13/using-verbose-history-agent-profile-while-troubleshooting-replication.aspx

Comments (5)

  1. Jason says:

    Can you explain the impact of a timeout?  Are the replicated transactions lost (to the distributor and any subscribers)?

  2. ReplTalk says:

    The Distribution Agent will apply a "batch" of changes in a single transaction.  If the Agent fails it will rollback any partial changes that were part of that "batch". If the Agent is scheduled to restart, it will again timeout and again rollback changes.  You'll be caught in an endless loop.

    When you increase the timeout value, then rerun the Agent those changes will be reapplied to the Subscriber.  You can also temporarily decrease the BatchSize to commit a smaller batch of transactions.

  3. ReplTalk says:

    dbcc opentran and fn_dblog

  4. Vini says:

    Hi, I am new to the concept of replication. Can you please tell me how to change the Query timeout parameter?

  5. Albert says:

    You can also add a new profile in the Log Reader Agent Profiles and already in this new profile increase Logintimeout and Querytimeout.  In my case it's worked.