How to resolve when Distribution Database is growing huge (+25gig)

Yes, I know, huge database is kind of relative, but generally if you see Distribution database growing more the 25gig it means the Cleanup processes is having a hard time deleting replicated transactions.  I’ll cover the how and why on Cleanup processes later, but for now I wanted to post a technique we’ve used to purge rows from the Distribution database.  This solution involves modifying the SQL Replication stored procedures to increase the number or rows being deleted per transaction.  If you’re uncomfortable making the code change, skip down to STEP 7).

This first posting coverage a “conservative” approach.  Later I’m post steps for a more “aggressive” solution.

1) script msrepl_commands cleanup proc and save original sp code

sp_helptext  sp_MSdelete_publisherdb_trans

2) change from CREATE to ALTER

ALTER PROCEDURE sp_MSdelete_publisherdb_trans

3) change all 3 locations from 2000 to 100000 rows

DELETE TOP(2000) MSrepl_commands . . .

4) script msrepl_transaction cleanup proc and save original sp code

sp_helptext sp_MSdelete_dodelete

5) change from CREATE to ALTER

ALTER PROCEDURE sp_MSdelete_dodelete

6) change both locations from 5000 to 100000 rows

delete TOP(5000) MSrepl_transactions . . .

7) Determine oldest day containing transactions

–(shows breakout by day, by hour.  Took 2 hours on 350million rows, 100gb distribtuion db)
SELECT T.[publisher_database_id]
,datepart(mm,[entry_time]) ‘month’
, datepart(dd,[entry_time]) ‘day’
, datepart(hh,[entry_time]) ‘hour’
    ,count(C.[xact_seqno]) ‘count of commands’
FROM [distribution].[dbo].[MSrepl_transactions](nolock) T
JOIN [MSrepl_commands](nolock) C
ON T.[xact_seqno] = C.[xact_seqno]
GROUP BY T.[publisher_database_id]
  , datepart(dd,[entry_time])
  , datepart(hh,[entry_time])
order by 1,2,3,4

–Or, just select oldest 10 rows and note the entry_time stamp.
–(select took 5 minutes on 350million rows, 100gb distribtuion db)

SELECT TOP 10 * FROM [distribution].[dbo].[MSrepl_transactions](nolock)

8) Execute cleanup via SSMS or a TSQL job to delete JUST oldest day.  (24 hours @ 5 days = 120), then continue to reduce the @max_distretention valued by a few hours for each run.

   EXEC dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 120 

Example output: (4 hours to removed 340million rows)

Removed 3493 replicated transactions consisting of 343877158 statements in 15043 seconds (22859 rows/sec).

Hope you found this helpful,

Chris Skorlinski,  Microsoft SQL Server Escalation Services

Comments (3)

  1. Anonymous says:

    There is already a  job called "Distribution clean up: distribution" to execute the query given in step 8.

  2. Anonymous says:


    Wonderful comments on the internal procs the default job was kinds very slow process and your steps was a flush out.

    very impressive

    Thanks again

    Raman Gosala

  3. Anonymous says:

    We used this technique over many servers and it has worked perfectly fine for us!  Thanks for sharing this.

    I have one question through, we have a very large 630 G dist database.  We are unable to perform cleanup on this even after above suggestions have been implemented.  Do u have any other suggestions?  the problem is we are cannot stop the distribution jobs for some time as the replication is critical.  is there any way without stopping the distribution jobs.


    MSrepl_commands,1498027110 ,616052376 KB,610869472 KB,5047928 KB,134976 KB


    MSrepl_transactions,265722520  ,16021656 KB,15858120 KB,116216 KB,47320 KB