“Rough and Tough” guide to identifying patterns in Transaction Logs


Often times, either due to a misconfiguration/bug/solar eclipse or otherwise, customers call into Microsoft Product Support Services complaining that their Exchange server is churning out transaction logfiles at an alarming rate.  For every instance of this symptom, there are at least a dozen reasons why this is happening.  Regardless, there’s never been a good way to parse the transaction logs and extract any useful patterns.  In lieu of rolling up my sleeves and actually writing code to accomplish such a task, I’ve slapped together a bunch of utilities that will do the job.  Ugly?  Sure.  Useful?  You bet.  Having used it against many customer issues, I can attest that this actually works, and works quite well.

 

1.    Download the “Unix for Win32” utilities from http://downloads.sourceforge.net/unxutils/UnxUtils.zip?modtime=1172730504&big_mirror=0

2.    Extract all files from the UnxUtils\usr\local\wbin subsirectory to C:\UNIX

3.    Download strings.exe from http://live.sysinternals.com/strings.exe, and place strings.exe into C:\UNIX

4.    Make a C:\TMP directory (Unix tools need a Win32 equivalent of /tmp)

5.    Make a directory for all your transaction log files (i.e. D:\customers\test), and place all the logs in this dir

6.    From a cmd prompt, navigate to your C:\UNIX dir

7.    Run the following command:

 

strings -q -n 16 D:\customers\test\*.log | cut -f3 -d: | sort | uniq -c | sort | tee c:\log-output.wri

 

 

What this is doing:

 

·         Identifies all strings in the logs greater than 16 chars

·         Removes the D:\customers\test\E00xxxx.log: from the output

·         Sorts the output

·         Finds all duplicate records, and retains a count

·         Sorts the final output (ending with the largest # of occurrences)

·         Writes all the output to c:\log-output.wri (use WordPad / write.exe to open; notepad.exe mangles the output)

 

 

If you’re running this on Windows 7 or above, you’ll have to modify the output directory as follows (as it won’t let you write directly to the root of the C: drive) …

 

strings -q -n 16 D:\customers\test\*.log | cut -f3 -d: | sort | uniq -c | sort | tee c:\users\yourname\log-output.wri

 

 

The output will be sorted from the least number of repeating occurences to greatest, so crack open that log-output.wri file, scroll to the bottom, and commence spelunking!

Comments (43)

  1. Dr. Green says:

    Well, not bad, but unfortunately it gave me this result after running through around 20 logfiles:

        <.. cut ..>

        10 <p class=MsoNormal><span style=’font-size

        11 font-family

        14 <p class=MsoNormal><o

        14 <p class=MsoNormal><span style=’color

       192 E

    (These logs were generated by E2K7)

    E? Is this longer than 16 characters?

  2. – Given that E12 transaction logs are only 1MB now, you’ll improve your odds of success by sampling more logs.

    – Try swapping the 16 with 8, and see if that yields better/more results.

    thanks,

    -scott

  3. Tony says:

    This is a great string parser routine and I’ve used for other things. Thanks so much Scott. I know many of my peers refer to it as well, great impact.

    As I use this as needed, I copy all of the unix exe’s into the main unix folder. Once done this puppy runs quite well.

    Thanks again Scott!

    -T

  4. Ron says:

    What should I be looking for?  I’m trying to find out why we are having an excessive spike in log creation lately.  Here’s the last bunch of lines from the results:

       318 00) Eastern Time (US & Canada)

       319 quoted-printable

       327 08

       328 12

       334 7bit

       347 55

       353 46

       354 Mon, 15 Sep 2008 10

       358 10pt; COLOR

       361 15 Sep 2008 15

       361 0in;

       362 15

       397 12.75pt’>

       404 Monday, September 15, 2008 10

       407 text/plain;

       416 (S

       426 from TC3705.domain.com ([172.22.227.140]) by TC3647.domain.com with Microsoft SMTPSVC(6.0.3790.1830);

       436 schemas-microsoft-com

       446 schemas

       462 3px;;text-align

       513 Mon, 15 Sep 2008 11

       542 10pt; FONT-FAMILY

       563 54

       586 MAILTO

       650 application/ms-tnef;

       685 binary

       813 #d4d0c8; BACKGROUND-COLOR

       815 Produced By Microsoft Exchange V6.5

       845 10.0pt;font-family

       916 urn

      1133

      1168 1.0

      1253 p></o

      1637 content-classes

      1677 #d4d0c8; BORDER-TOP

    619297

  5. Scott Oseychik says:

    Can you re-run the same command, only this time instead of looking for strings 16 chars (or greater), let’s go with 8.  The command line will look like this:

    strings -q -n 8 D:customerstest*.log | cut -f3 -d: | sort | uniq -c | sort | tee c:log-output.wri

  6. Gsalgun says:

    Does it work on Exchange 2007?

  7. Hi Gsalgun,

    Yes, this technique will work on Exchange 2007.

    Kind Regards,

    Scott

  8. Matthew says:

    I get ‘cut’ is not recognized as an internal or external command, operable program or batch file.  What am I doing wrong?

  9. It’s not in your path.  Place cut.exe in your path, and you should be good to go …

    Scott

  10. Sbright says:

    Hi Scott,

    Thanks for the guide.  I’ve got an Exchange 2003 environment where "Junk E-mail Rule" is the highest repeating occurence.  Any ideas why this would be?

    Thanks

    Steve

  11. Scott Oseychik says:

    Hi Steve,

    In conjuction with analyzing the transaction logs, you can also use Exmon to [hopefully] pinpoint a specific user who may be thrashing the server; perhaps their Junk E-mail Rule is corrupted:

    http://msexchangeteam.com/archive/2005/04/06/403409.aspx

    http://www.msexchange.org/tutorials/Microsoft-Exchange-Server-User-Monitor.html

    Thanks,

    Scott

  12. Silju Paul says:

    Hey Scott,

    Many thanks for this.Worked well at the first shot!!!

    Silju

  13. Glad to hear it helped out, Silju!

    Regards,

    Scott Oseychik

  14. GWhite says:

      1028 Microsoft Exchange Server

      1033 text-decoration

      1066 color

      1090 style=3D’font-size

      1146 Content-Type

      1239 10.0pt;font-family

      1319 IPM.Schedule.Meeting.Resp.Pos

      1365 Received

      1470 <b><span style=’font-weight

      1485 IPM.Schedule.Meeting.Request

      2212 Updated

      2330 style=’font-size

      2927 font-family

      2959 [1/21/2010 16

      4346 IPM.Note.StorageQuotaWarning

      4363 EDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWNEDWN

      4434 jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1jRA1

      4484 NY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0QNY0Q

      6495 ////////////////////////////////////////////////////////////////////////////

      8294 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

    This is what I’m seeing after analyzing 200MB of log files…any thoughts?

  15. Based on the data, it seems someone is initiating a mail loop (or is merely blasting the same content repeatedly) … note the WNED (I’m guessing it’s short for PWNED).

    My recommendation: use ExMon to identify the users with the highest amount of activity, then disable their mailboxes one-by-one until the transaction log growth stops.

    http://www.microsoft.com/downloads/details.aspx?FamilyId=9A49C22E-E0C7-4B7C-ACEF-729D48AF7BC9&displaylang=en

  16. GWhite says:

    I’ve run a 30 minute ExMon trace and have dumped the results to a CSV file so I can manipulate the data in Excel.

    What column of data would be the best one to focus on for the "highest amount of activity" by a user?

    Thank you!

  17. I believe the column you’re after is "Total Bytes" (or similar) … we want to identify who is sending the most data.

    Scott

  18. GWhite says:

    Wasn’t sure if I should focus on "Packets", "Bytes Out" or "Log Bytes"…

    Thanks!

  19. Ah … "Log Bytes."  That’s the one.

  20. J Mynes says:

    So what is the "Log Bytes" counter?  I did not find it in the documentation.

  21. Use Exchange User Monitor (Exmon) server side to determine if a specific user is causing the log growth problems. Sort on CPU (%) and look at the top 5 users that are consuming the most amount of CPU inside the Store process. Check the Log Bytes column to verify for this log growth for a potential user. If that does not show a possible user, sort on the Log Bytes column to look for any possible users that could be attributing to the log growth.

    Hope this helps,

    Scott Oseychik

  22. Thanks for sharing the blog Scott, so i must run this in my workstation after copying the whole Exchange Transaction log into my local hard drive ?

  23. Hi Albert,

    Yes, this would be performed locally on a workstation (off the Exchange Server).  While you don't have to copy *all* the logfiles to your workstation, the more data that these tools run against, the more relevant the result set becomes.

    Kind Regards,

    Scott Oseychik

  24. Tareq Abb says:

    Hi Scott,

    i am not sure if you can help me, but i did the steps outlined in your blog and i got strings that contain only a single letter, i.e. " DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD" which expanded up to one page of a wordpad. do you know what could be the problem? i already lcoated the user causing the problem and place him on a separate database.

    any ideas?

    thank you in advance 🙂

    Tareq

  25. Hi Tareq,

    One of the techniques I've found helpful if this approach doesn't produce helpful output is to reduce to number of commands until you identify the one that is causing issues.

    For example, try this:

    strings -q -n 16 D:customerstest*.log | cut -f3 -d: | sort | uniq -c | sort

    If this still is producing garbage, try this:

    strings -q -n 16 D:customerstest*.log | cut -f3 -d: | sort | uniq -c

    If the same, then this:

    strings -q -n 16 D:customerstest*.log

    Hope this helps locate where the "garbled output" is coming from.

    Regards,

    Scott Oseychik

  26. Steve R says:

    Hi Scott,

    Thanks very much for posting this, it's been a great help. I used it today as our transaction logs are filling in bursts at a rapid rate, filling about 50 logs per minute for 5 minutes then doing it again about 30 mins later.

    From the info in the output file, I found that every log in one of these bursts seems to have the same email  about 10 times, so I think there is a stuck message or a message bouncing around in Exchange.

    We have 3 Exchange servers each with 2 Storage Groups. The transaction logs are filling only for some of the SGs but at least one on all 3 servers. I can see the email addresses of the users in the email and they are on 2 of the 3 servers. The servers that those 2 users are on have been dismounted but the other server still gets the message in it's transaction logs.

    Do you have any idea how I can find out where the stuck message is and remove it?

    I ran ExMon but couldn't see any particular user thrashing the server.

    Regards,

    Steve

  27. Hey Steve,

    Sorry to respond with more questions …

    – Do these same users show up with the same level of frequency in the Message Tracking logs?  If so, is the Message-ID consistent?

    – For these users, does OWA (show a consistent view of the Outbox as Outlook (thinking: stuck message repeatedly trying to send)?  You could also login to these "suspect" mailboxes with either Outlook or MFCMAPI (http://mfcmapi.codeplex.com) in online mode to double-check, too.

    – Finally, what server & client versions are in use here?  The reason I ask is that in older builds, Exchange would allow Outlook to repeatedly try & submit messages even if the user was over their send/receive quota.  Also, I fixed a bug in the store many moons ago where a VSAPI-based virus scanner would induce a loop if circular server-side rules were present … see support.microsoft.com/…/923799.

    Regards,

    Scott

  28. Jeremy B says:

    Hey Scott,

    Thanks for this post.  Does this same technique apply to Exchange 2010 as well?  I tried it and almost always end with the following in the top spot.  ESE Super ECCXORChecksum.  I searched on this and cannot find any information about it.

    Thanks,

    Jeremy

  29. Hi Jeremy,

    A few clarification questions if I may …

    1) I'm presuming you have "Enable background database maintenance (24 x 7 ESE scanning)" enabled, correct?

    2) Does the high logfile generation only reproduce during Online Maintenance?

    3) Also, what is your Maintenance Schedule set to?  Any changes to it recently?

    Regards,

    Scott

  30. Kevin S says:

    Why wasn't powershell used to accomplish this?

  31. Hi Kevin,

    We initially tried using Powershell, but the performance was unacceptably slow.

    Regards,

    Scott Oseychik

  32. P.Vaihinen says:

    When I use strings -q -n 8 I got last 2 lines to:

       648 M?

     67408

    Why is the last line empty, or is it a special character like CR or LF?

    This is from 1 minute sample that had 90 MB of logs.

    Could it be that some device or client causes lot of CR or LF in the logs?

  33. Hi P.Vaihinen,

    Your observation is accurate; seems like a client is passing in a blank line (of at least 8 chars long) in addition to a likely CRLF.  You may find support.microsoft.com/…/972705 helpful in your troubleshooting as well.

    Regards,

    Scott Oseychik

  34. Xeus says:

    really wish this had helped, all i am seeing in output file is bunch of pages with DDDDDDDDDDDD and OOOOOOO and then the last page with a user account name and then DocumentSummaryInformation

        15 SummaryInformation

        17

        27 ************************$.

        27 Exchange.ContentsSyncData$.

    what does this mean?? please help.

  35. Tareq says:

    Hi Scott,

    Thanks for this helpful post, I get ESE super ECCXORChecksum on the top of the list, should I disable the online maintenance for testing purposes?

    also there are some user names in this format FIRST ADMINISTRATIVE GROUP/CN=RECIPIENTS/CN=

    can you please advise

  36. Hi Tareq,

    I wouldn't recommend disabling OLM here; ECCXORChecksum is simply what's being written to & read off of disk by ESE (the storage engine).  As far as the user name format listed above, that's referred to as the LegacyExchangeDN.  If you have one user that is recurring more than others (regardless if it's an SMTP address, LegacyExchangeDN, or otherwise), I'd investigate that user activity more in-depth.

    While my approach is admittedly hacky (at best), also check out our Exchange Team's blog post on this subject:

    blogs.technet.com/…/troubleshooting-rapid-growth-in-databases-and-transaction-log-files-in-exchange-server-2007-and-2010.aspx

    Regards,

    Scott Oseychik

  37. Pavan Maganti says:

    Strings.exe download link is broken.

    Correct Link: technet.microsoft.com/…/bb897439.aspx

  38. @Pavan – thanks for the feedback!  I've updated the link to point directly to our sysinternals repository.

    Regards,

    Scott Oseychik

  39. Bastian says:

    Hi Scott,

    I'm also struggling with the problem of a massive transaction log growth since a few days.

    After a lot of research I found your article and let the command run over aprox. 6000 log files.

    The last lines are these, what does this mean? Do you have any idea?

    40641                             <xmpG

    41106        stEvt

    41924

    50416  W. Europe Standard Time

    64700                 </rdf

    68455                 <rdf

    118658  ************************

    288745                    <stEvt

    I don't know how to interpret this.

    Many many thanks in advance!

    Best regards,

    Bastian

  40. Emil says:

    Hi. Thanks for this, however, just like Bastian, my top "users" are;

    <xmpG

    W. Europe Standard Time

    </rdf

    ************************

    <stEvt

    as well as

    0000000000 65535 f

    and SMTP

    However, the bulk of data is "DDDDDDDD" or "OOOOOOOO" along with some "RRRRRR". Those occupy about 95% of all pages, so I guess whatever those are, they are the cause for my log growth. But what does it mean?

  41. Hi Emil and Bastian,

    Sorry about the delayed reply.  Inferring from the output ("W. Europe Standard Time") and the stEvt string, it sounds like both issues are due to some sort of "Recurring Meeting Gone Wild."  My advice is to check out some of the tips & tricks on how to nail down this type of issue in the two following blog posts:

    1) blogs.technet.com/…/calendaring-the-recurring-meeting-and-disaster.aspx

    2) blogs.technet.com/…/troubleshooting-rapid-growth-in-databases-and-transaction-log-files-in-exchange-server-2007-and-2010.aspx

    Hope this helps,

    Scott Oseychik

  42. Deivid de Foggi says:

    This is a very delayed comment based on the post creation date, but still valuable 🙂 I'm a PFE and worked on a case where the customer experienced a huge log growth and confirmed, using this post as a reference, that the 2nd on the string rank was "IPM.Note.EnterpriseVault.PendingArchive". Moreover, the 1st one was a mailbox displayname. With this in mind, customer and Symantec found some issues on EV archive settings.

    Very useful!

  43. hank says:

    Sure, the threads here are old and grey, but i too was seeing extended strings of 'D' 'O' and 'H' in the output from this article.  I noticed several of you had the same answer and thought this may be helpful:

    technet.microsoft.com/…/gg549096%28v=exchg.150%29.aspx

    Specifically, check out the fill pattern character for page zeroing.  I suspect these strings are maintenance operations.