OpsMgr Eventlog analysis with Powershell

The following technique should already be understood by any powersheller. Here we focus on Operations Manager log entries, even if the data mining technique shows is entirely possibly – and encouraged :-) - with any other event log.

Let’s start by getting our eventlog into a variable called $evt:

PS >> $evt = Get-Eventlog “Operations Manager”

The above only works locally in POSH v1.

In POSH v2 you can go remotely by using the “-computername” parameter:

PS >> $evt = Get-Eventlog “Operations Manager” –computername RMS.domain.com

Anyhow, you can get to this remotely also in POSHv1 with this other more “dotNET-tish” syntax:

PS >> $evt = (New-Object System.Diagnostics.Eventlog -ArgumentList "Operations Manager").get_Entries()

you could even export this (or any of the above) to a CLIXML file:

PS >> (New-Object System.Diagnostics.Eventlog -ArgumentList "Operations Manager").get_Entries() | export-clixml -path c:evtEvt-OpsMgr-RMS.MYDOMAIN.COM.xml

and then you could reload your eventlog to another machine:

PS >> $evt = import-clixml c:evtEvt-OpsMgr-RMS.MYDOMAIN.COM.xml

whatever way you used to populate your $evt  variable, be it from a “live” eventlog or by re-importing it from XML, you can then start analyzing it:

PS >> $evt | where {$_.Entrytype -match "Error"} | select EventId,Source,Message | group eventid

Count Name                      Group
----- ----                      -----
1510 4509                      {@{EventID=4509; Source=HealthService; Message=The constructor for the managed module type "Microsoft.EnterpriseManagement.Mom.DatabaseQueryModules.GroupCalculatio.
   15 20022                     {@{EventID=20022; Source=OpsMgr Connector; Message=The health service {7B0E947B-2055...
    3 26319                     {@{EventID=26319; Source=OpsMgr SDK Service; Message=An exception was thrown while p...
    1 4512                      {@{EventID=4512; Source=HealthService; Message=Converting data batch to XML failed w...

the above is functionally identical to the following:

PS >> $evt | where {$_.Entrytype -eq 1} | select EventID,Source,Message | group eventid

Count Name                      Group
----- ----                      -----
1510 4509                      {@{EventID=4509; Source=HealthService; Message=The constructor for the managed modul...
   15 20022                     {@{EventID=20022; Source=OpsMgr Connector; Message=The health service {7B0E947B-2055...
    3 26319                     {@{EventID=26319; Source=OpsMgr SDK Service; Message=An exception was thrown while p...
    1 4512                      {@{EventID=4512; Source=HealthService; Message=Converting data batch to XML failed w...

Note that Eventlog Entries’ type is an ENUM that has values of 0,1,2 – similarly to OpsMgr health states – but beware that their order is not the same, as shown in the following table:

Code OpsMgr States Events EntryType
0 Not Monitored Information
1 Success Error
2 Warning Warning
3 Critical --

Let’s now look at Information Events (Entrytype –eq 0)

PS >> $evt | where {$_.Entrytype -eq 0} | select EventID,Source,Message | group eventid

Count Name                      Group
----- ----                      -----
4135 2110                      {@{EventID=2110; Source=HealthService; Message=Health Service successfully transferr...
1548 21025                     {@{EventID=21025; Source=OpsMgr Connector; Message=OpsMgr has received new configura...
4644 7026                      {@{EventID=7026; Source=HealthService; Message=The Health Service successfully logge...
1548 7023                      {@{EventID=7023; Source=HealthService; Message=The Health Service has downloaded sec...
1548 7025                      {@{EventID=7025; Source=HealthService; Message=The Health Service has authorized all...
1548 7024                      {@{EventID=7024; Source=HealthService; Message=The Health Service successfully logge...
1548 7028                      {@{EventID=7028; Source=HealthService; Message=All RunAs accounts for management gro...
   16 20021                     {@{EventID=20021; Source=OpsMgr Connector; Message=The health service {7B0E947B-2055...
   13 7019                      {@{EventID=7019; Source=HealthService; Message=The Health Service has validated all ...
    4 4002                      {@{EventID=4002; Source=Health Service Script; Message=Microsoft.Windows.Server.Logi...

 

And “Warning” events (Entrytype –eq 2):

PS >> $evt | where {$_.Entrytype -eq 2} | select EventID,Source,Message | group eventid

Count Name                      Group
----- ----                      -----
1511 1103                      {@{EventID=1103; Source=HealthService; Message=Summary: 1 rule(s)/monitor(s) failed ...
  501 20058                     {@{EventID=20058; Source=OpsMgr Connector; Message=The Root Connector has received b...
    5 29202                     {@{EventID=29202; Source=OpsMgr Config Service; Message=OpsMgr Config Service could ...
  421 31501                     {@{EventID=31501; Source=Health Service Modules; Message=No primary recipients were ...
   18 10103                     {@{EventID=10103; Source=Health Service Modules; Message=In PerfDataSource, could no...
    1 29105                     {@{EventID=29105; Source=OpsMgr Config Service; Message=The request for management p...

 

 

Ok now let’s see those event 20022, for example… so we get an idea of which healthservices they are referring to (20022 indicates" “hearthbeat failure”, btw):

PS >> $evt | where {$_.eventid -eq 20022} | select message

Message
-------
The health service {7B0E947B-2055-C12A-B6DB-DD6B311ADF39} running on host webapp3.domain1.mydomain.com and s...
The health service {E3B3CCAA-E797-4F08-860F-47558B3DA477} running on host SERVER1.domain2.mydomain.com and serving...
The health service {E3B3CCAA-E797-4F08-860F-47558B3DA477} running on host SERVER1.domain2.mydomain.com and serving...
The health service {E3B3CCAA-E797-4F08-860F-47558B3DA477} running on host SERVER1.domain2.mydomain.com and serving...
The health service {52E16F9C-EB1A-9FAF-5B9C-1AA9C8BC28E3} running on host DC4WK3.domain1.mydomain.com and se...
The health service {F96CC9E6-2EC4-7E63-EE5A-FF9286031C50} running on host VWEBDL2.domain1.mydomain.com and s...
The health service {71987EE0-909A-8465-C32D-05F315C301CC} running on host VDEVWEBPROBE2.domain2.mydomain.com....
The health service {BAF6716E-54A7-DF68-ABCB-B1101EDB2506} running on host XP2SMS002.domain2.mydomain.com and serving mana...
The health service {30C81387-D5E0-32D6-C3A3-C649F1CF66F1} running on host stgweb3.domain3.mydomain.com and...
The health service {3DCDD330-BBBB-B8E8-4FED-EF163B27DE0A} running on host VWEBDL1.domain1.mydomain.com and s...
The health service {13A47552-2693-E774-4F87-87DF68B2F0C0} running on host DC2.domain4.mydomain.com and ...
The health service {920BF9A8-C315-3064-A5AA-A92AA270529C} running on host FSCLU2 and serving management group Pr...
The health service {FAA3C2B5-C162-C742-786F-F3F8DC8CAC2F} running on host WEBAPP4.domain1.mydomain.com and s...
The health service {3DCDD330-BBBB-B8E8-4FED-EF163B27DE0A} running on host WEBDL1.domain1.mydomain.com and s...
The health service {3DCDD330-BBBB-B8E8-4FED-EF163B27DE0A} running on host WEBDL1.domain1.mydomain.com and s...

 

or let’s look at some warning for the Config Service:

PS >> $evt | where {$_.Eventid -eq 29202}

   Index Time          EntryType   Source                 InstanceID Message
   ----- ----          ---------   ------                 ---------- -------
5535065 Dec 07 21:18  Warning     OpsMgr Config Ser...   2147512850 OpsMgr Config Service could not retrieve a cons...
5543960 Dec 09 16:39  Warning     OpsMgr Config Ser...   2147512850 OpsMgr Config Service could not retrieve a cons...
5545536 Dec 10 01:06  Warning     OpsMgr Config Ser...   2147512850 OpsMgr Config Service could not retrieve a cons...
5553119 Dec 11 08:24  Warning     OpsMgr Config Ser...   2147512850 OpsMgr Config Service could not retrieve a cons...
5555677 Dec 11 10:34  Warning     OpsMgr Config Ser...   2147512850 OpsMgr Config Service could not retrieve a cons...

Once seen those, can you remember of any particular load you had on those days that justifies the instance space changing so quickly that the Config Service couldn’t keep up?

 

Or let’s group those events with ID 21025 by day, so we know how many Config recalculations we’ve had (which, if many, might indicate Config Churn):

PS >> $evt | where {$_.Eventid -eq 21025} | select TimeGenerated | % {$_.TimeGenerated.ToShortDateString()} | group

Count Name                      Group
----- ----                      -----
   39 12/7/2009                 {12/7/2009, 12/7/2009, 12/7/2009, 12/7/2009...}
  203 12/8/2009                 {12/8/2009, 12/8/2009, 12/8/2009, 12/8/2009...}
  217 12/9/2009                 {12/9/2009, 12/9/2009, 12/9/2009, 12/9/2009...}
  278 12/10/2009                {12/10/2009, 12/10/2009, 12/10/2009, 12/10/2009...}
  259 12/11/2009                {12/11/2009, 12/11/2009, 12/11/2009, 12/11/2009...}
  224 12/12/2009                {12/12/2009, 12/12/2009, 12/12/2009, 12/12/2009...}
  237 12/13/2009                {12/13/2009, 12/13/2009, 12/13/2009, 12/13/2009...}
   91 12/14/2009                {12/14/2009, 12/14/2009, 12/14/2009, 12/14/2009...}

 

Event ID 21025 shows that there is a new configuration for the Management Group.

Event ID 29103 has a similar wording, but shows that there is a new configuration for a given Healthservice. These should normally be many more events, unless your only health Service is the RMS, which is unlikely…

If we look at the event description (“message”) in search for the name (or even the GUID, as both are present) or our RMS, as follows, then they should be the same numbers of the 21025 above:

PS >> $evt | where {$_.Eventid -eq 29103} | where {$_.message -match "myrms.domain.com"} | select TimeGenerated | % {$_.TimeGenerated.ToShortDateString()} | group

Count Name                      Group
----- ----                      -----
   39 12/7/2009                 {12/7/2009, 12/7/2009, 12/7/2009, 12/7/2009...}
  203 12/8/2009                 {12/8/2009, 12/8/2009, 12/8/2009, 12/8/2009...}
  217 12/9/2009                 {12/9/2009, 12/9/2009, 12/9/2009, 12/9/2009...}
  278 12/10/2009                {12/10/2009, 12/10/2009, 12/10/2009, 12/10/2009...}
  259 12/11/2009                {12/11/2009, 12/11/2009, 12/11/2009, 12/11/2009...}
  224 12/12/2009                {12/12/2009, 12/12/2009, 12/12/2009, 12/12/2009...}
  237 12/13/2009                {12/13/2009, 12/13/2009, 12/13/2009, 12/13/2009...}
   91 12/14/2009                {12/14/2009, 12/14/2009, 12/14/2009, 12/14/2009...}

 

Going back to the initial counts of events by their IDs, when showing the errors the counts above had spotted the presence of a lonely 4512 event, which might have gone undetected if just browsing the eventlog with the GUI, since it only occurred once.

Let’s take a look at it:

PS >> $evt | where {$_.eventid -eq 4512}

   Index Time          EntryType   Source                 InstanceID Message
   ----- ----          ---------   ------                 ---------- -------
5560756 Dec 12 11:18  Error       HealthService          3221229984 Converting data batch to XML failed with error ...

Now, when it is about counts, Powershell is great.  But sometimes Powershell makes it difficult to actually READ the (long) event messages (descriptions) in the console. For example, our event ID 4512 is difficult to read in its entirety and gets truncated with trailing dots…

we can of course increase the window size and/or selecting only THAT one field to read it better:

PS >> $evt | where {$_.eventid -eq 4512} | select message

Message
-------
Converting data batch to XML failed with error "Not enough storage is available to complete this operation." (0x8007000E) in rule "Microsoft.SystemCenter.ConfigurationService.CollectionRule.Event.ConfigurationChanged" running for instance "RMS.MYDOMAIN.COM" with id:"{04F4ADED-2C7F-92EF-D620-9AF9685F736F}" in management group "SCOMPROD"

Or, worst case, if it still does not fit, we can still go and search for it in the actual, usual eventlog application… but at least we will have spotted it!

 

The above wants to give you an idea of what is easily accomplished with some simple one-liners, and how it can be a useful aid in analyzing/digging into Eventlogs.

All of the above is ALSO be possible with Logparser, and it would actually be even less heavy on memory usage and it will be quicker, to be honest!

I just like Powershell syntax a lot more, and its ubiquity, which makes it a better option for me. Your mileage may vary, of course.