Recently I was working with a customer who had a file server experiencing high CPU in the WMIprvse.exe process. We received multiple user dumps and noted that someone or something was running the same query again and again. We needed to figure out what was running the query in a tight loop, causing the high CPU.
Figure 1 – Task Manager on FileServer
Before we get into the exact troubleshooting steps, let me provide some background on WMI. Winmgmt is the WMI service within a SVCHOST process running under the LocalSystem account. In all cases, the WMI service automatically starts when the first management application or script requests connection to a WMI namespace. For more information, see Starting and Stopping the WMI Service. To avoid stopping all the services when a provider fails, each provider is loaded into a separate host process named “Wmiprvse.exe”. This also allows each instance of Wmiprvse to run under a different account with varying security. For more details you can look at the MSDN documentation on WMI.
I dumped out all the services in the various svchost.exe processes. You can do this from a command prompt by running the tasklist /svc command. In my instance, I found that the WinMgmt service was running in svchost, PID 452 (PID number will vary). Someone had to be making RPC calls to this svchost.exe process to run the WMI queries. It could be some local process on the machine; it could even be a process on a remote machine.
At this point I requested user dumps of PID 452 from the customer. This would allow me to determine who was making the RPC calls to svchost.exe to run the WMI queries. While the customer was uploading the dumps, we decided to get a Network Monitor trace to see if the RPC calls were coming over the network.
Immediately, I could see a lot of RPC traffic to the svchost.exe process(PID=452).
Figure 2 – Network Monitor Output from the FileServer. Notice the Source and destination ports and IP addresses. IP addresses are hidden by the aliases
Looking at the RPC payload, I could see the text of the WMI query. You can see this in the Hex Details Pane. The query that was running in a loop was “Select * from Win32_Process”. Looks like I found the source of the WMI queries.
At this point, we got the source IP for the RPC packets. We logged into the machine, and brought up the Task Manager.
Figure 3 – Task Manager on Remote Machine(Machine1)
Immediately we saw that there was some script running inside a Wscript.exe process. At this point I was pretty sure that this script was the culprit. The customer was not sure what this was, and was not comfortable terminating the process. To prove my suspicion, I had him open a command prompt and run the following command, netstat –ano.
Figure 4 – Netstat output from Remote Machine
From the output in Fig. 4, I could see a TCP connection created by PID 3532 (wscript.exe). Looking at the local and foreign addresses from the above output, they matched up exactly to what we were seeing in the Network Monitor trace.
In the above case, we already had our suspicions on the wscript.exe process; however, sometimes it might not be that easy. In that case, we could have used the netstat output to look at all connections to the file server (220.127.116.11). If there were multiple connections, then we can also narrow it down by the port number. Based on that, we could have found the PID responsible.
The customer called me later in the day, and told me that they had recently updated their scripts. One of their scripts had a bug which was running WMI scripts in a tight loop. Fixing the script caused the problem to go away.
Had the query being coming from a local process, I would have had to debug the svchost.exe process, and figure out who was making the WMI calls. However, since we could see the traffic on netmon, we didn’t need to use the debugger. Interesting way to get to the root of the problem without using a debugger!
|Share this post :|