Service Pack Setup completes successfully on clustered instance but SQL Server Version is not updated on one node


I recently worked on an issue where one of my customers had successfully finished patching their SQL 2005 instances on a 3 node cluster. However the Build number was not getting updated on one of the nodes (NODE3) even after successfully completing the SP4 installation 

SQL version On problem node: Database Services (INSTANCEX) SP3 2005.090.4340.00  x86 ENTERPRISE

Expected version: SQL Server Database Service 2005(INSTANCEX) ENU SP4 2005.090.5000.00  x86  ENTERPRISE

OS version: Microsoft Windows Server 2003 family, Enterprise Edition Service Pack 2 (Build 3790) x86

Number of Nodes : 3 (NODE1, NODE2, NODE3)

  

Note: With SQL Server 2005 unlike the higher versions, the patch is applied to the active node of the cluster and it is the patch that will go off to all passive nodes and will apply the patch files remotely

Patching of the passive node is achieved by remotely creating a scheduled task

With SQL 2005 the patch does everything so once it's finished either all nodes are patched, or none are patched. If there is an issue during the patching phase, then the entire patch needs to be rolled back on all nodes. Therefore if you have 3 node cluster and the last node fails to patch, the patch will be rolled back on all nodes.

  

  • However in our case the patch was not rolled back on the nodes even though NODE3 wasn't patched correctly because the summary log still reports a success!
  • We collected the setup logs on the NODE3  and also the successfully patched nodes  to make  detailed analysis to check if features to be updated have been skipped for any reason
  • .We paused the NODE2 and NODE3 from the Cluster Administrator and ran the SP4 setup on the NODE3

We see that the setup succeeded (but completes in less than 30 seconds) showing us it may not be actually patching in the background


  • If you look at the summary file on the NODE2 AND NODE4 where the build is updated successfully:

Product Installation Status

Product                   : SQL Server Database Services 2005 (INSTANCEX)

Product Version (Previous): 5000

Product Version (Final)   : 5324

Status                    : Reboot Required

Log File                  : C:\Program Files\Microsoft SQL Server\90\Setup Bootstrap\LOG\Hotfix\SQL9_Hotfix_KB2716427_sqlrun_sql.msp.log

Error Number              : 3010

Error Description         :

  

  • Comparing this to NODE3 our problem node we do not see the

----------------------------------------------------------------------------------

Product                   : Database Services (INSTANCEX)

Product Version (Previous): 4340

Product Version (Final)   : 5000

Status                    : Success

Log File                  :

Error Number              : 0

Error Description         :

  

As per http://blogs.msdn.com/b/jorgepc/archive/2009/09/07/sql-server-2005-service-pack-does-not-update-all-the-installed-components.aspx

During the very beginning of the Service Pack setup process, the existing SQL Server services are discovered via WMI queries by the means of Wmiprvse.exeprocess; without the proper permissions, the process will fail to discover the installed components

  • That led us to suspect "Is the Service pack setup even able to enumerate/discover the SQL instances?

Here is how we investigated that:

  • We know that the SQL Server Configuration Manager tool also uses the same WMI queries to iterate through the services collection
  • When these tools generate the WMI queries, the SQL Server Web-Based Enterprise Management (WBEM) provider (Sqlmgmprovider.dll) is loaded into the Wmiprvse.exe process. Then, the SQL Server WBEM provider extracts and processes the information about the services of every instance of SQL Server 2005
  • So we decided to check if the SQL configuration manager on the problem node was able to enlist the services which would give us an idea if these WMI queries are working as expected

We open the SQL Server Configuration Manager, the right hand pane had the following error "The server threw an exception. [0x80010105]".

  • On the NODE2 and NODE1 we do not see these issues.

  

  • We also performed a small test to check WMI connectivity

  1. Open the WMI Control console: Click Start , click Run , type wmimgmt.msc and then click OK.
  2. In the console tree, right-click WMI Control , and then click Properties.
  3. Select the Security tab.
  4. Expand the Root folder and then navigate to Microsoft > SQLServer
  5. Search for namespace ComputerManagement
  6. Open the WMI Tester: Click Start , click Run , type wbemtest and then click OK.
  7. Connect to the namespace ComputerManagement (root\microsoft\sqlserver\ComputerManagement) and click on Query

SELECT * FROM RegServices
SELECT * FROM SqlService

  1. Make note if the query produced an error or if it was successful.

We see that in the Problem Node3, even after waiting for a long time we do not see any instances enumerated.

image

When we try the same on the NODE1 and NODE2 for the same namespace we see all the instances are being enumerated:

image

Connected to other namespace (e.g. root\cimv2, SELECT * FROM Win32_Service), it is all working fine

It appears that only the SQL namespace is having issue.

  

  

  • With this we confirmed that there a WMI connectivity issue on the Server for the SQL Namespace
  • However we had check the Setup logs to confirm if the Service pack setup has actually succeeded for the Database Engine Services, or if it was skipped due to any possible WMI issues?

  


We can see that setup able to enumerate all the details but weren't sure why it completes in ten seconds (Below is a snippet of setup log generated on NODE3 when the other two nodes were paused):

12/06/2014 13:10:24.272   Instance Details: INSTANCEX
12/06/2014 13:10:24.272     agentservicename = SQLAgent$INSTANCEX
12/06/2014 13:10:24.272     associatedhotfixbuild = 1520
12/06/2014 13:10:24.272     clustername = S02S702DR-CL6
12/06/2014 13:10:24.272     ftsservicename = MSFTESQL$INSTANCEX
12/06/2014 13:10:24.272     fullversion = 2005.090.4340.00
12/06/2014 13:10:24.272     hiveregpath = Software\Microsoft\Microsoft SQL Server\MSSQL.5
12/06/2014 13:10:24.272     id = MSSQL.5
12/06/2014 13:10:24.272     installsqldatadir = N:\MSSQL.5\MSSQL
12/06/2014 13:10:24.272     installsqldir = C:\Program Files\Microsoft SQL Server\MSSQL.5\MSSQL
12/06/2014 13:10:24.272     lcid = 1033
12/06/2014 13:10:24.272     name = INSTANCEX
12/06/2014 13:10:24.272     productcode = {A2E15F7E-A973-493A-AC36-B1841F8EE431}
12/06/2014 13:10:24.272     qfelevel = 4340
12/06/2014 13:10:24.272     servicename = MSSQL$INSTANCEX
12/06/2014 13:10:24.272     sku = ENTERPRISE
12/06/2014 13:10:24.272     sp = 3
12/06/2014 13:10:24.272     type = SQL Server Standalone Product
12/06/2014 13:10:24.272     vermajbld = 4035
12/06/2014 13:10:24.272     version = 9


"12/06/2014 13:10:44.133 Installing instance: INSTANCEX
12/06/2014 13:10:44.149 Enumerating passive cluster nodes
12/06/2014 13:10:44.149 No available passive nodes were found
12/06/2014 13:10:44.149 Finalizing cluster instance: INSTANCEX
12/06/2014 13:10:44.149 Waiting for all running passive nodes to complete their patch
12/06/2014 13:10:44.149 All running passive nodes have completed their patch
12/06/2014 13:10:44.149 Finalized cluster instance: INSTANCEX

12/06/2014 13:10:44.149 Enumerating passive cluster nodes
12/06/2014 13:10:44.149 No available passive nodes were found
12/06/2014 13:10:44.149 Finalizing cluster instance: INSTANCEX
12/06/2014 13:10:44.149 Waiting for all running passive nodes to complete their patch
12/06/2014 13:10:44.149 All running passive nodes have completed their patch

12/06/2014 13:10:44.149 Hotfix package completed
12/06/2014 13:10:44.149 Attempting to continue the 32 bit ngen queue
12/06/2014 13:10:53.431 Hotfix package closed"

We do not see it installing any files or creating a Log file!

****************************************
Compared to the good run on NODE2 pasted below we can see it creating a log file and installing msp files

++12/06/2014 06:56:02.818 Installing target: NODE2
12/06/2014 06:56:08.927 Installing file: sqlrun_sql.msp
12/06/2014 06:56:08.927 Copy Engine: Creating MSP install log file at: C:\Program Files\Microsoft SQL Server\90\Setup Bootstrap\LOG\Hotfix\SQL9_Hotfix_KB2463332_sqlrun_sql.msp.log

***************************************

  • So now we had a fair idea that the Service Pack setup is completing without actually patching the instance, because we possibly have issues discovering the installed components by the WMI queries

  • We now had to fix the SQL Server Configuration Manager, issue which was throwing the error "The server threw an exception. [0x80010105]".

We have a Public KB article that fixes the SQL configuration manager issue we are seeing:

As per: http://support.microsoft.com/kb/956013

To work around this problem, open a command prompt, type the following command, and then press ENTER:
mofcomp "%programfiles(x86)%\Microsoft SQL Server\number\Shared\sqlmgmproviderxpsp2up.mof"

  • We tried the above and it is reporting a success of parsing the .mof file. However it is stalled at "Storing data in repository" (it normally reports a "Done!" and comes out of the loop) Here it is just stalled..
  • Trying to restart the WMI service gets hung if we do it after parsing the .mof file.
  • We involved our Windows Platforms team to help us rebuild the WMI files, but we still received the same error when we try to open the SQL Configuration Manager.

What finally resolved it for us!

  • We came across http://support.microsoft.com/kb/2849344 , which speaks of the same "The server threw an exception [0x80010105]" error when the configuration manager is started.
  • However the KB article is applicable to Windows Server 2012 & SQL server 2008 R2 where the problem occurs because SQL Server 2008 R2 internally uses a deprecated 32-bit Windows .dll file (Msclus.dll). When Configuration Manager is started after the Cluster Automation feature is added, the system discovers that the necessary .dll file is missing, and this triggers the error

The work around provided by the KB is as follows

  1. Note The 32-bit version of the MSClus.dll can be found in the "%windir%\syswow64" directory.
  2. Register this .dll file by running the regsvr32 command from an elevated command prompt:
    1. Click Start, type command promptin the Search programs and files box.
    2. UnderPrograms, right-click Command Prompt, and then clickRun as administrator.
    3. At the command prompt, type the following command, and then press ENTER:

      Regsvr32 MSClus.dll

The current customer's environment is Windows 2003. However we decided to give the above workaround a shot as we were also in a clustered environment.

After registering the above MSClus.dll ! Voila…the configuration manager no longer throws the exception

  • We patched the NODE3 once again, which took slightly longer this time.
  • Rebooted the node and checked the SQL Server build which now reports 090.5000.00 correctly J


Comments (0)

Skip to main content