Analysis Services fails on a cluster with “An attempt was made to access a socket in a way forbidden by its access permissions”

The problem

On a clustered SQL Server 2008 R2 installation, we have installed SQL Server in two separate instances on each cluster node.

One Analysis Services resource will start, but the second one will not come online.

The error is shown in the application event log and in the MSOLAP\LOG\msmdsrv.log

Error(s): event id: 27 - failed to initialize SQL SQM timer
The service cannot be started: The following system error occurred: An attempt was made to access a socket in a way forbidden by its access permissions.

 

The cause

We found the first Analysis Services instance is listening on all IP addresses to port 2383 on the active node, so the second new instance cannot listen on its default port 2383 on the same node.

We could fail the second instance to the opposite node and it would come online fine. On the first node, we checked to see if the 2383 port is already in use.

 

1. Find the PID (process idenfitier) for msmdsrv.exe:

Using task manager, look at the list of Processes.
In the view menu, select the columns to view, and click PID (Process ID) to show that extra column.
Check the box in task manager that says "Show processes for all users"
Now sort the list of processes by name. Locate msmdsrv.exe process.
Found PID 5228

2. Checked which ports the first working AS instance is listening on using Netstat and cross referencing that pid

netstat -ano > output.txt
In the output.txt text look for the PID of the msmdsrv.exe to see which ports its listening on. Search (Control+F) for the PID number.
TCP 0.0.0.0:2383 PID 5228
TCP [::]:2383 PID 5228
So this indicates that the first msmdsrv.exe instance (pid 5228) is listening on ALL IP addresses.

 

Shortcut tip: you can pipe (|) output from netstat into findstr tool to narrow down the output quickly.

netstat -ano |findstr 2383

netstat -ano |findstr 5228

3. The problem is when the second instance start up, it cannot bind to its own IP address on port 2383 because the first instance has that port taken already.

TCP 0.0.0.0:2383 PID 5228
TCP [::]:2383 PID 5228

We need to limit each instance to only its dedicated IP address from the cluster resources and not all IPs on the node.

The default port in described in https://technet.microsoft.com/en-us/library/cc646023.aspx

Note the default port is 2838 for default instances, and this is retained as a setting in the OLAP\CONFIG\mdmsdsrv.ini from the shared disk.
To confirm, find the port tag <Port>0</Port>   in the msmdsrv.ini. Zero indicates dynamic port assignment, which is 2383 for default instances, and dynamic on named instances.

The resolution

Check each Analysis Services resource in the cluster manager, to view the properties.

On the General Tab note the setting "Use Network Name for computer name" See also https://support.microsoft.com/kb/198893

image

This setting was unchecked on the first working AS instance. This means that the resource perceives the node name and all IPs,  instead of just the cluster resource IP and cluster virtual name. 

The second failing AS instance was a victim of this setting on the first instance. This setting was causing the first AS instance to bind to the 2383 port on all IPs and preventing the new AS instance from coming online on the same node.

 

So we check the setting "Use Network Name for computer name" to limit the instance to see only its IP dependency and its virtual name.
Then restart the AS instance where the change was made - take offline. Bring online in the cluster manager.
That solved the problem.

Now both instances can coexist on the same node and by failed over successfully.

We can run Netstat -ano to see that each msmdsrv.exe process uses port 2383 but on different IP addresses.