SharePoint AppFabric Error - Failed to connect to hosts in the cluster

Or “ErrorCode<ERRCAdmin040>:SubStatus<ES0001>:Failed to connect to hosts in the cluster” if you want the full error.

This is just a quick one. You may see this error if you try to add a SharePoint server to an AppFabric cluster. It can happen if there is a “cluster” configured but none of its servers exist anymore; in my case it was because I removed a SPServer without properly disconnecting it from the farm, just because I was lazy (pro-tip: don’t be lazy with SharePoint admin).

The key point here is that none of the SPServers listed in the AppFabric cluster exist any more, so there is no live cache host to connect to and removing the stale entry is kinda complicated.

Anyway, here’s a quick guide on how to diagnose and fix the problem.

 

Finding the Ghost AppFabric Server(s)

We need to figure out what’s not connectable first. SharePoint tells us very little about which “hosts” it’s trying to connect to for the AppFabric cluster, so we’ll just have to figure it out ourselves. Export the configuration XML of the AppFabric cluster:

Export-CacheClusterConfig -File C:\Users\root\Desktop\AF.XML

Looking at the file, we can see the rogue entry:

<host replicationPort="22236" arbitrationPort="22235" clusterPort="22234" hostId="639275194" size="358" leadHost="true" account="CLOUDNET\SVC_FARM" cacheHostName="AppFabricCachingService" name="SP15-WFE.cloudnet.local" cachePort="22233" />

There is no server called “SP15-WFE” (any more), but evidently AppFabric didn’t get the memo. We need to remove this machine from the AppFabric cluster.
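If the export lists more than one host, eyeballing the XML gets tedious. Here is a minimal sketch that pulls every host entry out of the exported file and pings each one. It assumes the file path from the export step above and that the hosts answer ICMP; a firewall that blocks ping will make a perfectly healthy server look like a ghost, so treat the result as a hint and not as proof.

```powershell
# Path of the exported cluster configuration (same file as the export step).
$configPath = 'C:\Users\root\Desktop\AF.XML'

# Load the export as XML.
[xml]$config = Get-Content -Path $configPath

# Find every <host> element regardless of any XML namespace.
$hostNodes = $config.SelectNodes("//*[local-name()='host']")

foreach ($node in $hostNodes) {
    $name = $node.GetAttribute('name')

    # -Quiet returns $true/$false instead of throwing on unreachable names.
    $reachable = Test-Connection -ComputerName $name -Count 1 -Quiet

    # Emit one object per host so the results format as a table.
    New-Object PSObject -Property @{
        HostName  = $name
        CachePort = $node.GetAttribute('cachePort')
        Reachable = $reachable
    }
}
```

Any host that comes back with Reachable set to False is a candidate for removal, but double-check the name against the farm before unregistering anything.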

 

Removing the Ghost AppFabric Server(s)

The fix to remove said culprit machines is easy enough:

Unregister-CacheHost -HostName SP15-WFE.cloudnet.local -ProviderType SPDistributedCacheClusterProvider -ConnectionString \\SP15-WFE.cloudnet.local

That should run without drama. If not, there is the nuclear option of editing the AppFabric configuration XML and importing it again, but a mistake there can leave the farm needing a complete rebuild. In other words, try “Unregister-CacheHost” first.

A quick export again shows us the configuration now has zero hosts. Now adding the SharePoint machine to the cluster works!
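For the record, the re-check can be scripted too. This is a rough sketch, assuming the shell is still pointed at the cluster as it was for the first export and that the export parameter is -File; the temp file name is just something I made up for illustration.

```powershell
# Hypothetical scratch file for the second export.
$checkPath = Join-Path $env:TEMP 'AF-check.xml'

# Export the cluster configuration again after the unregister.
Export-CacheClusterConfig -File $checkPath

# Count the <host> entries that are still registered.
[xml]$check = Get-Content -Path $checkPath
$remaining = $check.SelectNodes("//*[local-name()='host']")
"Hosts still registered: $($remaining.Count)"

# Assumption: once the count is zero, the server can be added with the
# standard SharePoint cmdlet, run on the server that should join the cluster.
# Add-SPDistributedCacheServiceInstance
```

The last line is commented out on purpose; only run it once you are happy the ghost entries are really gone.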

That’s it; happy SharePointing!

Cheers,

// Sam Betts