Stand Alone Service Fabric Error=EndpointProviderPortRangeExhausted

I ran into an issue with a customer who had set up an on-premises (standalone) Service Fabric cluster. They had deployed several services to the cluster, and some of those services would fail to start, seemingly at random. The only place we could find any information about the failures was the Application event log on the nodes.

We observed errors that looked like this:

End(ActivateServicePackageInstance): AppId=mtsDeviceServiceType_App0, AppVersion=1.0 ServicePackageName=ContainerActorPkg, ServicePackageActivationContext=, ServicePackageVersionInstance=1.0:1.0:131575929604768573, Error=EndpointProviderPortRangeExhausted

This was odd to me because I had never seen it in clusters hosted in Azure. After speaking with support, I learned that standalone clusters have a much smaller application port range by default: only 30 ports. An Azure cluster is configured with a much larger range, which is why I had never seen the error there.

In this case the application port range was being used up on a first-come, first-served basis: whichever services started first got ports, and any service that started after the range was exhausted failed. Because startup order varied, the failing applications looked random.
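To make the "looks random" behaviour concrete, here is a minimal toy model in Python. It is not Service Fabric's actual allocator. The function name, the service names, and the ports-per-service counts are all hypothetical; the only figure taken from this post is the pool of 30 ports.

```python
import random


def simulate_activation(services, pool_size, seed=None):
    """Toy model: hand out ports first-come, first-served.

    services  -- list of (name, ports_needed) tuples (hypothetical values)
    pool_size -- number of ports in the shared range
    Returns (started, failed): a dict of name -> ports granted, and a list
    of names that could not get the ports they asked for.
    """
    rng = random.Random(seed)
    order = list(services)
    rng.shuffle(order)  # startup order is not deterministic

    free = pool_size
    started, failed = {}, []
    for name, needed in order:
        if needed <= free:
            free -= needed
            started[name] = needed
        else:
            # In this model, this is where a service would fail to activate.
            failed.append(name)
    return started, failed


# Ten hypothetical services needing four ports each, against a 30-port pool.
services = [(f"Service{i}", 4) for i in range(10)]
for seed in (1, 2, 3):
    _, failed = simulate_activation(services, pool_size=30, seed=seed)
    print(seed, sorted(failed))
```

The total demand (40 ports) exceeds the pool, so some services always fail, but which ones depends only on the order in which they happened to start.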

To correct the error, increase the number of application ports by editing the standalone cluster configuration template (for example, ClusterConfig.X509.MultiMachine.json) so that each node type allocates a larger range:

"nodeTypes": [
    {
        "name": "NodeType0",
        ...
        "applicationPorts": {
            "startPort": "20001",
            "endPort": "20031"    <-- use a higher number to get a larger range
        }
    }
]
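Before changing the template, it can help to check how large the range currently is for each node type. The following is a rough sketch, not an official tool. The function name is hypothetical, and it assumes nodeTypes sits either at the top level (as in the excerpt above) or under a "properties" object, and that ports are stored as strings. It reports endPort minus startPort; whether the end port itself is usable is not stated here, so the real count may differ by one.

```python
import json


def application_port_spans(config):
    """Return {node type name: (startPort, endPort, endPort - startPort)}."""
    # Assumption: nodeTypes is under "properties" in a full template,
    # or at the top level as in the excerpt shown in this post.
    node_types = (config.get("properties", {}).get("nodeTypes")
                  or config.get("nodeTypes", []))
    spans = {}
    for node_type in node_types:
        ports = node_type.get("applicationPorts")
        if not ports:
            continue  # node type does not declare an application port range
        start = int(ports["startPort"])  # values are strings in the excerpt
        end = int(ports["endPort"])
        spans[node_type["name"]] = (start, end, end - start)
    return spans


# Inline sample mirroring the excerpt above; a real run would instead
# json.load() the cluster configuration file.
sample = json.loads("""
{
  "nodeTypes": [
    {
      "name": "NodeType0",
      "applicationPorts": {"startPort": "20001", "endPort": "20031"}
    }
  ]
}
""")
print(application_port_spans(sample))  # {'NodeType0': (20001, 20031, 30)}
```

If the total number of endpoints your services request on a node can exceed the reported span, that node type is a candidate for a larger endPort.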

References

/en-us/azure/service-fabric/service-fabric-cluster-creation-for-windows-server