This post is provided by Senior App Dev Manager, Vishal Saroopchand who asks the question, “How do you decide what Communications Stack to use in your Service Fabric applications?”
How do you decide which communication stack (Remoting, WCF, or a custom implementation) to use in your Service Fabric applications? Do you know how each communication stack performs? This post sheds some light on the performance characteristics I observed in a recent experiment.
The questions I was attempting to answer
Which communication stack should I choose for inter-service communication on a low-latency workload? What is the performance footprint of the out-of-the-box (OOtB) communication stacks? How does my custom implementation perform against the OOtB options?
To answer these questions, I decided to visualize the time it takes to move a message of variable size from a Web Proxy Gateway through a series of Stateful services and back to the Web API. Each Stateful service listens on four communication listeners: WCF, Remoting, a custom WebSocket, and a custom PubSub built on Service Bus Topics. I timestamp each visit and then plot the results in a box-plot chart.
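As a rough illustration of the measurement idea (not the experiment's actual code), here is a minimal Python sketch that stamps a message at each hop, derives per-hop durations, and computes the five values a box plot draws. The names `stamp`, `hop_durations_ms`, and `five_number_summary` are hypothetical, and the sketch assumes all stamps come from one process; stamps taken on different nodes would need synchronized clocks to be comparable.

```python
import time
from statistics import quantiles


def stamp(message, hop_name):
    """Append (hop_name, timestamp in ns) to the message's trail and return the message."""
    message.setdefault("trail", []).append((hop_name, time.perf_counter_ns()))
    return message


def hop_durations_ms(trail):
    """Return the elapsed milliseconds between each pair of consecutive stamps."""
    return [
        (later[1] - earlier[1]) / 1e6
        for earlier, later in zip(trail, trail[1:])
    ]


def five_number_summary(samples):
    """Return (min, Q1, median, Q3, max), the values a box plot is drawn from."""
    q1, median, q3 = quantiles(samples, n=4, method="inclusive")
    return min(samples), q1, median, q3, max(samples)
```

Collecting the total duration of many round trips per listener and passing each list to `five_number_summary` gives one box per communication stack.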
Here is a snapshot of one test. Keep in mind that call durations fluctuate from test to test, but the performance generally follows the same pattern.
PubSub is not shown in the above diagram because its total duration was roughly 1.8 seconds. Here is a zoomed-out view showing all four communication listeners.
The bottom line is this: if you want simplicity and can live with a latency of up to roughly 30 ms when sending messages between nodes, use one of the built-in communication stacks (Remoting or WCF). If you want better performance, consider building your own ICommunicationListener and handling your own data serialization.
Carefully plan your communication stack. Spend some time up front to understand the characteristics of each option, then try to improve on it by taking ownership of serialization, the communication stack, or both.
Consider building your own stack for low-latency communication and using one of the built-in stacks as a fallback. For custom communication stacks, remember that you must handle scenarios such as churn in your cluster, where services move from one node to another. Your implementation should not assume that an endpoint will remain stationary. To test the soundness of your custom communication stack, use Chaos to simulate churn and see how your implementation performs.
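To make "taking ownership of serialization" concrete, here is a minimal Python sketch of a hand-rolled binary frame. The layout (message id, timestamp, payload length) and the names `HEADER`, `encode`, and `decode` are assumptions made for illustration and are not taken from the experiment. Whether a format like this actually beats the built-in serializers is something to measure for your own workload.

```python
import struct

# Hypothetical fixed header: 8-byte message id, 8-byte timestamp (ns),
# 4-byte payload length, all little-endian with no padding.
HEADER = struct.Struct("<QQI")


def encode(message_id, timestamp_ns, payload):
    """Pack the header fields and append the raw payload bytes."""
    return HEADER.pack(message_id, timestamp_ns, len(payload)) + payload


def decode(frame):
    """Unpack the header, then slice out exactly the declared payload length."""
    message_id, timestamp_ns, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    return message_id, timestamp_ns, payload
```

The fixed-size header means the receiver knows how many bytes to read before it parses anything else, which keeps framing simple over a stream transport such as a WebSocket.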
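The churn requirement can be sketched as a retry loop that resolves the endpoint again after a failed send instead of reusing a cached address. This is a minimal Python illustration under stated assumptions: `resolve`, `send`, and `EndpointMovedError` are hypothetical stand-ins for whatever resolution and transport layer your custom stack uses, not Service Fabric APIs.

```python
class EndpointMovedError(Exception):
    """Raised by the (hypothetical) transport when the service is no longer at the address."""


def send_with_reresolve(resolve, send, message, max_attempts=3):
    """Send a message, asking for a fresh endpoint after every failed attempt."""
    last_error = None
    for attempt in range(max_attempts):
        # The first attempt may use a cached address; later attempts force a refresh.
        endpoint = resolve(force_refresh=attempt > 0)
        try:
            return send(endpoint, message)
        except EndpointMovedError as error:
            last_error = error
    raise last_error
```

A production version would likely add a backoff between attempts and distinguish transient errors from permanent ones. Running a loop like this while Chaos moves services around is one way to see whether the re-resolution path is ever exercised.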
Feel free to clone my experiment code here and include your own Communication stacks.