I have a project that requires building an online system with a large data flow that serves two groups of users. The first group is data publishers: retail customers who call our online services from a web browser, or partners who call our services through a Web API service or a client application. The second group receives and consumes the data sent by the first group. The system can be designed with the "Publish-Subscribe" software pattern (http://en.wikipedia.org/wiki/Publish/subscribe). In the Azure cloud, using Service Bus as the messaging framework is a good fit for this business scenario. But we need to address a list of technical questions before we can come up with a high level design.
1. How do we enable anonymous users to send data to our system?
There are basically two ways to access Azure Service Bus: the RESTful API or SBMP (Service Bus Messaging Protocol). Both require a secured connection (a connection string, or a certificate used to create a token). Letting users access our Service Bus directly is not a realistic solution.
2. How do we distribute the data interface across geographical locations to serve users anywhere in the US?
Azure Service Bus instances are configured by data center location, the same as Azure WebRole instances. It is important to place the two components together in the same location, in order to follow the principle of lowest network latency and highest service resilience.
Here is a high level architecture design.
Since most users are on the internet, a web interface is designed to accept the data they publish. We implement a Web API service that exposes a data publishing "endpoint". It is reached over HTTP with SSL certificate authentication, which gives users secure access to the server. However, requiring every user to be authorized before he or she can send data is somewhat challenging here.
Considering that users could be anywhere in the US, the Web API service is deployed to two data centers (US west and US east). To meet the geo-distributed design requirement, we have prepared two Service Bus instances, one in each data center (USW and USE).
All the WebRole instances have connections (SBMP) to both Service Bus instances, but their primary data transfer target is the Service Bus instance in the local data center. This means data received by a WebRole instance in the US west data center is sent to the Service Bus instance in the same US west data center. However, if the "send" action fails because the local Service Bus is unavailable, the same data is resent to the Service Bus in the remote data center (the US east data center in this example). This minimizes the risk of data loss during the transfer process.
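The local-first, remote-fallback send described above can be sketched as follows. This is only a minimal illustration in Python, not the actual WebRole code: `FakeSender`, `SendError`, and `send_with_failover` are hypothetical stand-ins, and no real Service Bus client is called.

```python
class SendError(Exception):
    """Raised by the stand-in sender when its Service Bus instance is unavailable."""


class FakeSender:
    """Hypothetical stand-in for a client bound to one Service Bus instance."""

    def __init__(self, region, available=True):
        self.region = region        # e.g. "USW" or "USE"
        self.available = available  # simulates an outage when False
        self.sent = []              # messages accepted by this instance

    def send(self, message):
        if not self.available:
            raise SendError(f"Service Bus in {self.region} is unavailable")
        self.sent.append(message)


def send_with_failover(message, local, remote):
    """Send to the local instance first; on failure, resend to the remote one.

    Returns the region that accepted the message. If the remote send also
    fails, its SendError propagates to the caller.
    """
    try:
        local.send(message)
        return local.region
    except SendError:
        remote.send(message)
        return remote.region


if __name__ == "__main__":
    # A WebRole in US west whose local Service Bus is down.
    usw = FakeSender("USW", available=False)
    use = FakeSender("USE")
    print(send_with_failover("order-1", local=usw, remote=use))
```

In this sketch the message is lost only if both instances reject it, which is the case the caller would still have to handle (for example by logging or retrying).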
All the published data is organized by topic name. We manage a list of TopicClients, each of which "maps" to a specific topic name on a specific Service Bus instance. On the receiving end, we can therefore use SubscriptionClient to gather data from both Service Bus instances and filter it by topic name.
I will continue discussing this topic in the next post. Please stay tuned.
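To make the mapping concrete, here is a small in-memory sketch in Python. It assumes one sender per (Service Bus instance, topic name) pair and a consumer that merges one topic from both instances. `InMemoryBus`, `TopicSender`, `build_senders`, and `gather` are hypothetical names for illustration; they are not the TopicClient or SubscriptionClient APIs.

```python
from collections import defaultdict


class InMemoryBus:
    """Hypothetical stand-in for one Service Bus instance in one data center."""

    def __init__(self, region):
        self.region = region
        self.topics = defaultdict(list)  # topic name -> published messages


class TopicSender:
    """Plays the role of a TopicClient: bound to one topic on one instance."""

    def __init__(self, bus, topic):
        self.bus = bus
        self.topic = topic

    def send(self, message):
        self.bus.topics[self.topic].append(message)


def build_senders(buses, topic_names):
    """Create one sender per (region, topic) pair, keyed for direct lookup."""
    return {
        (bus.region, topic): TopicSender(bus, topic)
        for bus in buses
        for topic in topic_names
    }


def gather(buses, topic):
    """Collect the messages for one topic from every instance, in bus order."""
    messages = []
    for bus in buses:
        messages.extend(bus.topics.get(topic, []))
    return messages


if __name__ == "__main__":
    buses = [InMemoryBus("USW"), InMemoryBus("USE")]
    senders = build_senders(buses, ["orders", "prices"])
    senders[("USW", "orders")].send("order from the west")
    senders[("USE", "orders")].send("order from the east")
    print(gather(buses, "orders"))
```

Keying the senders by region and topic lets a WebRole pick its local sender directly while still being able to reach the remote one for the same topic.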