Best practices for building Windows Azure services

Hello, guys

I'd like to share some practices for building Azure-based services that we use for our own services:

· Have an ETL process to move diagnostic logs from Azure table storage into SQL Server, or another store, for post-processing.

· Design a set of commonly used queries that the Ops team can run to find root causes.

· Have detailed, verbose logging, such as assigning a unique error number to each kind of issue, and make sure error messages contain enough information for debugging.

· Have a release process. You need a release branch, a stabilization (stab) branch and a dev branch, and a formal way to handle integration between the branches, to avoid issues related to code changes.

· Automate as much as possible, such as the package build step and the deployment step, and persist your production config in source depot, so that you can avoid human mistakes.

· Have a detailed service upgrade and rollback plan, and make sure you have tested the plan.

· Have a security expert review your service, and implement security-related features.

· Don't block your client. You can use an async pattern that lets the user submit a request and get back a request ID, which the user can then use to check the request status later. Note that end users can always implement status monitoring on top of your API.

· Write blogs and set up a forum to communicate with your customers. Watch the issues that customers report, and find the root cause.

· Collect statistical data to help you make decisions about future functionality. For example, we collect each database's size, version and request type, so that we can better understand our customers.
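
The "don't block your client" practice above can be sketched roughly as follows. This is only a minimal in-memory illustration in Python, not the way our service implements it: the class name RequestTracker, its methods and the state names are all hypothetical, and a real service would keep request state in durable storage rather than in a dictionary.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class RequestTracker:
    """Hypothetical helper: accepts work without blocking the caller
    and reports status by request ID."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}  # request ID -> Future (in-memory only, an assumption)

    def submit(self, func, *args):
        """Queue the work and return a request ID immediately."""
        request_id = str(uuid.uuid4())
        self._futures[request_id] = self._pool.submit(func, *args)
        return request_id

    def status(self, request_id):
        """Return the current state of a request; never waits for the work."""
        future = self._futures.get(request_id)
        if future is None:
            return {"state": "unknown"}
        if not future.done():
            return {"state": "pending"}
        error = future.exception()
        if error is not None:
            return {"state": "failed", "error": str(error)}
        return {"state": "succeeded", "result": future.result()}

    def shutdown(self):
        """Wait for outstanding work to finish and release the worker threads."""
        self._pool.shutdown(wait=True)
```

A client would call submit(), keep the returned ID, and poll status() whenever it wants. This is also the hook on which end users can build their own status monitoring.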

In terms of testing, here are some practices:

· Test your service in production. In many cases, it is easy to deploy your service to Windows Azure without exposing it to end users. This lets you do end-to-end testing as early as possible. Also, you no longer need to spend time simulating your production environment, and you can find issues that only happen in production.

· Deploy your service to PPE (Pre-Production Environment), and allow internal customers to use and test it.

· Design a ping protocol for your service (for a web site, for example, it is an HTTP URL), and use an external provider, such as Gomez, to monitor your service.

· Implement external application monitoring. Unlike the ping protocol, which only checks whether your service is up and not whether it functions well, application monitoring runs common user scenario tests in production as a real user and monitors the availability of your service's functionality.

· You need to test not only the functionality of your service but also its other parts, such as logging, scalability, error handling, fault tolerance toward dependent services, etc.

· Implement continuous integration and continuous deployment. Continuous integration is a software development practice where engineers integrate frequently, leading to multiple integrations per day. Each integration is verified by an automated build and test to detect integration errors as quickly as possible. For your service, you can implement a continuous deployment process that builds your service, deploys it to a test environment, and runs sign-off tests against the test service. The process should happen automatically. Enabling this lets you release your service quickly, such as RTO-1.

· Experiment in production. You can use your real users to test your service, for example a new idea or a new implementation. You need to implement risk and exposure control. For example, you can deploy your beta service to your staging environment, with both production and staging drawing requests from the same source. You can configure the staging deployment to take 1% of the requests (for example, by adjusting its polling frequency), so that it doesn't have a big impact on many users. Note that this uses the same approach as a ramp-up deployment, which rolls out your service incrementally.

· Use a well-known test framework, such as MSTest. The reason is that you can take advantage of other people's work built around MSTest, such as running from the cloud, load testing, etc.
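
To make the exposure control in "experiment in production" more concrete, here is a small Python sketch of sending a fixed percentage of requests to staging. Note that it swaps in a different technique from the one mentioned above: instead of adjusting polling frequency, it hashes a request ID into a bucket. The function names and the idea of routing by request ID are assumptions for illustration only.

```python
import hashlib

BUCKETS = 10000  # resolution of 0.01 percent (an arbitrary choice for this sketch)


def bucket_for(request_id: str) -> int:
    """Map a request ID to a stable bucket in [0, BUCKETS)."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def route(request_id: str, staging_percent: float) -> str:
    """Return "staging" for roughly staging_percent of IDs, else "production"."""
    threshold = round(staging_percent * BUCKETS / 100)
    return "staging" if bucket_for(request_id) < threshold else "production"
```

Because the bucket depends only on the ID, a request that goes to staging at 1% still goes to staging when you raise the percentage to 5% or 50%, which is what you want for a ramp-up deployment. Setting the percentage to 0 sends everything back to production, which gives you a quick rollback switch.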