Final Update: Thursday, September 14th 2017 19:46 UTC
VSTS customers in South Central experienced three distinct periods of degraded service performance (see the top graph). At peak impact, ~1,500 interactive users would have seen intermittently slow commands. The underlying issue was related to CPU spikes on one of the five web nodes (see the bottom graph).
All three impact periods were mitigated by collecting diagnostics for root cause analysis and then recycling the IIS worker process.
We suspect the issue stems from problems with the underlying infrastructure and are focusing on the VM host. We’ve engaged our partners in Azure, who migrated the web node off the suspected VM host. The Azure team is actively investigating this host. We’ll continue to monitor the health of the affected web node to ensure that the migration to a new host prevents further recurrences.
Initial Update: Thursday, September 14th 2017 17:35 UTC
A potentially customer-impacting alert is being investigated. Triage is in progress and we will provide an update with more information. Initial investigation shows that this is a recurrence of the issue we saw earlier this morning.
- Next Update: Before Thursday, September 14th 2017 18:45 UTC