Experiencing Alerting failure for Many Data Types – 9/2 – Resolved


Final Update: Wednesday, 9/2/2015 19:30 UTC

We’ve confirmed that all systems are back to normal with no customer impact as of 9/2, 19:10 UTC. Our logs show the incident started on 9/1, 23:10 UTC, and that during the 20 hours it took to resolve the issue, customers were unable to create or change alert rules for metrics.

 
Root Cause: The failure was due to a deployment to our configuration service. We have rolled back the changes in order to mitigate the issue.
Lessons Learned: We are working on alerting for this scenario so that we are notified of these types of errors before such changes reach our production environment.
Incident Timeline: 20 Hours - 9/1, 23:10 UTC through 9/2, 19:10 UTC

We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Application Insights Service Delivery Team


Initial Update: Wednesday, 9/2/2015 18:46 UTC

We are aware of issues within Application Insights and are actively investigating. Some customers may be unable to save alert rules and will see an error in the UI when this action fails.

Next Update: Before 19:46 UTC

We are working hard to resolve this issue and apologize for any inconvenience.

-Application Insights Service Delivery Team

 
 
 
