Experiencing Alerting Failure for Alert Rule Updates – 08/08 – Resolved


Final Update: Monday, 08 August 2016 18:12 UTC

We've confirmed that all systems are back to normal with no customer impact as of 8/8, 17:50 UTC. Our logs show the incident started on 8/8, 16:32 UTC and that during the 1 hour, 18 minutes it took to resolve the issue, approximately 9% of customers experienced errors while trying to create, update, or delete alert rules.
  • Root Cause: Still under investigation, but an initial review points to a code bug that prevented a service from recovering gracefully after a failure in a dependent service.
  • Lessons Learned: We are investigating options for automated mitigation that would reduce customer impact.
  • Incident Timeline: 1 hour, 18 minutes - 8/8, 16:32 UTC through 8/8, 17:50 UTC

We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Matt


Initial Update: Monday, 08 August 2016 17:03 UTC

We are aware of issues within Application Insights and are actively investigating. Some customers may experience failures when updating alert rules.
  • Work Around: None
  • Next Update: Before 08/08 19:30 UTC

We are working hard to resolve this issue and apologize for any inconvenience.
-Matt

