Experiencing Alerting failure for Availability Data Type – 02/07 – Resolved


Final Update: Thursday, 08 February 2018 01:11 UTC

We've confirmed that all systems are back to normal with no customer impact as of 02/08, 00:00 UTC. Our logs show the incident started on 02/07, 21:30 UTC, and during the 2.5 hours it took to resolve the issue, customers would have experienced alerting issues, i.e. notifications were not received for alerts configured on availability as well as metrics.

  • Root Cause: The failure was caused by communication failures between the two services responsible for alert rules and alert input.
  • Lessons Learned: We understand the root cause completely, and work has been planned to avoid recurrence of this issue in the future.
  • Incident Timeline: 2 hours and 30 minutes - 02/07, 21:30 UTC through 02/08, 00:00 UTC

We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.


Update: Wednesday, 07 February 2018 23:57 UTC

We continue to investigate issues within Application Insights. Root cause is not fully understood at this time. Customers continue to experience alerting-related issues, i.e. no email notifications will be received for configured alerts. We are working to establish the start time for the issue; initial findings indicate that the problem began at approximately 21:30 UTC on 02/07. We currently have no estimate for resolution.
  • Next Update: Before 02/08 03:00 UTC



Initial Update: Wednesday, 07 February 2018 22:22 UTC

We are aware of issues within Application Insights and are actively investigating. Customers may not receive alerting emails based on availability tests. We will provide more information as we learn more.
  • Workaround: Customers may use the Azure portal to view failures and successes in availability charts.
  • Next Update: Before 02/08 00:30 UTC

We are working hard to resolve this issue and apologize for any inconvenience.
-Arvind

