Application Insights team has identified an issue in alerting service that may have impacted up to one third of our total customers. This issue was caused by a configuration change & has been fixed by hotfix deployment. At this moment , all of our services are healthy & running as expected.However during the impacted window (2/8/2016 18:00 – 2/8/2016 23:30 UTC) , customers may not have received alerts for configured metrics.
- Root Cause: The failure was due to configuration change.
- Lessons Learned: Our team is investigating additional improvements to internal telemetry and change management process to avoid re-occurrence of similar issue.
- Incident Timeline: 5 Hours & 30 minutes – 2/8, 18:00 UTC through 2/8, 23:30 UTC
We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.
-Application Insights Service Delivery Team