Experiencing Data Latency for Many Data Types – 04/25 – Resolved


Final Update: Tuesday, 26 April 2016 02:50 UTC

We’ve confirmed that all systems are back to normal with no customer impact as of 4/26, 02:30 UTC. Our logs show that the incident started on 4/25, 10:00 UTC, and that during the 16 hours and 30 minutes it took to resolve the issue, customers experienced multiple windows of data latency.
  • Root Cause: The failure was due to multiple slowdowns in a dependent service. We are performing further root cause analysis in order to make our services more resilient in the future.
  • Lessons Learned: We continue to research ways to make our system more resilient so that this happens less often in the future.
  • Incident Timeline: 16 hours and 30 minutes – 4/25, 10:00 UTC through 4/26, 02:30 UTC

We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Steve


Update: Monday, 25 April 2016 21:47 UTC

We ran into further slowdowns, which have extended the time it will take to process the backlog. We are looking into ways to resolve these slowdowns. Customers will continue to see data gaps that are outside of the SLA. Current data is within the 2 hour SLA.
  • Work Around: None
  • Next Update: Before 04/26 04:00 UTC

-Steve


Update: Monday, 25 April 2016 15:47 UTC

During our mitigation process, we came across a recurrence of the issue, which slowed down the overall mitigation process. We have taken steps to mitigate the issue. Current data processing is working as expected. Some customers may experience data gaps outside of the 2 hour SLA, and we estimate another 6 hours before the entire backlog is processed.
  • Work Around: None
  • Next Update: Before 04/25 22:00 UTC

-Varun


Update: Monday, 25 April 2016 12:14 UTC

The root cause has been isolated to slow processing in a back-end service, which was impacting multiple data types. We have taken steps to mitigate the issue. Current data processing is working as expected. Some customers may experience data gaps outside of the 2 hour SLA for data ingested between 10:00 AM and 11:00 AM UTC, and we estimate 4 hours before the entire backlog is processed.
  • Work Around: None
  • Next Update: Before 04/25 16:30 UTC

-Varun

