Experiencing Data Latency for Many Data Types – 04/18 – Resolved


Final Update: Tuesday, 19 April 2016 01:46 UTC

We’ve confirmed that all systems are back to normal with no customer impact as of 04/19, 00:40 UTC. Our logs show the incident started on 04/18, 14:07 UTC and that, during the 10 hours it took to resolve the issue, customers experienced multiple data gaps outside of the 2-hour SLA.
  • Root Cause: The failure was due to a bug in our backend service that drastically reduced processing when triggered.
  • Lessons Learned: We have scaled out our backend and taken other mitigation steps to increase processing capacity. We are also working on a permanent fix.
  • Incident Timeline: 10 hours – 04/18, 14:07 UTC through 04/19, 00:40 UTC

We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Randy


Update: Monday, 18 April 2016 20:23 UTC

The root cause has been isolated to increased pressure on our backend service, which was impacting multiple data types. To address this issue, we scaled out and rebooted the affected service. Multiple data types are now working as expected. Some customers may experience data gaps outside of the 2-hour SLA. Processing the backlog of data is taking longer than originally anticipated, and we are looking into ways to speed it up.
  • Next Update: Before 04/19 08:30 UTC

-Randy


Update: Monday, 18 April 2016 16:43 UTC

The root cause has been isolated to slow processing in a backend service, which was impacting multiple data types. To address this issue, we rebooted the affected service. Multiple data types are now working as expected. Some customers may experience data gaps outside of the 2-hour SLA, and we estimate 4 hours before the entire backlog is processed and all data is within SLA.
  • Next Update: Before 04/18 21:00 UTC

-Randy

