Experiencing Data Latency for Many Data Types – 7/3 – Resolved


Final Update: Sunday, 7/5/2015 04:53 UTC

We have confirmed that all systems are back to normal with no customer impact as of 7/5 04:50 UTC. Our logs show the incident started on 7/3 12:00 UTC and that during the roughly 41 hours it took to resolve the issue, a subset of our customers experienced data latency and missing data across all data streams. We’ve completed reprocessing of the missing data, and customers should now be able to view all their data without any gaps.

Root Cause: Root cause has been isolated to complications during upgrade activities which resulted in downtime for our backend storage service.
Lessons Learned: Based on our analysis of the incident and the improvements being made to our upgrade systems and processes, we expect the likelihood of similar incidents recurring to be minimal.
Incident Timeline: 40 hours & 53 minutes – 7/3 12:00 UTC through 7/5 04:53 UTC

We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Application Insights Service Delivery Team


Update: Saturday, 7/4/2015 16:06 UTC

We are reprocessing the missing data. Some customers may still experience a data gap between 7/2 22:30 UTC and 7/3 14:30 UTC, and we estimate 48 hours before all missing data is reprocessed and available in the portal.

Work Around: Utilize current data.
Next Update: Before 07/05 16:00 UTC

-Application Insights Service Delivery Team


Update: Friday, 7/3/2015 14:49 UTC

Root cause has been isolated to complications during upgrade activities, which were causing data latency for multiple data streams. The Application Insights data storage service is now working as expected, and customers will be able to see all their current data starting 7/3 14:30 UTC. Some customers may experience a data gap between 7/2 22:30 UTC and 7/3 14:30 UTC, and we estimate 72 hours before all missing data is reprocessed and available in the portal.

Work Around: Utilize current data.
Next Update: Before 07/04 15:00 UTC

-Application Insights Service Delivery Team


Update: Friday, 7/3/2015 07:15 UTC

Our DevOps team continues to investigate issues within Application Insights. Root cause is not fully understood at this time. Some customers continue to experience data latency for many data types. We are working to establish the start time for the issue; initial findings indicate that the problem began at approximately 7/2 22:30 UTC. We currently have no estimate for resolution.

Work Around: none
Next Update: Before 07/03 15:00 UTC

-Application Insights Service Delivery Team


Initial Update: Friday, 7/3/2015 06:44 UTC

We are aware of issues within Application Insights and are actively investigating. Some customers may experience data latency. The following data types are affected: Customer Event, Dependency, Exception, Metric, Page Load, Page View, Performance Counter, Request.

Work Around: none
Next Update: Before 07/03 09:00 UTC

We are working hard to resolve this issue and apologize for any inconvenience.

-Application Insights Service Delivery Team

 
 
 
 
