Experiencing Data Latency & Data Access Issues for Many Data Types - 9/25 - Resolved


Final Update: Monday, 9/28/2015 17:47 UTC

We’ve confirmed that all systems are back to normal with no customer impact as of 9/28, 08:00 UTC. Both the historical and new data pipelines are healthy.

Root Cause: Under investigation

We understand that customers rely on Application Insights as a critical service and apologize for any impact this incident caused.

-Application Insights Service Delivery Team


Update: Sunday, 9/27/2015 17:14 UTC

We are making progress in processing historical data; all data types have caught up except Event and Performance Counter. We are closely monitoring system health and will provide an update as soon as recovery is complete.

Next Update: Before 09/28 18:00 UTC

-Application Insights Service Delivery Team


Update: Saturday, 9/26/2015 17:27 UTC

We are making progress in processing historical data, but it is taking longer than expected. We are closely monitoring system health and will provide an update as soon as recovery is complete.

Next Update: Before 09/27 18:00 UTC

-Application Insights Service Delivery Team


Update: Saturday, 9/26/2015 00:54 UTC

The root cause has been isolated to a platform issue that was impacting our processing and alerting capabilities. To address it, we performed a DR failover, which mitigated the issue. We are seeing no latency in processing current data, but there is some latency in processing historical data, which may take up to 12 hours to fully catch up.

All alerting capabilities have fully recovered, and customers should see no delay in notifications.

Our team is monitoring the processing of the historical data backlog and will provide an update as soon as recovery is complete.

Next Update: Before 09/26 18:00 UTC

-Application Insights Service Delivery Team


Update: Friday, 9/25/2015 23:17 UTC

A large portion of our customers continue to experience impact due to data latency and potential delays in metric alerts. Additionally, some customers experienced intermittent errors when loading reports in the portal; however, these have now cleared.

The root cause for these failures is still not fully understood and we continue to work with our Azure partners to investigate potential platform issues.

In an attempt to restore service, we are performing a DR failover to a new service deployment for our affected components. This will take approximately 2 hours.

We want our customers to know that we are treating this with the highest priority, and we apologize for the inconvenience.

Work Around: none at this time
Next Update: Before 9/26 02:00 UTC

-Application Insights Service Delivery Team


Update: Friday, 9/25/2015 17:58 UTC

We continue to investigate the issue in Application Insights services. We have engaged our Azure partner team to speed up the investigation, as some of our services rely on Azure infrastructure and we are experiencing slowness in data processing at the infrastructure level. This issue has also impacted the Export Service, so customers who have onboarded the export feature will see latency in exported data along with all other data types.

We have also recently found another issue in the data query nodes, due to which some customers may see data access issues. We don't have an ETA for fixing these issues, but we will provide frequent updates in this blog post as we progress.

Work Around: None
Next Update: Before 22:00 UTC

-Application Insights Service Delivery Team


Update: Friday, 9/25/2015 15:43 UTC

We are actively investigating an issue in our processing pipeline that is impacting a significant portion of our customers.  The issue is causing data latency for most types of telemetry and customers will notice delays and gaps in recent data being viewed in our dashboards.  Another impact is that metric alerts configured by our customers may be delayed due to this issue. 

We've isolated the source of the failure but still don't fully understand the root cause or mitigation options.  Our DevOps team continues to investigate this with the highest priority. 

Work Around: None at this time
Next Update: Friday, 9/25/2015 17:00 UTC (or sooner if we have new status to share)

Thanks,
-Application Insights Service Delivery Team


Initial Update: Friday, 9/25/2015 14:51 UTC

We are aware of issues within Application Insights and are actively investigating. Some customers may experience Data Latency. Multiple data types are affected.

Work Around: none
Next Update: Before 17:00 UTC

We are working hard to resolve this issue and apologize for any inconvenience.

-Application Insights Service Delivery Team