Long-running operations (e.g. create lab, start VM, delete custom image) were delayed by up to 2 hours. The impact appears to have been limited to the following regions: Australia East, Brazil South, Canada East, East US, East US 2, Japan West, Korea South, North Central US, South India, Southeast Asia, UK South, West Europe, West US 2.
The incident appears to have started at around 14:00 UTC. The processing delay was resolved at 21:25 UTC.
The delay was caused by a spike in long-running requests that took over 6 hours to process. While those requests were being handled, newly submitted operations were not processed immediately, which resulted in the delays customers noticed.
The team will deploy more workers to handle the load and will tune settings to improve the throughput of each worker.
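As a rough illustration of the queuing effect described above, the following Python sketch models a fixed pool of workers that picks up operations in arrival order. It is a simplified model rather than the service's actual scheduler: the function name simulate_waits, the job durations, and the worker counts are all made up for illustration.

```python
import heapq

def simulate_waits(jobs, num_workers):
    """Return how long each job waits before a worker picks it up.

    jobs: list of (arrival_minute, duration_minutes), sorted by arrival.
    Jobs are served first-come, first-served by a fixed pool of workers.
    This is a hypothetical model, not the service's real scheduler.
    """
    # Min-heap holding the minute at which each worker next becomes free.
    free_at = [0] * num_workers
    heapq.heapify(free_at)
    waits = []
    for arrival, duration in jobs:
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # wait if no worker is free yet
        waits.append(start - arrival)
        heapq.heappush(free_at, start + duration)
    return waits

# Illustrative numbers only: two 60-minute requests arrive together,
# followed by two 5-minute operations at minutes 10 and 20.
jobs = [(0, 60), (0, 60), (10, 5), (20, 5)]
print(simulate_waits(jobs, num_workers=2))
print(simulate_waits(jobs, num_workers=3))
```

In this toy model, the two short operations wait 50 and 40 minutes when both workers are busy with long requests, and do not wait at all once a third worker is available, which is the intuition behind deploying more workers.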