Azure Container Registry Global Webhooks, with a helping of Azure Durable Functions

I spend quite a lot of my time working with containerised solutions in Azure, so I make use of Azure Container Registry (ACR). ACR has a couple of features that I really like: webhooks and geo-replication. With ACR, webhooks are not global (i.e. they are fired for each replicated region). Sometimes this is what you want, but other times you might want to have a webhook that is invoked when the push operation has completed across all replicated regions. In this post I will show how you can build that relatively easily with another technology that I like: Azure Durable Functions.

Background

Before digging into the solution, let's take a quick look at the different pieces involved and how they relate to this scenario.

ACR webhooks

With webhooks in ACR you can register for HTTP notifications when images are pushed (or deleted).

An example usage of webhooks is to use Brigade to respond to new images pushed to your registry to kick off a deployment of the new version (or something like that).

Geo-replication

If you're deploying your services in Azure then a definite benefit of ACR is that your images are network-close to where you are deploying them. If you are deploying services across multiple regions then having your images in each of those regions preserves this network-locality benefit. Whilst you could deploy separate ACR instances in each region and update your build to push to each region, ACR lets you have one logical registry instance that automatically replicates images across the regions that you choose. This is pretty cool - you get a single registry endpoint that you use to push/pull and the images are automatically replicated across your chosen regions. When you push/pull you are directed to the nearest replicated instance for performance.

However, as mentioned earlier the webhooks are fired per-region. For some scenarios this is desirable but for others you might want to get a notification once the image that you've pushed has been replicated across all the regions you have configured. To achieve this with minimal effort (and cost) we'll use Azure Durable Functions...

Azure Durable Functions

Azure Durable Functions build on top of Azure Functions and add an orchestration/coordination layer. There is some great documentation that shows some of the common patterns that you can use Durable Functions for. The pattern that we will use here is the External Event pattern (also referred to as Human Interaction in some of the documentation scenarios).

Durable Functions just hit General Availability for C# on Functions v1, so for this post I will show C#. There is preview support for JavaScript Durable Functions on Functions v2 with support for more languages coming.

Creating the solution

The first part of the solution will be an HTTP-triggered function. This will be the function that subscribes to the ACR webhook.

We will also have an orchestration function. This will be responsible for determining whether we have had notifications from all of the regions. You can have multiple instances of an orchestration, each having its own set of state. We will have an instance of the orchestration per set of regional notifications. The id property of the webhook body is the id for the logical operation for the registry, i.e. it is the same across regional webhook invocations for the same image push/delete, so we will use this as the instance id for our orchestration.

Each invocation of the HTTP-triggered function will get a reference to the orchestration instance for the notification, creating the orchestration instance if required. (Typically an orchestration instance is created without specifying the ID, but in this case we want to use the ID from the webhook notification to allow us to route notifications to the correct instance.) Once the trigger function has the reference to the orchestration instance it will raise an external event for the region that the notification corresponds to. In this way the trigger function translates the webhook notification for each region into a Durable Function external event for each region.

Inside the orchestration function we can call WaitForExternalEvent to wait for the event for each region. This is an async method, so it returns a Task. Rather than awaiting these Tasks individually, we capture them all and pass them to Task.WhenAll, which gives us back a new Task that represents all of the external event Tasks completing. By awaiting this new Task we handle the join or fan-in over the notifications.

Show me the code

All of this is handled in the sample code for the blog post.

I named the HTTP-triggered function "ImagePush":

[FunctionName("ImagePush")]
public static async Task<HttpResponseMessage> ImagePush(
    [HttpTrigger(AuthorizationLevel.Function, "get", "post")]
    HttpRequestMessage request,
    [OrchestrationClient]
    DurableOrchestrationClient starter,
    TraceWriter log)

Notice the DurableOrchestrationClient that is passed in. This is a special type for use in Durable Functions and lets us create and query orchestration instances, as well as raise external events against them (as we'll see later).

The HttpRequestMessage parameter gives us access to the HTTP Request that triggered the function (in our case the webhook notification). When I registered the ACR webhooks I simply added a querystring parameter to the function URL to identify which region the webhook is firing for. The following code retrieves this:

var query = request.RequestUri.ParseQueryString();
var region = query["region"];

Similarly, the request is used to read the body of the HTTP Request to retrieve the notification id. From there, we look up the orchestration instance and create it if required:

var status = await starter.GetStatusAsync(instanceId);
if (status == null)
{
    await starter.StartNewAsync("WaitForAllRegions", instanceId, notification);
}

And then finally we raise the event for the region that the webhook fired for:

await starter.RaiseEventAsync(instanceId, GetEventName(region), null);

Now that we've seen the core parts of the trigger function, let's turn to the orchestrator function. It has a pretty simple declaration:

[FunctionName("WaitForAllRegions")]
public static async Task WaitForAllRegions([OrchestrationTrigger] DurableOrchestrationContext context)

This time we have a Durable Functions orchestration context that allows us to call into activity functions or wait for external events. In our case we want to wait for an external event for each region that we're replicated to. The orchestration context has a WaitForExternalEvent method that allows us to wait for an external event by awaiting the Task it returns. In our case, we won't await the Task - instead we will build up a list of Tasks (one per region). We can then take this List and pass it to Task.WhenAll. This gives us a single Task that represents all of the Tasks in the List completing.

var eventTasks = new List<Task>();
foreach (var region in GetRegions())
{
    eventTasks.Add(context.WaitForExternalEvent<object>(GetEventName(region)));
}
Task replicationCompletedTask = Task.WhenAll(eventTasks);
await replicationCompletedTask;

By awaiting the replicationCompletedTask we won't continue with any further code until that Task has completed. Since that was created using Task.WhenAll for the Tasks for the external events, that corresponds to all the external events having been raised.

The last step is to take some action at this point, so we call into an activity function:

await context.CallActivityAsync<string>("FirePushNotification", notification);

The activity function is just a placeholder function in the sample code, but you could use this function to make an HTTP call to notify another system, post to Teams or Slack, and so on.

[FunctionName("FirePushNotification")]
public static string FirePushNotification([ActivityTrigger] WebHookNotification notification, TraceWriter log)
{
    log.Info($"TODO: All regions are sync'd - add whatever onward notification you want here! Repository: {notification.Target.Repository}, Tag: {notification.Target.Tag}, Id: {notification.Request.Id}");
    return "All regions are sync'd!";
}

What's with all the Tasks?

I really like how the Durable Functions runtime exposes Tasks as the interface to the units of work (calling activity functions, waiting for external events) as it means that we can bring in experience of working with Tasks from other contexts. We can use await and Task.WhenAll that are familiar from other async programming in C#.
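To make the comparison concrete, here is a minimal sketch of the same fan-in in plain .NET, outside of Durable Functions. It is an illustration only and not part of the sample code: a TaskCompletionSource stands in for each WaitForExternalEvent call, and the region names and the FanInDemo class are made up for the example.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class FanInDemo
{
    // Simulates three regional notifications and reports whether the joined
    // Task had completed before and after the last one arrived.
    public static async Task<(bool beforeLast, bool afterLast)> RunAsync()
    {
        // Hypothetical region names, for illustration only.
        var regions = new[] { "region-a", "region-b", "region-c" };

        // One pending "event" per region. In the orchestrator this role is
        // played by the Task returned from WaitForExternalEvent.
        var events = regions.ToDictionary(r => r, r => new TaskCompletionSource<bool>());

        // Capture the Tasks without awaiting them, then join them.
        Task allRegions = Task.WhenAll(events.Values.Select(e => e.Task));

        // Notifications can arrive in any order.
        events["region-b"].SetResult(true);
        events["region-a"].SetResult(true);
        bool beforeLast = allRegions.IsCompleted; // region-c is still outstanding

        events["region-c"].SetResult(true);
        await allRegions;
        bool afterLast = allRegions.IsCompleted;

        return (beforeLast, afterLast);
    }

    public static async Task Main()
    {
        var (beforeLast, afterLast) = await RunAsync();
        Console.WriteLine($"Completed before last region: {beforeLast}"); // False
        Console.WriteLine($"Completed after last region: {afterLast}");   // True
    }
}
```

The shape is the same as in the orchestrator: collect the Tasks, hand them to Task.WhenAll, and await the result. The difference is in who completes the Tasks: here the code does it directly, whereas in the orchestrator the Durable Functions runtime does it when the matching external event is raised.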

SIDE NOTE:

It is important to note that while these give the illusion that the orchestration function will keep running and wait for the potentially long-running Tasks to complete, that's not actually the case. When the orchestrator reaches an idle point where it is blocked waiting on Tasks, the underlying runtime suspends it, which is particularly cool for the consumption billing plan of Azure Functions - if it's not running then you're not paying! With this coolness, you do need to be aware that the orchestrator will be replayed when a new notification has come in and it needs to resume. The documentation covers how replay works and the constraints this puts on orchestration functions.

Adding another layer

In my experience, ACR geo-replication has been rock-solid. However, in the interests of exploring Durable Functions a little further, let's add in a requirement to be notified if we don't get all of the notifications within a specified time period. Let's use 1 minute for this, but we could have picked 10 seconds or 3 days :-)

The code we have so far uses the orchestration context to get Tasks for various external event notifications, and then uses Task.WhenAll to join multiple notifications together to create a single Task that represents all the notifications completing. Let's call this the "completion Task".

One of the other things we can do with the orchestration context is to create a timer, and hopefully it won't come as a surprise that this returns a Task that completes when the timer fires. Now we have the completion Task and the timer Task, and we can use the Task.WhenAny method to combine these two Tasks into a new Task that completes when *either* of the input tasks completes, i.e. either when the timer fires, or when the notifications have all completed. Hey presto, we have now implemented the timeout!

In code, we create the timeout Task by using the orchestration context. We use the CurrentUtcDateTime property to calculate the timer expiry as it will give us a consistent DateTime value even if the orchestration function is being replayed.

Task timeoutTask = context.CreateTimer(context.CurrentUtcDateTime + timeOutInterval, cts.Token);

Then we combine the timeoutTask with the replicationCompletedTask we had before:

var winningTask = await Task.WhenAny(timeoutTask, replicationCompletedTask);

Once we've returned from this await, winningTask will be set to whichever task has completed. We can then test whether this is the timeoutTask or the replicationCompletedTask to determine which action to take.
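Putting those pieces together, the end of the orchestrator might look something like the fragment below. Treat it as a sketch rather than the sample's exact code: it assumes context, notification and replicationCompletedTask are in scope as in the earlier snippets, that cts is a CancellationTokenSource, and that timeOutInterval is a TimeSpan of one minute. "FireTimeoutNotification" is a hypothetical activity name used purely for illustration.

```csharp
// Fragment from inside the orchestrator function; needs
// using System; using System.Threading; using System.Threading.Tasks;
var timeOutInterval = TimeSpan.FromMinutes(1);

using (var cts = new CancellationTokenSource())
{
    // Durable timer: the expiry is based on CurrentUtcDateTime so that it is stable across replays.
    Task timeoutTask = context.CreateTimer(context.CurrentUtcDateTime + timeOutInterval, cts.Token);

    // Completes as soon as either the timer fires or all regions have reported.
    var winningTask = await Task.WhenAny(timeoutTask, replicationCompletedTask);

    if (winningTask == replicationCompletedTask)
    {
        // All regions reported in time: cancel the timer that is still pending,
        // then send the onward notification.
        cts.Cancel();
        await context.CallActivityAsync<string>("FirePushNotification", notification);
    }
    else
    {
        // The timer won: at least one region has not reported within the interval.
        // "FireTimeoutNotification" is a hypothetical activity, not part of the post's snippets.
        await context.CallActivityAsync<string>("FireTimeoutNotification", notification);
    }
}
```

Cancelling the timer in the success branch matters because an orchestration is not considered complete while a durable timer is still outstanding, so without the cancel the instance would linger until the timer expired.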

Trying it for yourself

All of the code for this post is available on GitHub (https://github.com/stuartleeks/acr-global-webhook). There are instructions in the repo README for deploying and configuring.

Find out more

If you want to find out more about Durable Functions there is some great documentation:

For more information on Azure Container Registry see