Closures and asynchronous chaining in Office JavaScript

In this post, we’ll discuss JavaScript closures and how misunderstanding them can lead to seemingly unexpected behavior in some JavaScript API for Office methods. Along the way, we’ll establish some best practices for how to make multiple asynchronous requests that constitute a single logical operation – a common pattern in apps. After reading, app developers will have a solid understanding of the following:

  • What closures are.
  • How to use closures while avoiding difficult-to-debug pitfalls especially common among new JavaScript developers.
  • Which kinds of JavaScript API for Office methods are most likely to cause closure confusion.
  • Why chaining multiple asynchronous calls in serial is a best practice.

JavaScript closures

JavaScript is, among other things, a functional language. Unlike more purely imperative languages such as C, C++, and Java, it treats functions as so-called "first-class citizens," just like numbers and strings. It is possible to assign a function to a variable, pass one as an argument to another function, and return a function with the return keyword. One can even declare a function inside another function, with all the scoping rules you’d expect!
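Outside of Office entirely, a few lines of plain JavaScript illustrate each of these capabilities (the names square, applyTwice, and makeAdder are just for this sketch):

```javascript
// A function can be assigned to a variable...
var square = function (x) { return x * x; };

// ...passed as an argument to another function...
function applyTwice(fn, value) { return fn(fn(value)); }

// ...and returned from a function with the return keyword.
function makeAdder(n) {
    return function (x) { return x + n; };
}

var addFive = makeAdder(5);
console.log(applyTwice(square, 3)); // 81
console.log(addFive(10));           // 15
```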

This (insane) flexibility has implications which sometimes make a difference when using JavaScript API for Office. Suppose you’re using the File API in Word to get a paged representation of the contents of a document. You want to process each slice of the file in turn and write a message about it somewhere. You might write a function like this:

 function processFile(file) {
    // Iterate over each slice in the file and access each slice.
    for (var i = 0; i < file.sliceCount; i++) {

        file.getSliceAsync(i, function onGetSlice(result) {

            // If the call returns successfully, 
            // we have access to the Slice object.
            if (result.status == "succeeded") {

                // Process the slice and write a status message.
                var output = processSlice(result.value);
                write('Slice ' + i + ' of ' +
                    file.sliceCount + ': ' + output);
            }
        });
    }
}

The function onGetSlice is a callback to the getSliceAsync method, so it will be invoked when the asynchronous call completes. Asynchronous methods in JavaScript API for Office work by placing the requested work into a queue to be processed on a separate thread, allowing synchronous JavaScript code to continue executing in the meantime. This means the for loop will likely have terminated long before even the first call to getSliceAsync completes. How, then, can the anonymous callback hope to pass the value of the loop variable i to the write function?

The answer is the concept known as a closure. When the JavaScript engine parses onGetSlice as it calls getSliceAsync, it notices that the variable i is required for onGetSlice to execute properly. It adds a reference to i to the closure of onGetSlice, so i doesn’t get garbage collected when the loop terminates.

Note that it is a reference to i that gets included in the closure, not a copy of its value at the time the closure is calculated. So, the loop terminates when i becomes equal to file.sliceCount, and i still has that value when onGetSlice is invoked. Suppose the file has five slices: then the output will say "Slice 5 of 5: …" for every slice!
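This pitfall is easy to reproduce without the Office API at all. In this stripped-down sketch, the asynchronous call is replaced by simply storing the callbacks in an array and invoking them after the loop has finished:

```javascript
// Build three callbacks in a loop. Each closure captures a
// reference to the same var-scoped variable i, not a snapshot
// of its value at the time the closure was created.
var callbacks = [];
for (var i = 0; i < 3; i++) {
    callbacks.push(function () { return i; });
}

// By the time any callback runs, the loop has finished and i is 3.
console.log(callbacks[0]()); // 3, not 0
console.log(callbacks[1]()); // 3, not 1
console.log(callbacks[2]()); // 3, not 2
```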

Luckily, the solution to this problem is fairly simple. We can store the value contained by i while the loop executes with only a minor change to processFile:

 function processFile(file) {
    // Iterate over each slice in the file and access each slice.
    for (var i = 0; i < file.sliceCount; i++) {

        file.getSliceAsync(i, (function (index) {
            return function onGetSlice(result) {

                // If the call returns successfully, 
                // we have access to the Slice object.
                if (result.status == "succeeded") {

                    // Process the slice and write a status message.
                    var output = processSlice(result.value);
                    write('Slice ' + index + ' of ' +
                        file.sliceCount + ': ' + output);
                }
            };
        })(i));
    }
}

Remember that functions are "first-class citizens" in JavaScript. This lets us create a new, anonymous function that takes a single parameter (the index of the slice) and returns the function onGetSlice. We declare and invoke the anonymous function inline, passing it i as an argument. This returns an instance of onGetSlice whose closure includes index, not i. index is a new variable, distinct from i, that stores the value i had when the anonymous function was invoked inside the loop. It is that returned function, onGetSlice, which gets passed to getSliceAsync as the callback. With this change, we have dynamically built a function, and the output will be as intended.
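The same wrapper pattern works anywhere a loop variable needs to be captured by value. Reusing the stripped-down sketch from before:

```javascript
var callbacks = [];
for (var i = 0; i < 3; i++) {
    // The immediately invoked function creates a new variable,
    // index, whose value is fixed at the moment of invocation.
    callbacks.push((function (index) {
        return function () { return index; };
    })(i));
}

console.log(callbacks[0]()); // 0
console.log(callbacks[1]()); // 1
console.log(callbacks[2]()); // 2
```

As an aside, in ECMAScript 2015 and later, declaring the loop variable with let gives each iteration of the loop its own binding, which avoids the need for the wrapper function entirely.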

Asynchronous chaining

Probably, that is. One other interesting fact is in play: there is no guarantee that asynchronous JavaScript API for Office functions will be executed in the order they were called. The JavaScript API for Office engine within the host applications is threaded, so the scheduler might “optimize” the processing order. In many cases, the order in which requests are served doesn’t matter. But there is also a limit on concurrent requests, and requests beyond that limit will throw exceptions. For both these reasons, it is best to chain related asynchronous requests in serial, waiting for request N to finish before starting request N+1, rather than kicking them all off in parallel as in the for loop above.

In this example, we don’t know how many slices we’ll need to process until runtime, so any chaining solution should support an arbitrary slice count. A recursive solution is the best option for accomplishing this:

 function processFileAsync(file, onComplete) {

    // A recursive helper function obtains and processes the indexth slice.
    function getAndProcessSlice(index) {

        // Base case: we've reached the end of the slices, so stop processing.
        if (index == file.sliceCount) {
            onComplete();
        }
        // Recursive case: get the indexth slice.
        else {
            file.getSliceAsync(index, function (result) {

                // If the call returns successfully, 
                // we have access to the Slice object.
                if (result.status == "succeeded") {

                    // Process the slice and write a status message.
                    var output = processSlice(result.value);
                    write('Slice ' + index + ' of ' + 
                        file.sliceCount + ': ' + output);
                }

                // Move on to the next slice.
                getAndProcessSlice(index + 1);
            });
        }
    }

    // Begin the recursive process with the first slice.
    getAndProcessSlice(0);
}

The function has been renamed and now has an additional parameter: an onComplete callback. The recursive function getAndProcessSlice processes a single slice (if there is a slice left to be processed), and then calls itself to work on the next slice. This way, we are guaranteed to process the slices in order. When all slices have been processed, onComplete is called. processFileAsync defines getAndProcessSlice as an inner function and then invokes it, starting at the beginning of the slice collection. (Note that getAndProcessSlice could have been defined and invoked in a single line, just as with the anonymous function in the iterative solution to the closure problem.)

Don’t be misled: the iterative solution from before was asynchronous too. But note how much easier it is to see that (and to write an elegant asynchronous solution) with the recursive approach. It is also easier to add a retry clause in case of failure this way. Moreover, in addition to saving the memory and time spent creating a new function and a new index variable on each iteration of the earlier solution, we’ve avoided the closure issue entirely! Chain requests whenever there is a series of related processing to do (for example, when making the same asynchronous call on each member of a collection of Bindings, File Slices, or CustomXML Parts) or when firing them in parallel risks exceeding the concurrent-request limit.
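As a rough illustration of that retry point, here is one way a retry clause might fit into the recursive pattern. Everything here is invented for the sketch: fakeGetSliceAsync stands in for the real Office call (it invokes its callback synchronously and fails each slice once so the retry path actually runs), and processAllSlices and maxRetries are hypothetical names:

```javascript
// Stub standing in for an Office-style async call; fails the
// first attempt on each slice, succeeds on later attempts.
var attempts = {};
function fakeGetSliceAsync(index, callback) {
    attempts[index] = (attempts[index] || 0) + 1;
    var ok = attempts[index] > 1;
    callback({ status: ok ? "succeeded" : "failed",
               value: "slice " + index });
}

function processAllSlices(sliceCount, maxRetries, onComplete) {
    var results = [];

    function getSlice(index, retriesLeft) {
        // Base case: every slice has been processed.
        if (index === sliceCount) {
            onComplete(results);
            return;
        }
        fakeGetSliceAsync(index, function (result) {
            if (result.status === "succeeded") {
                results.push(result.value);
                // Next slice, with a fresh retry budget.
                getSlice(index + 1, maxRetries);
            } else if (retriesLeft > 0) {
                // Retry the same slice.
                getSlice(index, retriesLeft - 1);
            } else {
                // Out of retries: give up with what we have.
                onComplete(results);
            }
        });
    }

    getSlice(0, maxRetries);
}

processAllSlices(3, 2, function (results) {
    console.log(results); // all three slices, in order
});
```

Because each slice is requested only after the previous one finishes, the retry logic slots in at a single point; doing the same in the parallel for loop would require tracking per-request state across all of the in-flight callbacks.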