Description of the Microsoft Office Existence Discovery Protocol

Summary

When a document is opened from a URL in Microsoft Office 2007, the Office library can send an HTTP HEAD request to the web server for that URL.  The request is sent with the User-Agent header set to "Microsoft Office Existence Discovery".  This call is new in Office 2007.

More Information

The purpose of the HEAD request is to check that the content exists at the URL as a document, and not simply as a temporary resource streamed down for a read-only session.  The call also attempts to obtain the last modified time of the content, as returned by the web server in the HEAD response.  This information is cached and used by the client in situations where an editing lock may be lost and need to be reacquired.  For example, if the computer is unplugged from the network and reattached, or goes into hibernation or a low power mode and then resumes, the Office application may attempt to reconnect to the resource and relock it for editing, provided the modified timestamp returned by the server has not changed between the time the file was last opened and the time of the reacquire attempt.  If the session can reconnect, the user can continue editing and save the file as normal.  If not, the user is alerted that the file has changed and can save their changes to a new file instead (or discard their changes and reload the file with the new content).
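The reacquire decision described above can be illustrated with a short Python sketch.  This is not Office's actual implementation; the function name can_reacquire_lock and the choice to treat a missing or unparseable timestamp as "cannot reconnect" are assumptions made for illustration only.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def can_reacquire_lock(cached_last_modified: Optional[str],
                       current_last_modified: Optional[str]) -> bool:
    """Return True if an automatic relock looks safe (illustrative only).

    cached_last_modified:  Last-Modified value stored when the file was opened.
    current_last_modified: Last-Modified value seen at the reacquire attempt.
    """
    # Assumption: with no timestamp on either side there is nothing to
    # compare, so the sketch refuses to reconnect automatically.
    if not cached_last_modified or not current_last_modified:
        return False
    try:
        cached = parsedate_to_datetime(cached_last_modified)
        current = parsedate_to_datetime(current_last_modified)
    except (TypeError, ValueError):
        # An unparseable HTTP date is treated like a missing one.
        return False
    # The lock may be reacquired only if the server copy is unchanged.
    return cached == current
```

For example, calling the function with the same HTTP date for both arguments returns True, while a later server timestamp returns False, which corresponds to the case where the user is told that the file has changed.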

The request is sent in the following format:

HEAD /SomeFolder/SomeFile.doc HTTP/1.1
User-Agent: Microsoft Office Existence Discovery
Host: SomeServer.com
Content-Length: 0
Proxy-Connection: Keep-Alive
Pragma: no-cache

When the file exists as a document on the server, the response normally includes the last modified time, for example:

HTTP/1.1 200 OK
Content-Length: 70656
Content-Type: application/msword
Last-Modified: Thu, 28 Jun 2007 17:38:29 GMT
Server: Microsoft-IIS/6.0
Date: Tue, 11 Mar 2008 21:02:27 GMT

If the web server does not provide the Last-Modified header in the HEAD response, the information is left empty in the document.  As a result, if a network disconnect occurs, the file cannot be automatically reconnected for editing.
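To check how a particular web server answers this kind of request, a similar HEAD request can be sent from a test script.  The Python sketch below is only an approximation of the Office call: it reuses the User-Agent, Content-Length, and Pragma headers shown above, but the helper names (request_target, probe_headers, probe_existence) are hypothetical, and the Proxy-Connection header is omitted.  The returned status code also makes it easy to see whether the server answers the request with 200, 401, or 302.

```python
import http.client
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit

USER_AGENT = "Microsoft Office Existence Discovery"


def request_target(url: str) -> str:
    """Return the path (plus query string, if any) to place on the HEAD line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


def probe_headers() -> Dict[str, str]:
    """Headers modeled on the example request; Host is added by http.client."""
    return {
        "User-Agent": USER_AGENT,
        "Content-Length": "0",
        "Pragma": "no-cache",
    }


def probe_existence(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send the HEAD request and return (status code, Last-Modified or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        conn.request("HEAD", request_target(url), headers=probe_headers())
        response = conn.getresponse()
        response.read()  # a HEAD response has no body; this completes the exchange
        # Last-Modified is None when the server omits the header, which is
        # the case in which automatic reconnection is not possible.
        return response.status, response.getheader("Last-Modified")
    finally:
        conn.close()
```

Calling probe_existence("http://SomeServer.com/SomeFolder/SomeFile.doc") against a real server returns the status code together with the Last-Modified value, or None if the server did not send one.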

This call occurs on all URL open attempts, even if editing is not requested.  Because the extra web call is made from the process space of the Office application, in its own network session rather than in the web browser's separate session, some users may see extra prompts to authenticate (401), or a loss of session state and an unnecessary redirection (302) to a login page or other feedback form.  This is expected behavior.  If web sites wish to avoid such issues because they intend the documents to open read-only without editing rights, they can address the issue using one of the techniques discussed in the following KB article:

KB899927: You are redirected to a logon page or an error page, or you are prompted for authentication information when you click a hyperlink to a SSO Web site in an Office document

The typical workaround is to use the Content-Disposition: Attachment header in the GET response when returning the file.  This header tells the web browser to treat the file as a download (read-only), so the file opens in Office from the web browser cache location instead of from a URL.  With that setting, the Office application treats the file as local, and therefore does not make calls back to the web server.
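As a rough illustration of this workaround, the Python sketch below serves a file with the Content-Disposition header set on the GET response.  It uses the standard http.server module purely as a stand-in for a real web server; the file SomeFile.doc, the port, and the names attachment_headers, AttachmentHandler, and serve are assumptions, and the simple quoting shown only suits plain ASCII file names.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from typing import Dict

# Hypothetical document to serve; replace with a real path when trying this.
DOCUMENT = Path("SomeFile.doc")


def attachment_headers(filename: str, size: int,
                       content_type: str = "application/msword") -> Dict[str, str]:
    """Build response headers that ask the browser to download the file."""
    return {
        "Content-Type": content_type,
        "Content-Length": str(size),
        # Marks the response as a download rather than an in-place open.
        "Content-Disposition": f'attachment; filename="{filename}"',
    }


class AttachmentHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if not DOCUMENT.is_file():
            self.send_error(404, "Document not found")
            return
        data = DOCUMENT.read_bytes()
        self.send_response(200)
        for name, value in attachment_headers(DOCUMENT.name, len(data)).items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Run the sketch server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), AttachmentHandler).serve_forever()
```

Calling serve() and then requesting any path from http://127.0.0.1:8000/ in a browser should produce a download prompt for the file.  On a production server such as IIS, the same header would normally be set through the server or application configuration rather than a custom handler.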