Handling Errors from Windows Azure Active Directory Graph Service

All client applications should implement some level of error handling logic to react as gracefully as possible to various conditions and provide the best experience possible to their customers. This post covers a variety of error conditions that can be returned by the Azure Active Directory Graph Service and explains how each of them can be handled. Note that this document does not cover error handling of the Windows Azure Access Control Service (ACS) or other related services.

Errors can be categorized into a few types:

- Client errors such as providing invalid values when creating an object.

- Server errors such as a transient directory failure.

- Network/protocol errors such as a host name resolution failure.

HTTP Client Errors

Client errors fall into the range of 4xx HTTP status codes. Examples of such errors are attempting to access a non-existent resource, attempting to create an object without a required property value, attempting to update a read-only property, not including the required authorization token, etc. Such requests should not be retried without first fixing the underlying issue.

HTTP Server Errors

Server errors fall into the range of 5xx HTTP status codes. Some of these errors are transient and can be resolved upon retrying, and some cannot be.

Network/Protocol Errors

A variety of network-related errors can occur while sending a request or receiving a response. Examples are host name resolution errors, the connection being closed prematurely, SSL negotiation errors, etc. Most of these errors will not resolve themselves without the underlying issue being resolved; however, some errors such as host-name resolution failures or timeouts might be resolved upon retry.
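As a rough illustration of the distinction above, the following sketch separates network failures that might clear up on retry from those that usually will not. The class and method names are hypothetical, and the choice of which `WebExceptionStatus` values count as transient is an assumption based on the examples in this section; each client may reasonably choose a different set.

```csharp
using System.Net;

public static class NetworkErrorClassifier
{
    // Hypothetical helper: returns true for network-level failures that may succeed on retry.
    // Only name-resolution failures and timeouts are treated as transient here; conditions
    // such as SSL negotiation failures (SecureChannelFailure, TrustFailure) or a prematurely
    // closed connection (ConnectionClosed) are assumed to need a fix before retrying.
    public static bool IsPossiblyTransient(WebExceptionStatus status)
    {
        switch (status)
        {
            case WebExceptionStatus.NameResolutionFailure:
            case WebExceptionStatus.Timeout:
                return true;
            default:
                return false;
        }
    }
}
```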

Service Error Codes

Whenever an HTTP error response is returned by the service, it will typically be in a form like the following:

 

 HTTP/1.1 400 Bad Request
 Content-Type: application/json;odata=minimalmetadata;charset=utf-8
 request-id: ddca4a7e-02b1-4899-ace1-19860901f2fc
 Date: Tue, 02 Jul 2013 01:48:19 GMT
 …
 
 {
   "odata.error" : {
     "code" : "Request_BadRequest",
     "message" : {
       "lang" : "en",
       "value" : "A value is required for property 'mailNickname' of resource 'Group'."
     },
     "values" : null
   }
 }
 

 

There are a few notable items in the above response body:

- code: This can be treated much like an exception type, with the client application reacting differently depending on its value.

- message: This is a language/message tuple that represents an error message that can be read by a user.

- values: This is a collection of name/value pairs that can be used to provide more context about the nature of the failure – this will be discussed in a subsequent update to this article.

 NOTE: Proxy/gateway services can be involved as a request is routed from your client to the directory service, meaning that some HTTP responses might not contain the response body indicated above. In such cases, ensure your code responds as best as possible based on the HTTP status code alone.
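One way to read the error body while tolerating responses that lack it is sketched below. This is only an illustration: the `GraphError` and `GraphErrorParser` names are hypothetical, and the use of `System.Text.Json` is an assumption (any JSON library would do). The parser returns null when the body is empty, is not JSON, or does not contain an `odata.error` object, so the caller can fall back to the HTTP status code.

```csharp
using System;
using System.Text.Json;

// Hypothetical container for the two fields most clients need.
public sealed class GraphError
{
    public string Code { get; set; }
    public string Message { get; set; }
}

public static class GraphErrorParser
{
    // Returns the parsed error, or null if the body does not have the expected shape
    // (for example, an HTML error page produced by an intermediate proxy).
    public static GraphError TryParse(string responseBody)
    {
        if (string.IsNullOrWhiteSpace(responseBody))
        {
            return null;
        }

        try
        {
            using (JsonDocument document = JsonDocument.Parse(responseBody))
            {
                JsonElement root = document.RootElement;
                if (root.ValueKind != JsonValueKind.Object
                    || !root.TryGetProperty("odata.error", out JsonElement error)
                    || error.ValueKind != JsonValueKind.Object)
                {
                    return null;
                }

                var result = new GraphError();
                if (error.TryGetProperty("code", out JsonElement code)
                    && code.ValueKind == JsonValueKind.String)
                {
                    result.Code = code.GetString();
                }

                if (error.TryGetProperty("message", out JsonElement message)
                    && message.ValueKind == JsonValueKind.Object
                    && message.TryGetProperty("value", out JsonElement value)
                    && value.ValueKind == JsonValueKind.String)
                {
                    result.Message = value.GetString();
                }

                return result;
            }
        }
        catch (JsonException)
        {
            // Not JSON at all; the caller should rely on the HTTP status code alone.
            return null;
        }
    }
}
```

A caller would typically try `GraphErrorParser.TryParse(body)` first and, if it returns null, decide what to do from the status code.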

The following table contains the set of error codes, along with the associated meaning of each error condition.

| HTTP Status Code | Error Code | Details | Retryable |
|---|---|---|---|
| 400 | Request_BadRequest | A generic request failure has occurred. This can indicate that an invalid property value was included during resource creation, or that an unsupported query argument was specified, etc. Correct the request inputs and try again. | No |
| 400 | Request_UnsupportedQuery | A generic error indicating that an unsupported GET request has been performed. Correct the GET request inputs and try again. | No |
| 400 | Directory_ResultSizeLimitExceeded | The request cannot be fulfilled because there are too many results associated with it. This is a rare condition that will not affect the majority of clients. The request will not succeed upon retry, as the failure pertains to the number of values that would be present in the response. | No |
| 401 | Authentication_MissingOrMalformed | The access token, specified as the Authorization header value, is either missing or malformed. It must be included to execute the request. The WWW-Authenticate response header will provide more details that can be used to obtain a valid token. Retry the request after correcting the access token. | No |
| 401 | Authorization_IdentityNotFound | The principal contained in the access token could not be found in the directory. The principal may have been deleted from the directory after the access token was obtained. | No |
| 401 | Authorization_IdentityDisabled | The principal contained in the access token is present in the directory, but is disabled. The principal's account in the directory must first be enabled for authorization to succeed. | No |
| 401 | Authentication_ExpiredToken | The token's valid lifetime has been exceeded. Obtain the token again and retry the request. | No |
| 403 | Authorization_RequestDenied | Indicates that the request has been denied due to insufficient privileges. For example, a non-administrative principal might not have permissions to delete a given resource. | No |
| 403 | Authentication_Unauthorized | A general request authorization failure indicating invalid or unsupported claims in the token. Obtain the token again and retry the request. | No |
| 403 | Directory_QuotaExceeded | A directory quota has been exceeded. This can be caused by an excessive number of objects being present in the tenant, or having been created on behalf of a specific principal. It can also occur if the number of values on a particular object has been exceeded. Increase the maximum allowed quota count for the tenant or principal, or reduce the number of values included in the create/update request. | No |
| 404 | Request_ResourceNotFound | The resource identified by the request URI, or by a URI in the request body, does not exist. | No |
| 500 | Service_InternalServerError | A general error indicating that an unexpected internal service problem has occurred. Depending on the nature of the problem and the request, a retry may succeed. | Yes |
| 502 | <ALL> | A server acting as a gateway/proxy encountered an error from another server during request processing. Retry the request after waiting a short time. | Yes |
| 503 | Request_ThrottledTemporarily | The request rate has exceeded the allowable limit. Wait and retry the request later. Note that you will typically have to wait longer for such throttling conditions than for other 503 service-unavailable errors. | Yes |
| 503 | <ALL> | A general service-unavailable error. This is typically transient and should be resolvable after waiting a short time and retrying. | Yes |
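The table can be condensed into a small decision function, sketched below. The `ErrorAction` enum and `GraphErrorPolicy` class are hypothetical names introduced for illustration, and grouping the token-related codes under a single "obtain a new token" action is an assumption drawn from the Details column. The error code may be null when the response carries no body, in which case only the status code is used.

```csharp
using System.Net;

// Hypothetical set of reactions a client might take.
public enum ErrorAction
{
    FixRequest,            // correct the inputs before sending again
    RefreshToken,          // obtain a valid access token, then resend
    RetryWithBackoff,      // transient server-side condition
    RetryAfterLongerWait,  // throttled; wait longer than for other 503s
    DoNotRetry             // needs intervention outside the request itself
}

public static class GraphErrorPolicy
{
    public static ErrorAction Decide(HttpStatusCode statusCode, string errorCode)
    {
        // Error codes that call for a specific reaction take precedence.
        switch (errorCode)
        {
            case "Authentication_MissingOrMalformed":
            case "Authentication_ExpiredToken":
            case "Authentication_Unauthorized":
                return ErrorAction.RefreshToken;
            case "Request_ThrottledTemporarily":
                return ErrorAction.RetryAfterLongerWait;
        }

        // Otherwise fall back to the status code, which is all that is
        // available when a proxy or gateway produced the response.
        switch (statusCode)
        {
            case HttpStatusCode.InternalServerError:
            case HttpStatusCode.BadGateway:
            case HttpStatusCode.ServiceUnavailable:
                return ErrorAction.RetryWithBackoff;
            case HttpStatusCode.BadRequest:
            case HttpStatusCode.NotFound:
                return ErrorAction.FixRequest;
            default:
                return ErrorAction.DoNotRetry;
        }
    }
}
```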

 

Retry Logic

The following C# code snippet represents a possible request invocation and retry-handling approach. It uses a simple try/catch block that inspects the request exception to determine whether to retry or not. This will not handle all exception scenarios, but will handle most that are thrown by the .NET WebClient and WCF Data Services clients.

 

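using System;
using System.Data.Services.Client;
using System.Linq;
using System.Net;
using System.Threading;

///<summary>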
/// Invokes the specified web-method delegate and returns the result, including retrying using a simple 
/// exponential-backoff approach as appropriate.
///</summary>
///<typeparam name="T">The type of object that the delegate returns.</typeparam>
///<param name="retryableOperation">The web-method delegate.</param>
///<returns>The result of the method.</returns>
private static T InvokeWebOperationWithRetry<T>(Func<T> retryableOperation)
{
    // Baseline delay of 1 second
    int baselineDelayMillis = 1000;

    const int MaxAttempts = 4;
    Random random = new Random();
    int attempt = 0;

    while (++attempt <= MaxAttempts)
    {
        try
        {
            return retryableOperation();
        }
        catch (InvalidOperationException invalidOperationException)
        {
            if (attempt == MaxAttempts || !IsRetryableError(invalidOperationException))
            {
                throw;
            }

            int delayMillis =
                baselineDelayMillis + random.Next((int)(baselineDelayMillis * 0.5), baselineDelayMillis);
            Thread.Sleep(delayMillis);

            // Increment base-delay time
            baselineDelayMillis *= 2;
        }
    }

    // The logic above dictates that this exception will never be thrown.
    throw new InvalidOperationException("This exception statement should never be thrown.");
}

///<summary>
/// Indicates whether the request that resulted in the specified exception is retryable or not.
///</summary>
///<param name="exception">The exception.</param>
///<returns>
///<see langword="true"/> if the exception is retryable, and <see langword="false"/> otherwise.
///</returns>
private static bool IsRetryableError(Exception exception)
{
    // TODO: Log the request-id response header value here, to improve debuggability in case
    // a class of server errors occurs that is difficult to resolve.
    Nullable<HttpStatusCode> httpStatusCode = null;

    DataServiceRequestException requestException = exception as DataServiceRequestException;
    if (requestException != null)
    {
        OperationResponse opResponse = requestException.Response.FirstOrDefault();
        httpStatusCode = opResponse != null 
            ? (HttpStatusCode)opResponse.StatusCode
            : (HttpStatusCode)requestException.Response.BatchStatusCode;
    }

    DataServiceClientException clientException = exception as DataServiceClientException;
    if (!httpStatusCode.HasValue && clientException != null)
    {
        httpStatusCode = (HttpStatusCode)clientException.StatusCode;
    }

    DataServiceQueryException queryException = exception as DataServiceQueryException;
    if (!httpStatusCode.HasValue && queryException != null)
    {
        httpStatusCode = (HttpStatusCode)queryException.Response.StatusCode;
    }


    if (!httpStatusCode.HasValue)
    {
        WebException webException = exception as WebException;
        if (webException != null ||
            (webException = exception.InnerException as WebException) != null)
        {
            HttpWebResponse httpWebResponse = webException.Response as HttpWebResponse;
            if (httpWebResponse == null)
            {
                // Example of a possible network-related retry condition; the set of network conditions
                // to retry on and the sleep duration associated with each can differ depending on the client.
                return webException.Status == WebExceptionStatus.NameResolutionFailure;
            }

            httpStatusCode = httpWebResponse.StatusCode;
        }
    }

    return httpStatusCode.HasValue
        && (httpStatusCode == HttpStatusCode.InternalServerError
            || httpStatusCode == HttpStatusCode.BadGateway
            || httpStatusCode == HttpStatusCode.ServiceUnavailable);
}
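To make the timing of the retry loop above concrete, the following sketch computes the range each sleep can fall into, using the same arithmetic as `InvokeWebOperationWithRetry`. The `BackoffSchedule` class is a hypothetical helper written only for this illustration. Because `Random.Next(min, max)` excludes its upper bound, each delay lies between 1.5 times (inclusive) and 2 times (exclusive) the current baseline, and the baseline doubles after every failed attempt.

```csharp
using System;
using System.Collections.Generic;

public static class BackoffSchedule
{
    // Returns, for each sleep between attempts, the inclusive minimum and exclusive
    // maximum delay in milliseconds. With maxAttempts attempts there are at most
    // maxAttempts - 1 sleeps, since the final failure is rethrown immediately.
    public static List<(int Min, int Max)> DelayBounds(int baselineDelayMillis, int maxAttempts)
    {
        var bounds = new List<(int Min, int Max)>();
        for (int attempt = 1; attempt < maxAttempts; attempt++)
        {
            int min = baselineDelayMillis + (int)(baselineDelayMillis * 0.5);
            int max = baselineDelayMillis + baselineDelayMillis;
            bounds.Add((min, max));

            // Same doubling as the retry loop.
            baselineDelayMillis *= 2;
        }

        return bounds;
    }
}
```

With the values used above (a 1000 ms baseline and four attempts), the three possible sleeps fall in the ranges 1500 to 2000 ms, 3000 to 4000 ms, and 6000 to 8000 ms, so a request that fails every time is abandoned after less than 14 seconds of waiting.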

 

Questions and feedback are welcome as always.