Batching Data Service Requests


We have received a fair amount of feedback regarding a number of use cases where it would be beneficial to enable a client of a data service to “batch” up a group of operations and send them to the data service in a single HTTP request.  This reduces the number of roundtrips to the data service for apps that need to make numerous requests to the data service to perform a given action and allows a set of operations to be logically grouped together. 

Below is the design we have landed on.  Note: We will have something close to this design in our next CTP/Beta release of Astoria, but it’s not quite there yet.

Background

The base ADO.NET Data Services Framework semantics provide two mechanisms to query and send updates to a data service.  At a high level they are:

Query:

1) Send an HTTP GET request to a URI representing a resource (or set of resources) and receive in the response a representation of the resource (or set of resources). Example: a GET request to /Customers(1) returns a single customer entity in the response

2) Same as #1, but add the $expand query string operator to the request so that resources related to the resource(s) specified in the request URI are returned in the response as well.  Example: a GET request to /Products(1)?$expand=Category,Parts returns product #1 as well as the Parts and Category associated with the product in the response
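
As a small illustration of the two query forms above, here is a sketch in Python that builds the request URIs. The helper name resource_uri is made up for this example, and the service root is the one used in the batch example later in this post.

```python
def resource_uri(service_root, resource, expand=()):
    """Build a query URI, optionally asking for related resources via $expand."""
    uri = f"{service_root}/{resource}"
    if expand:
        # Related resources are listed comma-separated after $expand.
        uri += "?$expand=" + ",".join(expand)
    return uri


# Query form 1: a single resource.
print(resource_uri("http://foo.com/dataservice.svc", "Customers(1)"))
# Query form 2: the resource plus its related Category and Parts.
print(resource_uri("http://foo.com/dataservice.svc", "Products(1)", ["Category", "Parts"]))
```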

Update:

1) Insert / update / delete a single resource per HTTP request by sending POST, PUT or DELETE requests to a data service

2) Insert a new resource and related resources in a single request. Example: a POST to /Customers can insert a Customer and related  Orders in a single request by inlining the related orders in the request body

 

Why do we need batching?

Now assume you have the following situation: a single “Save” button per page in an RIA line-of-business application. Contoso Solutions is building an online Silverlight-based order entry system for its sales force. Any given sale requires that a number of entities within the data service be inserted and/or updated. Some of the entities are associated via navigation properties, while others have no relation to the other entities being acted on as part of processing the sale. The user experience Contoso wants to enable is to paint the full order processing information on a single screen and include a “save changes” button at the bottom to persist all the updates made to create the order.

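In this case, the $expand operation cannot be used to pull down all the data to paint the order entry screen in a single HTTP request, because some of the entities are not related to each other via navigation properties.  Also, since each update is sent as its own HTTP request, the update operations cannot easily be persisted as an atomic set of operations to the underlying data store.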

 

Batching Design

To support batching, ADO.NET Data Services has added a new $batch URI which accepts batch requests and returns batch responses.  Logically, a batch request is a group of zero or more QueryOperations and zero or more ChangeSets.  A QueryOperation is analogous to a simple “non-batch” query request.  A ChangeSet is a group of unordered CUD (create, update and delete) operations that is applied atomically (i.e. all operations succeed or none do).
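
To make the logical model concrete, here is a minimal sketch of it as Python data classes. This is only an illustration of the shape described above, not the Astoria implementation; the class and field names (Operation, QueryOperation, ChangeSet, Batch) and the request URIs are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Operation:
    """One HTTP operation: method, URI, headers and an optional body."""
    method: str
    uri: str
    headers: dict = field(default_factory=dict)
    body: str = ""


@dataclass
class QueryOperation:
    """A single GET, equivalent to a non-batch query request."""
    request: Operation

    def __post_init__(self):
        if self.request.method != "GET":
            raise ValueError("a query operation must be a GET")


@dataclass
class ChangeSet:
    """A group of insert/update/delete operations applied atomically."""
    operations: List[Operation] = field(default_factory=list)

    def __post_init__(self):
        for op in self.operations:
            if op.method not in ("POST", "PUT", "DELETE"):
                raise ValueError("a change set may only contain POST, PUT or DELETE")


@dataclass
class Batch:
    """Zero or more query operations and change sets, kept in request order."""
    parts: List[Union[QueryOperation, ChangeSet]] = field(default_factory=list)


# Hypothetical batch: one change set followed by two queries.
batch = Batch(parts=[
    ChangeSet([
        Operation("POST", "/dataservice.svc/Categories"),
        Operation("PUT", "/dataservice.svc/Categories(5)"),
    ]),
    QueryOperation(Operation("GET", "/dataservice.svc/Categories(5)")),
    QueryOperation(Operation("GET", "/dataservice.svc/Categories(6)")),
])
```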

Now that we have the logical model (a batch is an ordered collection of QueryOperations and ChangeSets), we needed a wire representation.  After exploring various ways to represent batches in ATOM, JSON, etc., Yaron Goland pointed out to us that there is already a well-defined way to represent multiple HTTP requests in a single request: multipart/mixed MIME messages whose parts have the MIME type application/http.  This turned out to be just what we needed and enables us to easily encapsulate binary and text-based content in a request or response.  Also, it looks like using multipart MIME for batching has been explored (with pretty positive feedback) a number of times in the blogosphere, so perhaps we’ll all land on something generally applicable.  Instead of describing this, let’s just look at an example.

Example – Batch Request:

The example assumes the batch request is sent to a data service located at: http://foo.com/dataservice.svc

The Batch example contains the following operations (in order):

  • A Change Set which contains the following operations in order:
    • POST operation
    • PUT operation
  • A Query Operation
  • A Query Operation

POST /dataservice.svc/$batch HTTP/1.1
Host: foo.com
Content-Type: multipart/mixed; boundary=batch(36522ad7-fc75-4b56-8c71-56071383e77b)

--batch(36522ad7-fc75-4b56-8c71-56071383e77b)
Content-Type: multipart/mixed; boundary=changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)
Content-Length: ###

--changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)
Content-Type: application/http
Content-Transfer-Encoding: binary

POST /dataservice.svc/Categories HTTP/1.1
Host: foo.com
Content-Type: application/atom+xml;type=entry
Content-Length: ###

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry xmlns:d="http://schemas.microsoft.com/ado/…"
       xmlns:m="http://schemas.microsoft.com/ado/…/metadata"
       xmlns="http://www.w3.org/2005/Atom">
  …
  <content type="application/xml">
    <d:CategoryName>Software</d:CategoryName>
    <d:Description d:null="true" />
    <d:Picture d:null="true" />
  </content>
</entry>
--changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)
Content-Type: application/http
Content-Transfer-Encoding: binary

PUT /dataservice.svc/Categories(5) HTTP/1.1
Host: foo.com
Content-Type: application/atom+xml;type=entry
If-Match: xxxxx
Content-Length: ###

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry xmlns:d="http://schemas.microsoft.com/ado/…"
       xmlns:m="http://schemas.microsoft.com/ado/…/metadata"
       xmlns="http://www.w3.org/2005/Atom">
  …
  <content type="application/xml">
    <d:CategoryID>5</d:CategoryID>
    <d:CategoryName>UpdateCategoryName</d:CategoryName>
  </content>
</entry>
--changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)--
--batch(36522ad7-fc75-4b56-8c71-56071383e77b)
Content-Type: application/http
Content-Transfer-Encoding: binary

GET /dataservice.svc/Categories(5) HTTP/1.1
Host: foo.com

--batch(36522ad7-fc75-4b56-8c71-56071383e77b)
Content-Type: application/http
Content-Transfer-Encoding: binary

GET /dataservice.svc/Categories(6) HTTP/1.1
Host: foo.com

--batch(36522ad7-fc75-4b56-8c71-56071383e77b)--
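
For readers who want to experiment, a request body shaped like the one above can be assembled with a few string helpers. The following is a rough Python sketch under stated assumptions, not client library code: the helper names (http_part, multipart, changeset_part, build_batch) are invented for this example, the request URIs assume the service root /dataservice.svc, the entry payloads are placeholders, and the Content-Length headers from the example are left out for brevity.

```python
import uuid

CRLF = "\r\n"


def http_part(request_line, headers=None, body=""):
    """Wrap one HTTP request as an application/http MIME part."""
    lines = [
        "Content-Type: application/http",
        "Content-Transfer-Encoding: binary",
        "",                      # blank line ends the MIME part headers
        request_line,
    ]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    lines.append("")             # blank line ends the inner request headers
    lines.append(body)
    return CRLF.join(lines)


def multipart(boundary, parts):
    """Join parts with '--boundary' delimiters and a closing '--boundary--'."""
    out = []
    for part in parts:
        out.append(f"--{boundary}")
        out.append(part)
    out.append(f"--{boundary}--")
    return CRLF.join(out) + CRLF


def changeset_part(operations):
    """Nest a group of write operations as one multipart/mixed part."""
    boundary = f"changeset({uuid.uuid4()})"
    return CRLF.join([
        f"Content-Type: multipart/mixed; boundary={boundary}",
        "",
        multipart(boundary, operations),
    ])


def build_batch(parts):
    """Return (outer request headers, request body) for a POST to $batch."""
    boundary = f"batch({uuid.uuid4()})"
    headers = {"Content-Type": f"multipart/mixed; boundary={boundary}"}
    return headers, multipart(boundary, parts)


atom = {"Host": "foo.com", "Content-Type": "application/atom+xml;type=entry"}
headers, body = build_batch([
    changeset_part([
        # "<entry>...</entry>" stands in for a real Atom entry payload.
        http_part("POST /dataservice.svc/Categories HTTP/1.1", atom, "<entry>...</entry>"),
        http_part("PUT /dataservice.svc/Categories(5) HTTP/1.1",
                  dict(atom, **{"If-Match": "xxxxx"}), "<entry>...</entry>"),
    ]),
    http_part("GET /dataservice.svc/Categories(5) HTTP/1.1", {"Host": "foo.com"}),
    http_part("GET /dataservice.svc/Categories(6) HTTP/1.1", {"Host": "foo.com"}),
])
print(headers["Content-Type"])
print(body)
```

The headers and body returned by build_batch would then be sent as a single POST to /dataservice.svc/$batch.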

 

Batch Response

Now that we have seen what a request looks like, the response is pretty much the mirror image of the request: it also uses multipart/mixed, with a MIME part containing the associated HTTP response for each operation in the batch request.  The exception to this rule is the response to a ChangeSet.  Since ChangeSets are atomic, if an operation in the set fails, the response for the ChangeSet is a single HTTP response instead of a nested multipart/mixed collection of responses.
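
To give a feel for how a client might consume such a response, here is a hedged Python sketch that uses the standard library email parser to split a multipart/mixed body into its embedded HTTP responses. The sample response, its boundary string and the function names are all hypothetical; the exact response format is not specified in this post, so the sample simply mirrors the request format.

```python
from email import message_from_bytes


def parse_http_response(raw):
    """Split one embedded HTTP response into (status code, headers, body)."""
    status_line, _, rest = raw.lstrip().partition(b"\n")
    code = int(status_line.split()[1])      # e.g. b"HTTP/1.1 200 OK" -> 200
    inner = message_from_bytes(rest)        # reuse the MIME header parser for HTTP headers
    return code, dict(inner.items()), inner.get_payload()


def parse_batch_response(content_type, body):
    """Return the embedded responses of a multipart/mixed response, in order."""
    envelope = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(envelope)
    # walk() also descends into nested multipart/mixed parts (change sets).
    return [
        parse_http_response(part.get_payload(decode=True))
        for part in message.walk()
        if part.get_content_type() == "application/http"
    ]


# Hypothetical response to two query operations; not captured from a real service.
sample_body = (
    b"--batchresponse_1\r\n"
    b"Content-Type: application/http\r\n"
    b"Content-Transfer-Encoding: binary\r\n"
    b"\r\n"
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/atom+xml;type=entry\r\n"
    b"\r\n"
    b"<entry>...</entry>\r\n"
    b"--batchresponse_1\r\n"
    b"Content-Type: application/http\r\n"
    b"Content-Transfer-Encoding: binary\r\n"
    b"\r\n"
    b"HTTP/1.1 404 Not Found\r\n"
    b"\r\n"
    b"--batchresponse_1--\r\n"
)

responses = parse_batch_response("multipart/mixed; boundary=batchresponse_1", sample_body)
for status, response_headers, _body in responses:
    print(status, response_headers)
```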

 

This is already getting a bit long, so I’ll cut this off here.  In a future post we’ll walk through an end to end request + response and talk about how to cross reference operations within a batch request.  What do you think so far?  Are we overlooking/missing something?

 

Mike Flasko

ADO.NET Data Services, Program Manager

 

This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.

Comments (15)

  1. Mark Baker says:

    Batching is a *really* bad idea for many reasons.  There are other less harmful ways to accomplish the same goal for at least some kinds of operations.

    See;

    http://www.imc.org/atom-protocol/mail-archive/msg10751.html

  2. John Panzer says:

    I love it — batching is in the air.

    Mark — I responded to that Atom thread explaining why the proposed solutions were insufficient, and did not see any response.  If there is another solution that meets the requirements, it would be great for someone to write it up:

    http://www.imc.org/atom-protocol/mail-archive/msg10760.html

  3. Brian Smith says:

    How do you POST multiple inter-linked items in a single changeset?

  4. Regarding Brian’s comment, if the items are interlinked and the request is a single changeset, then why not use a regular HTTP POST or PUT (as applicable)?

  5. Mike Flasko says:

    Thanks for the feedback everyone…

    Regarding the comments asking why not just use pipelining or some other approach: we’ll post some of the requirements that didn’t seem to be covered by pipelining (an atomic group of operations without requiring state to be kept, etc.) to the AtomPub list for feedback.  We hope to have that out in the coming weeks.

    Regarding cross references: we’re thinking something pretty simple: each operation in the batch is associated with an index number (first op=0 and so on) and then a subsequent op can reference another (within its same changeset) by using $[index] as an alias for the URI of the referenced operation.  

  6. Hot Topics says:

    In their ongoing open design of ADO.NET Data Services, the team put out their thoughts and options

  7. We are very excited to announce that .NET 3.5 SP1 Beta 1 and Visual Studio 2008 SP1 Beta 1 are now available!

  8. I did a talk on Data Services at DevDays, Amsterdam last week and so I had to take a rather speedy look…

  9. Arun Boppudi says:

    HTTP/1.1 200 OK
    Content-Type: multipart/mixed; boundary=batch-response

    --batch-response
    Content-Id: <batch-1>
    Content-Type: application/atom+xml;type=entry
    Batch-Operation: POST /my/atompub/collection
    Host: example.org
    Batch-Status: 201 Created
    Location: /my/atompub/collection/entries/1
    ETag: "ABCDEFGH"

    <?xml version="1.0"?>
    <entry xmlns="…">…</entry>

    --batch
    Content-Id: <batch-2>
    Batch-Operation: DELETE /my/atompub/collection/entries/2
    Host: example.org
    Batch-Status: 412 Precondition Failed

    In the above sample Batch request, first two lines represent Batch headers.  From "--batch-response" onwards, will it come in the request body or in any other format?

  11. Bart says:

    Can you please give some more detail on how to create a batch rest request with json?

    The code in this article is copied from the MSDN documentation.

    I need to update PlaylistItems from a Playlist. I can’t do that with deep updates, so need to do this with a batch, or create a custom method. I prefer to look at the batch request.

    I use Fiddler to test the request

  12. Gaurav says:

    How can I update an entity with two links in a single PUT request? I can update all the data members of this entity except the linked members. For updating just the linked member I used the $link construct, but then how do I update both links and data members together?

    mail me at gaurav.vijaywargia@gmail.com

  13. Gaurav: You should be able to include the new link values directly in the PUT payload (assuming they point to single resources, not to sets) and the server should both update the scalar properties and rewire the links within the same operation. Did you try this?

    -pablo