Data Services Streaming Provider Series: Implementing a Streaming Provider (Part 1)

The Open Data Protocol (OData) enables you to define data feeds that also make binary large object (BLOB) data, such as photos, videos, and documents, available to client applications that consume OData feeds. These BLOBs are not returned within the feed itself (for obvious serialization, memory consumption, and performance reasons). Instead, this binary data, called a media resource (MR), is requested from the data service separately from the entry in the feed to which it belongs, called a media link entry (MLE). An MR cannot exist without a related MLE, and each MLE has a reference to the related MR. (OData inherits this behavior from the AtomPub protocol.) If you are interested in the details and representation of an MLE in an OData feed, see Representing Media Link Entries (either AtomPub or JSON) in the OData Protocol documentation.

To support these behaviors, WCF Data Services defines an IDataServiceStreamProvider interface that, when implemented, is used by the data service runtime to access the Stream that it uses to return or save the MR.  

What We Will Cover in this Series

Because it is the most straightforward way to implement a streaming provider, this initial post in the series demonstrates an IDataServiceStreamProvider implementation that reads binary data from and writes binary data to files stored in the file system by using a FileStream. MLE data is stored in a SQL Server database by using the Entity Framework provider. (If you are not already familiar with how to create an OData service by using WCF Data Services, you should first read Getting Started with WCF Data Services and the WCF Data Service quickstart in the MSDN documentation.) Subsequent posts will discuss other strategies and considerations for implementing the IDataServiceStreamProvider interface, such as storing the MR in the database (along with the MLE) and handling concurrency, as well as how to use the WCF Data Services client to consume an MR as a stream in a client application.

Steps Required to Implement a Streaming Provider

This initial blog post will cover the basic requirements for creating a streaming data service, which are:

  1. Create the ASP.NET application.
  2. Define the data provider.
  3. Create the data service.
  4. Implement IDataServiceStreamProvider.
  5. Implement IServiceProvider.
  6. Attribute the model metadata.
  7. Enable large data streams in the ASP.NET application.
  8. Grant the service access to the image file storage location and to the database.

Now, let’s take a look at the data service that we will use in this blog series.

The PhotoData Sample Data Service

This blog series features a sample photo data service that implements a streaming provider to store and retrieve image files, along with information about each photo. The following represents the PhotoInfo entity, which is the MLE in this sample data service:

[Image: the PhotoInfo entity in the data model]

This data is stored in a single PhotoInfo table in a database, which is accessed by using the Entity Framework provider. The following code defines the data service:

// This method is called only once to initialize service-wide policies.
public static void InitializeService(DataServiceConfiguration config)
{
    config.SetEntitySetAccessRule("PhotoInfo",
        EntitySetRights.ReadMultiple |
        EntitySetRights.ReadSingle |
        EntitySetRights.AllWrite);

    config.DataServiceBehavior.MaxProtocolVersion =
        DataServiceProtocolVersion.V2;
}

You can download the complete source code for this sample data service (and client application) from the MSDN Code Gallery.

Implementing IDataServiceStreamProvider

The WCF Data Services runtime relies on the IDataServiceStreamProvider implementation to get and set the stream that contains MR data. The methods of IDataServiceStreamProvider that are used to get and set the MR stream are GetReadStream and GetWriteStream, each of which returns a Stream.

Returning a Data Stream: the GetReadStream Method

The GetReadStream method is called by the data service runtime to get the stream that contains the MR that it returns to the requesting client. The entity parameter supplied by the runtime is the MLE, and data from this entity is used to retrieve the related MR data. In our implementation, the image file is retrieved from the app_data directory by using a file name that is based on the key property of the supplied entity, as seen below:

public Stream GetReadStream(object entity, string etag,
    bool? checkETagForEquality, DataServiceOperationContext operationContext)
{
    if (checkETagForEquality != null)
    {
        // This stream provider implementation does not support
        // ETag headers for media resources. This means that we do not track
        // concurrency for a media resource, and last-in wins on updates.
        throw new DataServiceException(400,
            "This sample service does not support the ETag header for a media resource.");
    }

    PhotoInfo image = entity as PhotoInfo;
    if (image == null)
    {
        throw new DataServiceException(500, "Internal Server Error.");
    }

    // Build the full path to the stored image file, which includes the entity key.
    string fullImageFilePath = imageFilePath + "image" + image.PhotoId;

    if (!File.Exists(fullImageFilePath))
    {
        throw new DataServiceException(500, "The image file could not be found.");
    }

    // Return a read-only stream that contains the requested file.
    // Without an explicit FileAccess, FileStream requests read/write access.
    return new FileStream(fullImageFilePath, FileMode.Open, FileAccess.Read);
}

In this implementation, a FileStream is created with read access to the stored image file, and this stream is returned to the data service runtime. This version of the photo service sample does not check for concurrency in the MR, so a request that supplies an ETag condition (a non-null checkETagForEquality value) is rejected, and the etag parameter is otherwise ignored. The operationContext parameter provides access to information about the request, including message headers.

Storing the Media Resource from a Filestream: the GetWriteStream Method

As you would expect, the GetWriteStream method is called by the data service to get a stream that is used to store a request from the client that contains an MR. There are two kinds of requests for which the data service uses this implementation:

  • HTTP POST – when a new MR is inserted, along with its associated MLE.
  • HTTP PUT – when the MR and MLE already exist, but the MR data is being replaced.

The GetWriteStream method implementation must support both of these write operations. While the HTTP PUT case turns out to be fairly straightforward, the HTTP POST operation is a bit more complex, for the following reasons:

  1. A protocol requirement (inherited from AtomPub) that the MR be created before the associated MLE.
  2. Maintaining transactional integrity between creation of the MR and the MLE.

Because of these complexities, it is helpful to understand the process by which the data service runtime handles a POST request. A new MR/MLE is created by the following process:

  1. The client sends a POST request that contains only the media resource; such a request looks like the following:

    POST /PhotoService/PhotoData.svc/PhotoInfo HTTP/1.1
    User-Agent: Microsoft ADO.NET Data Services
    DataServiceVersion: 1.0;NetFx
    MaxDataServiceVersion: 2.0;NetFx
    Accept: application/atom+xml,application/xml
    Accept-Charset: UTF-8
    Content-Type: image/jpeg
    Slug: myphoto.jpg
    Host: localhost
    Transfer-Encoding: chunked
    Expect: 100-continue

    <<Binary media resource data…>>

  2. The data service creates a new entity object with null properties and calls the GetWriteStream method, passing this object to the entity parameter.
    Note: When using the Entity Framework provider, your implementation must set any required properties of this entity instance, which can be done either by using data from the Slug message header or by setting default values. In our implementation, we use the contents of the Slug header to set the FileName property.

  3. The data service writes the binary data received in the POST request to the stream returned by the GetWriteStream implementation.

  4. The data service inserts the new entity into the data source. (Since we are storing PhotoInfo object data in a database, the Entity Framework provider is used to insert the row.)

  5. The data service calls IDisposable.Dispose (if implemented) on the IDataServiceStreamProvider instance.

  6. The data service returns a response that includes the newly created MLE, including any server-generated values such as identities. A response to the POST request in step 1 looks like the following:

    HTTP/1.1 201 Created
    Cache-Control: no-cache
    Content-Length: 2048
    Content-Type: application/atom+xml;charset=utf-8
    Location: https://localhost/PhotoService/PhotoData.svc/PhotoInfo(4)
    Server: Microsoft-IIS/7.0
    X-AspNet-Version: 4.0.30319
    DataServiceVersion: 1.0;
    X-Powered-By: ASP.NET
    Date: Tue, 27 Jul 2010 06:51:27 GMT

    <?xml version="1.0" encoding="utf-8" standalone="yes"?>
    <entry xml:base="https://localhost/PhotoService/PhotoData.svc/"
        xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
        xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
        xmlns="http://www.w3.org/2005/Atom">
      <id>https://localhost/PhotoService/PhotoData.svc/PhotoInfo(4)</id>
      <title type="text"></title>
      <updated>2010-07-27T06:51:27Z</updated>
      <author>
        <name />
      </author>
      <link rel="edit-media" title="PhotoInfo" href="PhotoInfo(4)/$value" />
      <link rel="edit" title="PhotoInfo" href="PhotoInfo(4)" />
      <category term="PhotoData.PhotoInfo" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
      <content type="image/jpeg" src="PhotoInfo(4)/$value" />
      <m:properties xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
          xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices">
        <d:PhotoId m:type="Edm.Int32">4</d:PhotoId>
        <d:FileName>myphoto.jpg</d:FileName>
        <d:FileSize m:type="Edm.Int32" m:null="true"></d:FileSize>
        <d:DateTaken m:type="Edm.DateTime" m:null="true"></d:DateTaken>
        <d:TakenBy m:null="true"></d:TakenBy>
        <d:DateAdded m:type="Edm.DateTime">2010-07-26T00:00:00-07:00</d:DateAdded>
        <d:DateModified m:type="Edm.DateTime">2010-07-26T00:00:00-07:00</d:DateModified>
        <d:Comments m:null="true"></d:Comments>
        <d:ContentType>image/jpeg</d:ContentType>
        <d:Exposure m:type="PhotoData.Exposure">
          <d:Aperature m:type="Edm.Decimal" m:null="true"></d:Aperature>
          <d:ShutterSpeed m:type="Edm.Int16" m:null="true"></d:ShutterSpeed>
          <d:FilmSpeed m:type="Edm.Int16" m:null="true"></d:FilmSpeed>
        </d:Exposure>
        <d:Dimensions m:type="PhotoData.Dimensions">
          <d:Height m:type="Edm.Int16" m:null="true"></d:Height>
          <d:Width m:type="Edm.Int16" m:null="true"></d:Width>
        </d:Dimensions>
      </m:properties>
    </entry>

  7. (Optional) The WCF Data Services client sends a new MERGE or PUT request (depending on the SaveChangesOptions.ReplaceOnUpdate setting) to update the newly created MLE in the data service with additional values from the client’s copy of the entity. This merge overwrites any property values set by the server, except for the entity key, and returns a 204 (No Content) response.
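
To make step 1 concrete, here is a minimal sketch of how such a POST could be assembled with the general-purpose HttpRequestMessage type. This is not the WCF Data Services client and is not part of the sample; the class and method names are hypothetical, and the request is only built, not sent.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper for illustration only.
public static class MediaResourceRequests
{
    // Builds (but does not send) a POST whose body is only the binary MR.
    public static HttpRequestMessage CreateInsertRequest(
        Uri entitySetUri, byte[] imageBytes, string contentType, string fileName)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, entitySetUri);

        // The body is the raw media resource; Content-Type describes the image.
        request.Content = new ByteArrayContent(imageBytes);
        request.Content.Headers.ContentType = new MediaTypeHeaderValue(contentType);

        // The Slug header carries the suggested file name for the new MLE.
        request.Headers.Add("Slug", fileName);

        // Ask for the newly created MLE back as an Atom entry.
        request.Headers.Accept.Add(
            new MediaTypeWithQualityHeaderValue("application/atom+xml"));

        return request;
    }
}
```

Calling CreateInsertRequest with the PhotoInfo entity set URI, the image bytes, "image/jpeg", and "myphoto.jpg" yields a request with the same Content-Type and Slug headers as the message shown in step 1.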

For a POST request, our implementation of GetWriteStream sets any non-nullable property of PhotoInfo (otherwise the Entity Framework provider raises an error) and returns a new FileStream based on a temp file.

public Stream GetWriteStream(object entity, string etag,
    bool? checkETagForEquality, DataServiceOperationContext operationContext)
{
    if (checkETagForEquality != null)
    {
        // This stream provider implementation does not support
        // ETags associated with BLOBs. This means that we do not
        // track concurrency for a media resource and last-in wins on updates.
        throw new DataServiceException(400,
            "This demo does not support ETags associated with BLOBs");
    }

    PhotoInfo image = entity as PhotoInfo;

    if (image == null)
    {
        throw new DataServiceException(500, "Internal Server Error: "
            + "the Media Link Entry could not be determined.");
    }

    // Handle the POST request.
    if (operationContext.RequestMethod == "POST")
    {
        // Set the file name from the Slug header; if we don't have a
        // Slug header, just set a temporary name which is overwritten
        // by the subsequent MERGE request from the client.
        image.FileName =
            operationContext.RequestHeaders["Slug"] ?? "newFile";

        // Set the required DateTime values.
        image.DateModified = DateTime.Today;
        image.DateAdded = DateTime.Today;

        // Set the content type, which cannot be null.
        image.ContentType =
            operationContext.RequestHeaders["Content-Type"];

        // Cache the current entity to enable us to both create a
        // key-based storage file name and to maintain transactional
        // integrity in the disposer; we do this only for a POST.
        cachedEntity = image;

        // Return a write-only stream over the existing temp file.
        return new FileStream(tempFile, FileMode.Open, FileAccess.Write);
    }
    // Handle the PUT request.
    else
    {
        // Return a stream to write to an existing file. Truncate empties
        // the file first, so a smaller replacement image does not keep
        // trailing bytes from the old one.
        return new FileStream(imageFilePath + "image"
            + image.PhotoId.ToString(),
            FileMode.Truncate, FileAccess.Write);
    }
}

Note that the portion of this implementation that handles the PUT request simply returns a FileStream that accesses the image file based on the key value. 
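
One detail worth checking in any handler that overwrites a stored file in place is the FileMode passed to the FileStream constructor, because it decides whether bytes from the old file survive. The following self-contained sketch (the class and method names are hypothetical and not part of the sample) writes new content over an existing file and reports the resulting length:

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class OverwriteDemo
{
    // Writes newContent at the start of an existing file by using the
    // given FileMode, then returns the resulting file length in bytes.
    public static long Overwrite(string path, byte[] newContent, FileMode mode)
    {
        using (var stream = new FileStream(path, mode, FileAccess.Write))
        {
            stream.Write(newContent, 0, newContent.Length);
        }

        return new FileInfo(path).Length;
    }
}
```

Writing 4 bytes over a 10-byte file with FileMode.Open leaves a 10-byte file whose last 6 bytes are stale, while FileMode.Truncate leaves exactly 4 bytes. For image files, stale trailing bytes mean that a smaller replacement image is stored with leftover data from the previous one.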

After the POST operation succeeds, the temp file created by GetWriteStream is renamed to include the key value and moved to the app_data folder in the disposer, as shown by the following:

public void Dispose()
{
    // If we have a cached entity, it must be a POST request.
    if (cachedEntity != null)
    {
        // Get the new entity from the Entity Framework object state manager.
        ObjectStateEntry entry =
            this.context.ObjectStateManager.GetObjectStateEntry(cachedEntity);

        if (entry.State == System.Data.EntityState.Unchanged)
        {
            // Since the entity was created successfully, move the temp file into the
            // storage directory and rename the file based on the new entity key.
            // File.Move also removes the temp file, so no separate delete is needed.
            File.Move(tempFile, imageFilePath + "image"
                + cachedEntity.PhotoId.ToString());
        }
        else
        {
            // A problem must have occurred when saving the entity to the
            // database, so we should delete the entity and temp file.
            context.DeleteObject(cachedEntity);
            File.Delete(tempFile);

            throw new DataServiceException("An error occurred. "
                + "The photo could not be saved.");
        }
    }
}

This also enables us to maintain some level of transactional consistency between creation of the MR and MLE from a POST request when using the Entity Framework provider; the photo service deletes the new image file if the MLE fails to be created.
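
The commit-or-roll-back idea can be isolated from the data service types. The following is a minimal sketch under stated assumptions: MediaFileCommitter and its members are hypothetical names, and the entitySaved flag stands in for the Entity Framework state check that the sample performs in its disposer.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class MediaFileCommitter
{
    // Builds the key-based file name used for stored images.
    public static string GetImagePath(string storageDirectory, int photoId)
    {
        return Path.Combine(storageDirectory, "image" + photoId);
    }

    // Moves the temp file into storage when the entity was saved;
    // otherwise deletes it. Returns the stored path, or null after a rollback.
    public static string Complete(string tempFile, string storageDirectory,
        int photoId, bool entitySaved)
    {
        if (!entitySaved)
        {
            // File.Delete does not throw when the file is already gone.
            File.Delete(tempFile);
            return null;
        }

        string target = GetImagePath(storageDirectory, photoId);

        // File.Move removes the source file as part of the move, and it
        // throws an IOException if the target file already exists.
        File.Move(tempFile, target);
        return target;
    }
}
```

Keeping the MR in a temp file until the key is known avoids guessing a file name before the database assigns the identity value, and it leaves nothing behind in the storage folder when the insert fails.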

Other Implementation Details

While GetReadStream and GetWriteStream are the primary methods that we implement, the IDataServiceStreamProvider interface contains several other members that are also required.

  • DeleteStream – this method is called by the data service to delete an MR. In our data service, this method implementation uses the File.Delete method to delete the image file, as seen in the following code:

    public void DeleteStream(object entity,
        DataServiceOperationContext operationContext)
    {
        PhotoInfo image = entity as PhotoInfo;

        if (image == null)
        {
            throw new DataServiceException(500, "Internal Server Error.");
        }

        try
        {
            // Delete the requested file by using the key value.
            // File.Delete does not throw when the file does not exist.
            File.Delete(imageFilePath + "image" + image.PhotoId.ToString());
        }
        catch (IOException ex)
        {
            // An IOException here typically means that the file is in use.
            throw new DataServiceException("The image could not be deleted.", ex);
        }
    }

  • GetStreamContentType – this method is called by the data service to determine the value of the Content-Type header to set on the response message. In our implementation, the content type is stored in the entity itself:

    public string GetStreamContentType(object entity,
        DataServiceOperationContext operationContext)
    {
        // Get the PhotoInfo entity instance.
        PhotoInfo image = entity as PhotoInfo;

        if (image == null)
        {
            throw new DataServiceException(500,
                "Internal Server Error.");
        }

        return image.ContentType;
    }

  • ResolveType – returns the namespace-qualified name of the type that is the MLE, which in our implementation looks like this (since we only handle one MLE type):

    public string ResolveType(string entitySetName,
        DataServiceOperationContext operationContext)
    {
        // We should only be handling PhotoInfo types.
        if (entitySetName == "PhotoInfo")
        {
            return "PhotoService.PhotoInfo";
        }
        else
        {
            // This will raise a DataServiceException.
            return null;
        }
    }

  • GetReadStreamUri – we return null to let the data service provide the default URI for the MR.

  • GetStreamETag – this version of the sample does not support the ETag header, which means that we do not track concurrency in the MR. This means that last-in always wins on updates.

  • StreamBufferSize – this property returns the size, in bytes, of the buffer to use for stream operations; our sample always returns 64000 (a stream buffer of roughly 64 KB).

For more information about implementing this interface, see Streaming Provider (WCF Data Services) in the MSDN documentation.

Implementing IServiceProvider

Whenever a custom data service provider (of which IDataServiceStreamProvider is one) is implemented in a WCF Data Service, the IServiceProvider interface should also be implemented. IServiceProvider provides the data service runtime with a specific provider implementation, in our case the IDataServiceStreamProvider implementation that returns a PhotoServiceStreamProvider instance. The following code implements the GetService method required by IServiceProvider :

public object GetService(Type serviceType)
{
    if (serviceType == typeof(IDataServiceStreamProvider))
    {
        // Return the stream provider to the data service.
        return new PhotoServiceStreamProvider(this.CurrentDataSource);
    }
    return null;
}

Attributing the Model Metadata

In order for the data service to return MR data as a stream, it has to know for which entity to invoke IDataServiceStreamProvider. An MLE is identified by using the HasStream attribute. When using the reflection provider or a custom data service provider, HasStreamAttribute is applied to the entity type that is the MLE. Because our sample data service uses the Entity Framework provider, we must instead manually apply the HasStream attribute to the PhotoInfo entity in the CSDL portion of the .edmx file that represents our data model, as seen in the following XML fragment:

[Image: XML fragment showing the HasStream attribute applied to the PhotoInfo entity type]
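
As a rough sketch of what such a fragment typically looks like, the attribute lives in the data services metadata namespace, which must be declared before it can be used. The key and property shown here are inferred from the PhotoInfo entry earlier in this post, and the remaining properties are omitted, so treat the details as illustrative rather than as the exact contents of the sample's .edmx file.

```xml
<!-- Illustrative only: the xmlns:m declaration and the m:HasStream
     attribute are the parts that mark PhotoInfo as a media link entry. -->
<EntityType Name="PhotoInfo" m:HasStream="true"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <Key>
    <PropertyRef Name="PhotoId" />
  </Key>
  <Property Name="PhotoId" Type="Int32" Nullable="false" />
  <!-- Remaining PhotoInfo properties omitted. -->
</EntityType>
```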

For more information about using the Entity Framework data provider, see Entity Framework Provider (WCF Data Services) in the MSDN documentation.

The HasStream attribute is also included in the model metadata returned by the service to client applications. When the WCF Data Services client finds the HasStream attribute on the PhotoInfo entity, it also applies HasStreamAttribute to the generated PhotoInfo class on the client.

Enabling Large Data Streams

The WCF Data Services runtime hosts a basic implementation of Windows Communication Foundation (WCF) that it uses for HTTP messaging. Because WCF limits the size of data streams, we must configure the data service endpoint to enable large streams. The following element configures our photo service to receive images up to 500 KB:

<system.serviceModel>
  <serviceHostingEnvironment aspNetCompatibilityEnabled="true"/>
  <services>
    <!-- The name of the service -->
    <service name="PhotoService.PhotoData">
      <!-- You can leave the address blank or specify your endpoint URI. -->
      <endpoint binding="webHttpBinding"
          bindingConfiguration="higherMessageSize"
          contract="System.Data.Services.IRequestHandler">
      </endpoint>
    </service>
  </services>
  <bindings>
    <webHttpBinding>
      <!-- Configure the maxReceivedMessageSize value to suit the maximum size
           of the request (in bytes) that you want the service to receive. -->
      <binding name="higherMessageSize" maxReceivedMessageSize="500000"/>
    </webHttpBinding>
  </bindings>
</system.serviceModel>

To accept binary streams larger than 500KB, you must set a larger value for the maxReceivedMessageSize attribute.
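
For example, assuming a hypothetical 5 MB limit, the relevant settings might look like the following sketch. Because the service is hosted in ASP.NET, the maxRequestLength attribute of the httpRuntime element (expressed in kilobytes, 4096 by default) may also need to be raised once requests exceed 4 MB. Both values here are placeholders to adapt to your own requirements.

```xml
<bindings>
  <webHttpBinding>
    <!-- Hypothetical limit: 5 * 1024 * 1024 = 5242880 bytes. -->
    <binding name="higherMessageSize" maxReceivedMessageSize="5242880"/>
  </webHttpBinding>
</bindings>

<system.web>
  <!-- maxRequestLength is expressed in KB: 5 * 1024 = 5120. -->
  <httpRuntime maxRequestLength="5120"/>
</system.web>
```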

Granting Access to the IIS Process

The IDataServiceStreamProvider implementation uses a FileStream to access image files in both the temp file directory and the Web site’s app_data folder. We must grant at least modify access on those folders to the account under which the data service process runs. The same account also needs access to the database: a login must be created for it on the server, and that login must be granted access to the database. The SQL script included in the streaming sample project creates the necessary logins in the server and grants access to the database.

Accessing a Photo as a Binary Stream from the Photo Data Service

At this point, our photo data service is ready to return image files as a stream. We can access the PhotoInfo feed in a Web browser. Sending a GET request to the URI https://localhost/PhotoService/PhotoData.svc/PhotoInfo returns the following feed (with feed-reading disabled in the browser):

[Image: the PhotoInfo feed displayed in the Web browser]

Note that the PhotoInfo(1) entry has a content element of type image/jpeg and an edit-media link, both of which reference the relative URI of the media resource (PhotoInfo(1)/$value).

When we browse to this URI, the data service returns the related MR file as a stream, which is displayed as a JPEG image in the Web browser, as follows:

[Image: the requested JPEG image displayed in the Web browser]

In the next post in the series, we will show how to use the WCF Data Services client to not only access media resources as streams, but to also create and update image files by generating POST and PUT requests against this same data service.

Glenn Gailey
Senior Programming Writer
WCF Data Services