WSDAPI 101 Part 2: The stack, explained through a concrete example

This is the second article in the WSDAPI 101 series. You'll get the most out of it by starting at the beginning, although the first article isn't required to follow this one.

The first article in the WSDAPI 101 series presented an overview of the actors involved in Web Services (contracts, messages, clients, and services) and described how they interact to make network communication straightforward.  However, that high-level view doesn't describe the nuts and bolts, and that's what we'll get into in this article.

An example scenario: networked webcam
I promised a concrete example, so here it is: an IP-connected webcam.  This webcam talks Web Services (specifically, the Devices Profile for Web Services, or DPWS) and is a service that deals with clients, messages, and contracts in the ways outlined in the previous article.  Specifically, the webcam service exposes a webcam contract, which describes a handful of messages that can be passed between a client and the webcam service.

I'll skip over a lot of the detail (wait for a later article on WSDL) but the webcam service looks approximately like a C++ class or C# interface, which exposes methods that allow you to invoke webcam functionality.  The most important operation would be GetWebcamImage(image *imageOut) which grabs the most recent frame, but you can imagine there are other minor operations like SetWebcamResolution(int xResolution, int yResolution) and so forth.  Clients send messages to the service that clearly indicate that they're issuing GetWebcamImage or SetWebcamResolution requests.

In this example, we'll be writing a client application that exercises the functionality on this camera.  The fundamentals are identical if you're writing a webcam service--although you'd receive calls into your service object instead of issuing calls into the Web Services stack.

The non-WS way: two layers
Before we talk about the Web Services way of solving this problem, let's take a step back and imagine that you have to implement a webcam client using only OS components, like sockets.  Let's say that your IP-connected webcam talks over raw sockets, and you have to write your own network protocol for moving data back and forth.

If you were to write an application that exercised this webcam's functionality, you'd have essentially two layers inside your application:

  • Your application layer, which is responsible for presenting a UI, managing the logic that determines when to make network calls, and handling everything needed to wire that logic into the network sockets.  It would also be responsible for rendering a webcam image into the UI or saving it to a file on disk.
  • The OS and framework components, which present the interfaces for making those raw socket calls.  For this app, you'd probably call into some Winsock functions.

Many of you would probably abstract out some of the webcam calls inside your application layer so that your high-level UI code could simply call into GetWebcamImage(image *) and SetWebcamResolution(int, int).  But you'd have to write both sides of this abstraction on your own: the friendly method on top, and the code underneath that turns it into bytes on a socket.
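
To give a feel for the lower half of that hand-written abstraction, here is a minimal sketch of what encoding the requests might involve.  Everything in it is an assumption made up for illustration: the opcode values, the big-endian layout, and the function names (Opcode, AppendUint32, EncodeSetWebcamResolution, EncodeGetWebcamImage) don't come from any real webcam protocol.  The socket calls themselves are left out; this only builds the bytes you would hand to them.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical opcodes for an invented wire protocol (illustration only).
enum class Opcode : std::uint8_t { GetWebcamImage = 1, SetWebcamResolution = 2 };

// Append a 32-bit value in network (big-endian) byte order.
void AppendUint32(std::vector<std::uint8_t>& buffer, std::uint32_t value) {
    for (int shift = 24; shift >= 0; shift -= 8) {
        buffer.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Build the request bytes for SetWebcamResolution:
// one opcode byte followed by two 32-bit parameters.
std::vector<std::uint8_t> EncodeSetWebcamResolution(int xResolution, int yResolution) {
    assert(xResolution >= 0 && yResolution >= 0);
    std::vector<std::uint8_t> request;
    request.push_back(static_cast<std::uint8_t>(Opcode::SetWebcamResolution));
    AppendUint32(request, static_cast<std::uint32_t>(xResolution));
    AppendUint32(request, static_cast<std::uint32_t>(yResolution));
    return request;
}

// GetWebcamImage takes no input parameters, so the request is just the opcode.
std::vector<std::uint8_t> EncodeGetWebcamImage() {
    return { static_cast<std::uint8_t>(Opcode::GetWebcamImage) };
}
```

Even in this toy form you have to pick a byte order, a message layout, and an opcode table, and then write the matching decoder on the device side.  That's the work the rest of this article is about avoiding.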

The Web Services way: three layers
Modern WS applications, however, typically have three layers: there's your application code (which contains only the application-specific logic), there's a generated code layer, and then there's all of the OS and framework components.  We'll go through these piece by piece.

  • Your application layer presents the same UI functionality (and app logic) but doesn't have to make socket calls.  Instead, you can call directly into the GetWebcamImage and SetWebcamResolution methods, because those are implemented by the generated layer.
  • The generated layer glues your application to the underlying components (specifically, the WS stack).  This generated layer receives calls from your application layer (e.g., GetWebcamImage(image *)), packages up the parameters, and then turns them into generic calls that the WS stack can understand.  This code is generated when you start building your application, using a service modeling tool (e.g., WsdCodeGen.exe) and a WSDL.  More on this in the next article.
  • At the bottom, the Web Services stack (which relies on OS and framework components) does the connection and messaging work.  The important thing to note is that this stack is probably a binary component, and unlike the generated code, isn't specialized for your application.  The same binary is used for your webcam app, a DPWS printer app, the app that connects to network projectors, and so forth.
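
The three layers above can be sketched roughly as follows.  This is a mock, not WSDAPI: GenericMessage, FakeStack, WebcamProxy, and ConfigureCamera are hypothetical names invented for illustration, and the "stack" just records messages instead of touching the network.  The point is only the shape: a typed method in the middle layer packages its arguments into a generic message that a webcam-agnostic bottom layer can carry.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical generic message: an operation name plus name/value parameters.
struct GenericMessage {
    std::string operation;
    std::map<std::string, std::string> parameters;
};

// Stand-in for the bottom layer. It knows nothing about webcams; it only
// moves generic messages. Here it records them rather than sending them.
class FakeStack {
public:
    // Returns true to simulate a successful round trip.
    bool SendRequest(const GenericMessage& message) {
        assert(!message.operation.empty());
        sent.push_back(message);
        return true;
    }
    std::vector<GenericMessage> sent;
};

// Stand-in for the generated layer: a typed method that packages its
// arguments into a generic message and hands it to the stack.
class WebcamProxy {
public:
    explicit WebcamProxy(FakeStack& stack) : stack_(stack) {}

    bool SetWebcamResolution(int xResolution, int yResolution) {
        GenericMessage message;
        message.operation = "SetWebcamResolution";
        message.parameters["xResolution"] = std::to_string(xResolution);
        message.parameters["yResolution"] = std::to_string(yResolution);
        return stack_.SendRequest(message);
    }

private:
    FakeStack& stack_;
};

// Application layer: only app logic, no message building and no sockets.
bool ConfigureCamera(WebcamProxy& webcam) {
    return webcam.SetWebcamResolution(640, 480);
}
```

Notice that FakeStack would work unchanged for a printer or projector proxy; only the middle class is specific to the webcam contract.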

Do bear in mind that this description of "Web Services implementations" is a generalization.  There are a million different ways to build Web Services stacks, and not all of them look exactly like this.  However, WSDAPI follows this pattern, and Windows Communication Foundation in .NET is very similar.

Enough overview. What do these interfaces look like?
If your app is built on WSDAPI, here's how the interfaces would lay out.

  • At the very top you'd have your application layer--and it exposes whatever interface you want.  This could be a GUI, or it could be a programming interface.  Either way, it's up to you.
  • The generated code exposes a class that represents your webcam, and exposes methods that each correspond to the operations that are available on the webcam.  This interface is defined by the WSDL that describes the contract that the webcam exposes.  The resulting C++ class definition for this interface would probably look like this:
    class webcam : public IWebcam
    {
    public:
        // Grabs the most recent frame from the webcam.
        HRESULT GetWebcamImage( /* [out] */ image *imageOut );
        HRESULT SetWebcamResolution( /* [in] */ int xResolution,
                                     /* [in] */ int yResolution );

        // Code follows to manage IUnknown
        ...
    };
  • Lastly, these generated methods call into WSDAPI, which presents raw interfaces for sending messages.  Specifically, the generated implementations for GetWebcamImage and SetWebcamResolution would both call into the IWSDEndpointProxy::SendTwoWayRequest method.  Inside IWSDEndpointProxy, everything is hidden--but since DPWS traffic takes place over HTTP, you can imagine that SendTwoWayRequest will eventually call into some HTTP interfaces and so forth.

And that's it!  As you can see, the bulk of the complicated messaging work is handled by the Web Services stack, and the generated code layer lets you use that stack without having to deal with building generic parameters that the stack can understand.  It's very powerful, and lets you build networked applications very quickly.

The next article will dive deeper into the generated code layer to explain what's involved in turning these simple method calls into generic messaging calls that WSDAPI (or other WS stacks) can understand.  See you next time!