MS Open Tech publishes HTML5 Labs prototype of a Customizable, Ubiquitous Real Time Communication over the Web API proposal

Prototype with interoperability between Chrome on a Mac and IE10 on Windows

From:

Martin Thomson
Senior Architect, Skype, Microsoft Corp.

Bernard Aboba
Principal Architect, Lync, Microsoft Corp.

Adalberto Foresti
Principal Program Manager, Microsoft Open Technologies, Inc.

The hard work continues at the W3C WebRTC Working Group where we collectively aim to define a standard for customizable, ubiquitous Real Time Communication over the Web. In support of our earlier proposal, Microsoft Open Technologies, Inc., (MS Open Tech) is now publishing a working prototype implementation of the CU-RTC-Web proposal on HTML5Labs to demonstrate a real world interoperability scenario – in this interop case, voice chatting between Chrome on a Mac and IE10 on Windows via the API.

By publishing this working prototype in HTML5 Labs, we hope to:

  • Clarify the CU-RTC-Web proposal with interoperable working code so others can understand exactly how the API could be used to solve real-world use cases.
  • Show what level of usability is possible for Web developers who don’t have deep knowledge of the underlying networking protocols and interface formats.
  • Encourage others to show working example code that shows exactly how their proposals could be used by developers to solve use cases in an interoperable way.
  • Seek developer feedback on how the CU-RTC-Web addresses interoperability challenges in Real Time Communications.
  • Provide a source of ideas for how to resolve open issues with the current draft API as the CU-RTC-Web proposal is cleaner and simpler.

Our earlier CU-RTC-Web blog described critical requirements that a successful, widely adoptable Web RTC browser API will need to meet:

  • Honoring key web tenets – The Web favors stateless interactions that do not saddle either party of a data exchange with the responsibility to remember what the other did or expects. Doing otherwise is a recipe for extreme brittleness in implementations; it also raises considerably the development cost, which reduces the reach of the standard itself.
  • Customizable response to changing network quality – Real time media applications have to run on networks with a wide range of capabilities varying in terms of bandwidth, latency, and packet loss. Likewise, these characteristics can change while an application is running. Developers should be able to control how the user experience adapts to fluctuations in communication quality. For example, when communication quality degrades, the developer may prefer to favor the video channel, favor the audio channel, or suspend the app until acceptable quality is restored. An effective protocol and API should provide developers with the tools to tailor the application response to the exact needs of the moment.
  • Ubiquitous deployability on existing network infrastructure – Interoperability is critical if WebRTC users are to communicate with the rest of the world with users on different browsers, VoIP phones, and mobile phones, from behind firewalls and across routers and equipment that is unlikely to be upgraded to the current state of the art anytime soon.
  • Flexibility in its support of popular media formats and codecs as well as openness to future innovation – A successful standard cannot be tied to individual codecs, data formats or scenarios. They may soon be supplanted by newer versions that would make such a tightly coupled standard obsolete just as quickly. The right approach is instead to support multiple media formats and to bring the bulk of the logic to the application layer, enabling developers to innovate.

CU-RTC-Web extends the media APIs of the browser to the network. Media can be transported in real time to and from browsers using standard, interoperable protocols.

clip_image002

The CU-RTC-Web first starts with the network. The RealtimeTransportBuilder coordinates the creation of a RealtimeTransport. A RealtimeTransport connects a browser with a peer, providing a secured, low-latency path across the network.

At the network layer, CU-RTC-Web demonstrates the benefits of a fully transparent API, providing applications with first class access to this layer. Applications can interact directly with transport objects to learn about availability and utilization, or to change transport characteristics.

The CU-RTC-Web RealtimeMediaStream is the link between media and the network. RealtimeMediaStream provides a way to convert the browsers internal MediaStreamTrack objects – an abstract representation of the media that might be produced by a camera or microphone – into real-time flows of packets that can traverse networks.

Rather than using an opaque and indecipherable blob of SDP: Session Description Protocol (RFC 4566) text, CU-RTC-Web allows applications to choose how media is described to suit application needs. The relationship between streams of media and the network layer they traverse is not some arcane combination of SDP m= sections and a=mumble lines. Applications build a real-time transport and attach media to that transport.

Microsoft made this API proposal to the W3C WebRTC Working Group in August 2012, and revised it in October 2012, based on our experience implementing this prototype. The proposal generated both positive interest and healthy skeptical concern from working group members. One common concern was that it was too radically different from the existing approach, which many believed to be almost ready for formal standardization. It has since become clear, however, that the existing approach (the RTCWeb protocol and WebRTC APIs specifications) is far from complete and stable, and needs considerable refinement and clarification before formal standardization and before it’s used to build interoperable implementations.

The approach proposed in CU-RTC Web also would allow for existing rich solutions to more easily adopt and support the eventual WebRTC standard. A good example is Microsoft Lync Server 2013 that is already embracing Web technologies like REST and Hypermedia with a new API called the Microsoft Unified Communications Web API (UCWA see https://channel9.msdn.com/posts/Lync-Developer-Roundtable-UCWA-Overview). UCWA can be layered on the existing draft WebRTC API, however it would interoperate more easily with WebRTC implementations if the standard adopted would follow a cleaner CU-RTC-Web proposal.

The prototype can be downloaded from HTML5Labs here. We look forward to receiving your feedback: please comment on this post or send us a message once you have played with the API, including the interop scenario between Chrome on a Mac and IE10 on Windows.

We’re pleased to be part of the process and will continue to collaborate with the working group to close the gaps in the specification in the coming months as we believe the CU-RTC-Web proposal can provide a simpler and thus more easily interoperable API design.