XPS: One format to rule them all (or maybe not)


One format or multiple formats?

Some would have you believe that one format does everything everyone will ever need. Some feel that, by extension, two similar formats, or two formats that have some degree of overlap, must be 'competitive' (or designed to be so) because "one format is all you need". On the other side of the fence is the viewpoint that you should use a highly tuned format that's optimized for a specific application… sometimes so optimized that it's a pain for anyone else to use.

I happen to fall into a more pragmatic group: try and use the best format for the job where ‘best’ is a combination of things like features, performance, interoperability, cost, ubiquity etc.

So where does XPS fit? XPS is designed to be an excellent format for electronic paper. It’s also designed not to address the wider requirements for electronic documents. There are some good reasons why that design intent makes sense, and I’ll dig into some of them in this post.

What does XPS address?

As I said, XPS is designed to be an excellent format for electronic paper. By electronic paper we mean a digital representation of the content that you can print to, or scan from, physical paper. XPS supports the graphical primitives necessary for a great reproduction of paper content – basic stuff like graphics, text and images. But the way XPS supports these primitives is extremely robust. If you print something on physical paper you have everything you need to view the content (you don't need to go and find the fonts in a different drawer in your filing cabinet, for example). XPS does something similar: all the resources needed to render the document are contained within the document.
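That self-contained packaging falls out of XPS being a ZIP-based container: pages, fonts and images travel as parts inside one file. Here's a deliberately simplified sketch of the idea – the part names are illustrative only, and a real XPS package also carries content-type and relationship parts that this toy omits:

```python
import io
import zipfile

# Build a toy "package" in memory: page markup plus the font and image
# resources it references, all inside a single ZIP container. (Real XPS
# packages follow the Open Packaging Conventions; this only shows the
# "everything travels together" property.)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
    pkg.writestr("Documents/1/Pages/1.fpage",
                 "<FixedPage><!-- page markup referencing the resources below --></FixedPage>")
    pkg.writestr("Resources/Fonts/F0.odttf", b"\x00fake-font-bytes")
    pkg.writestr("Resources/Images/I0.jpg", b"\xff\xd8fake-jpeg-bytes")

# Anyone who receives the file can enumerate every part it needs to render.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as pkg:
    parts = sorted(pkg.namelist())
print(parts)
```

The point is that, like a sheet of printed paper, the recipient never has to go hunting for a missing resource: if the document arrived, so did everything needed to render it.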

What doesn’t XPS address?

By definition, if XPS is designed to support electronic paper, there’s a whole slew of capabilities that it doesn’t support. These include flashy things like animation, video, audio, dynamic content, scripting or macros. Interestingly, many people think that XPS needs to include these capabilities to be ‘competitive’. The problem with that is it assumes XPS is designed to compete with something that has those capabilities. It also misses the strengths that not supporting more features can provide.

Less is More

So how can supporting fewer features make XPS more valuable? Well, here's a few...

  • XPS includes only what you can print. This is a critical feature. By not including capabilities that can’t be printed, XPS ensures consistency between soft display and hard copy output. It also avoids potential temporal differences from including animation, video or other forms of dynamic content and thereby provides a greater level of trust between different viewers.
  • Implementation cost. By limiting the feature set XPS limits the implementation cost. This has two big benefits. Firstly, low cost of implementation promotes interoperability, encouraging an ecosystem of software and devices and ensuring that users are not locked into a single vendor's products. In the case of XPS, it also makes it possible for people to easily roll their own solutions. Secondly, low implementation cost significantly increases the quality of implementations. Quality does matter - more on that below.
  • Threats. The more features you add the more threats you have to mitigate. As operating systems and web browsers become hardened against malicious attacks, the bad people go looking for other widely deployed technologies to try and compromise. Technologies that are designed with content sharing in mind are an obvious target. By limiting the feature set and expressly excluding capabilities that require programmatic code within the format, XPS significantly reduces the threat surface and makes it easier to implement the format securely. This is significant when you consider that document devices are becoming deeply integrated into enterprise networks, while at the same time becoming more powerful and intelligent.
  • Simplicity. Formats that support lots of features often reach a point where people start defining sub-formats to scope the format down to a set of features that work for a certain scenario. Sometimes this is to reduce implementation costs for unnecessary features, sometimes it is because the use of some features would break certain scenarios (as a silly example, if asked to print content encoded as MP3 audio). This becomes a real problem when exposed to end users – you know, the people who have real work to do and who don’t spend all day thinking about document formats and the like :-). With end users multiple sub-formats become very confusing. How do you communicate which sub-format is required? How do users find the tools and configuration options to create the right sub-format? How do you check that you have been sent the right sub-format (and if you’re worried about security, how do you check without having to download it and look inside?). Here’s an example from the world of image formats. If I say ‘send me a JPEG’ I have a much higher confidence that I’ll get what I expect than if I say ‘send me a TIFF’. You can do much more with a TIFF than a JPEG, but it’s also much easier to construct a valid TIFF that’s not suitable for a particular scenario, and ‘Send me a JPEG’ works with far more people than ‘Send me a TIFF with this set of tags but don’t use any of these tags’.
  • Electronic paper workflows work. XPS is a bridge format between electronic paper and physical paper (using the analogy of freeways in the USA, it provides the on-ramp and off-ramp). By their nature, workflows that span physical and electronic paper work best when document peripherals (scanners, printers, fax devices etc.) are included. XPS enables those types of devices to participate in workflows as first class citizens and, by not including lots of features that are irrelevant for these kinds of workflows, ensures a high-quality and consistent experience irrespective of device type or capabilities.
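The JPEG-versus-TIFF asymmetry mentioned above can be made concrete: identifying the container format is cheap (a few leading bytes), but checking a sub-format is not. A minimal sketch using only file signatures – a real validator enforcing a TIFF sub-format profile would additionally have to fetch the file and walk its tag directory, which is exactly the burden described:

```python
def sniff(first_bytes: bytes) -> str:
    """Identify a container format from its leading bytes (signature only)."""
    # JPEG streams start with the SOI marker, FF D8.
    if first_bytes[:2] == b"\xff\xd8":
        return "JPEG"
    # TIFF files start with a byte-order mark ("II" little-endian or
    # "MM" big-endian) followed by the magic number 42.
    if first_bytes[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"
    return "unknown"

print(sniff(b"\xff\xd8\xff\xe0"))  # JPEG
print(sniff(b"MM\x00*"))           # TIFF
```

Note what this can't tell you: knowing a file is TIFF says nothing about which tags it uses, so answering "is this the right sub-format?" still means downloading and parsing it.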

Implementation cost really matters

Implementation cost is a significant factor in quality. Lower cost means higher quality (or at least it typically does ;-). High quality implementations have numerous benefits, but here’s a couple of the less obvious ones to ponder…

  • Threat Surface. XPS, by design, reduces the threat surface, but you still need to appropriately protect that smaller threat surface that remains. Writing secure code is not trivial (although there are resources to help) and having a low implementation cost helps drive quality and security by enabling developers to spend more time implementing robust solutions. Further, having a simple format makes implementations less complex and therefore easier to secure.
  • Compliance and Compatibility. The more stuff there is to implement, the harder it is to get the implementation right (that applies both when you're following a specification, or specifying what you did). Mistakes happen, that's just software (and life). The problem is, mistakes mean implementations that don't meet the specification 100%. When that happens, and assuming you care about interoperability, you start having to worry about compatibility between different implementations as well as compliance with a specification. All other things being equal, having less to implement means the delta between compatibility and compliance is minimized (ideally, it's zero) which makes everyone happy. (and at this point it's worth noting that Microsoft implementations aren't always perfect, and neither are most when you look closely, so either this stuff is hard, or...)

But I want features that aren’t in XPS...

...great, there are plenty of excellent formats you can use instead. That's the point! Look at the tool box and choose the right tool (or the tool that compromises least) for the job in hand.

The exceptions

There are always exceptions, right? When I look at XPS today, and the goals of electronic paper, there are some things that are missing and that, personally, I hope get addressed at some point in the future. But they're all things that make sense within the scope of electronic paper, so you can probably guess what they are…

Comments (3)

  1. adrian ford says:

    Thanks Michael, PDF/A is a great example of the need to downscope a format that’s grown to address a broad set of scenarios.

    The identification section in the wikipedia article also hints at the point I was making with this comment "How do you check that you have been sent the right sub-format (and if you’re worried about security, how do you check without having to download it and look inside?)."

  2. Henrik Holmegaard, technical writer says:

    Hello Michael,

    With regard to the point Adrian is making, what would you say would be necessary in order that everyday endusers in the EU and the US be able to save softcopy that preserves invertibility for character content as well as for colour content?

    From a fair amount of teaching experience, almost all of it available online, I would say that trying to teach distinctions between ISO sub-standards when everyday endusers neither distinguish between characters and glyphs nor between colours and colourants is a losing game.

    There is, furthermore, the matter of ISO sub-standards for character-glyph mapping. Adobe notes that the Adobe Type 1 library is ISO 9541 and the Adobe OpenType library is ISO/IEC 14496-22, but do these ISO sub-standards say that overruling the character set in drawing is technically incorrect? The underlying issue is that the Adobe imaging model introduced in 1985 is based on a configurable name space model and not on a coded character set model. The configurable name space model was used to overrule public characters in Apple Roman in the first phase of the commercial conflict for the digital document market. John Hudson speculated to James Felici of Seybold in 2002 when the Adobe OpenType library was launched that Adobe would discontinue its past practice of overruling the coded character set in order to draw glyph alternates, but Adobe did not. At Linotype Typotechnica in April 2007, Thomas Phinney stated that Adobe would continue to make and market Adobe OpenType products that can draw glyph alternates from private ‘charactoids’ in ISO-IEC 10646/Unicode.

    Your challenge in championing PDF is first to distinguish PDF from PDF/X and PDF/A, second to distinguish PDF/X sub-standards from PDF/A, and third to distinguish PDF/A sub-standards from PDF/X. Your challenge in championing PDF is furthermore that Adobe’s imaging model does not depend on preserving the author’s content represented in the standard coded character set intact. The consequence of this is that Adobe is not interested in stating publicly that the type products it has made and marketed in hundreds and hundreds of millions of copies cannot conceivably be claimed to comply with the best practices of the intelligent composition model that inserts a protective separation between character processing and glyph processing or with the precept that direct drawing is destructive drawing expressed in ISO-IEC Technical Report 15285, An Operational Model for Characters and Glyphs.

    It was interesting to talk to you in Boston at the AGFA kickoff for DRUPA 2000.

    Cheers,

    Henrik
