Binary XML: A Stillborn Spec.


I read with continued dismay another article about binary XML and the bizarre Fast Infoset project. Tim Bray must be crying in bed at night over Sun’s continued insistence that ASN.1 should have been the data interchange format for the world. It lost; XML won, and XML works. Go and solve other, more useful problems. You only have to look at the failure of XML 1.1, which split the XML standard into incompatible versions, to see where binary XML is headed. CPUs continue to fall in price, and if needs dictate, XML processing in hardware will take off to solve the problems of processing time. We do not need yet another proprietary data format.

Just as bad is the W3C’s continued mismanagement: creating working groups such as the Binary Characterization Working Group to try to determine all the known use cases that a binary format could apply to. Given that the W3C “owns” XML, this is a bit like cutting off your nose to spite your face. It is also interesting to observe that very little of significance has emerged from the W3C for around three years now. XQuery, although significant, is going to turn up very late to the party, blaming traffic. XML 1.1 has everyone running around like headless chickens wondering how to sort out a mess in which the smallest of updates causes the maximum breaking change. And XML Schema is such a confused child that no one quite knows how to approach it, for fear of having to spend years understanding its complex, irrational behavior. The W3C desperately needs to find some inspirational individuals, nay dictators, to drive some independent innovation if it intends to “ship” anything significant in the next five years.

Comments (12)

  1. Wouldn’t it be possible to have an XML root tag, belonging to a defined W3C namespace, that declares the innerXML as compressed?

    <w3c:Compression xmlns:w3c="http://www.w3c.org/CompressionSpec" method="w3c:Deflate">base64Binary Actual XML Data</w3c:Compression>

    Then XML processors will know that they must uncompress the innerXML to get the actual XML document. Since all the fuss is about the bloat of XML, and since XML is just text, which lends itself well to compression, we need do no more than use simple compression.
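    The round trip the commenter describes can be sketched with standard library tools. This is a minimal Python sketch under the commenter’s assumptions: the `w3c:Compression` element, its namespace URL, and the `w3c:Deflate` method name are hypothetical, not part of any real W3C spec.

    ```python
    import base64
    import zlib

    inner_xml = '<order id="42"><item qty="3">widget</item></order>'

    # Compress the inner document with zlib/DEFLATE, then base64-encode
    # so the bytes are safe to carry as XML character data.
    payload = base64.b64encode(zlib.compress(inner_xml.encode("utf-8"))).decode("ascii")

    wrapped = (
        '<w3c:Compression xmlns:w3c="http://www.w3c.org/CompressionSpec" '
        'method="w3c:Deflate">%s</w3c:Compression>' % payload
    )

    # A receiving processor reverses the two steps to recover the document.
    recovered = zlib.decompress(base64.b64decode(payload)).decode("utf-8")
    assert recovered == inner_xml
    ```

    One caveat: base64 inflates the compressed bytes by about a third, so the scheme only pays off for documents large enough that compression wins back more than that overhead.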

  2. While I’m not sure where Fast Infoset is going (though people are asking for it), calling it a proprietary format is FUD, and XML doesn’t solve the format problem, as it is only a meta-language.

  3. Tarzan says:

    XML Works, huh? Have you ever used it for anything serious?

    XML is fat. It’s, say, 90% fluff. Tags, extra characters, required whitespace, and so on — it all adds up to stuff I don’t need to express my data. If it doesn’t add value, get it out. Otherwise, you’re necessarily making bloatware.

    Open data formats are nice, when they’re appropriate. The problem with the openness zealots is they think everything should be transparent. It’s okay (and I’d even say necessary!) for some things to be below the implementation line.

    I don’t want to document some parts of my file formats, even if my file format purports to be self-documenting. Next thing you know, your users are partying on files and structures they don’t know anything about, you can’t implement enough validation over the stream, and your goose is cooked.

    XML is slow. Since it’s so inefficient at representing data, anything that reads or writes XML does dozens of times more I/O than it really needs to do. Because the structure is so rich and flexible, it’s not trivial to parse. Just check out Microsoft’s implementation: System.XML is the most loathed part of .NET because it’s the fattest… at load time, anyway.

    Is XML really the data interchange format for the world, then? "The world" includes plenty of connected and intelligent devices that can’t realistically support XML, and plenty of applications that by design shouldn’t.

    You’re right that Binary XML isn’t a solution to any of these problems.

  4. FixXML says:

    Instead of bitching about others trying to come up with new ideas about fixing the deficiencies of XML, why don’t you help instead? Any proposals?

    XML is deficient in:

    – Binary content (e.g. TV streams)

    – Relational content (e.g. tables)

    – Untagged content (e.g. EDI)

    Fix these three problems in XML and we won’t be asking for new formats.

  5. Shailesh says:

    I did implement binary XML in .NET.

    I believe it is definitely possible to come up with an agreed-upon binary format, which may not be optimal for every situation (neither is plain XML, nor, for that matter, any of the W3C standards!), but would definitely accelerate overall XML processing.

    In addition, it’s not only about compression; we also want the ability to exchange pre-parsed, typed data or infosets. If flat XML is not ideal, people will be forced to use some binary format anyway.

    Actually, based on my experience, infoset binarization plus compression yields a lot of benefits.

    If binary XML is not good, then it will fade away on its own.

    I think there are lots of ‘non-technical’ obstacles in arriving at an agreed solution and there are too many ‘me-too’ vendors out there who want to push their own formats.

  6. XML is probably the greatest technology ever for creating foil-ware. However, out here with the rest of us it has some serious issues.

    CPUs may be getting cheaper, but they aren’t getting any faster. A multi-threaded XML parser is yet to be seen. Even more serious is the fact that network throughput is going nowhere.

    Try serializing a 50,000-element vector of floating-point numbers in XML over an ISDN dial-up.
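    The arithmetic behind that jab is easy to check. Here is a rough Python sketch; the element names, the naive markup style, and the single 64 kbit/s ISDN channel are illustrative assumptions, not measurements from any real system.

    ```python
    import struct

    n = 50_000
    values = [float(i) * 0.1 for i in range(n)]

    # Raw binary: 8 bytes per IEEE-754 double.
    binary_size = len(struct.pack("%dd" % n, *values))

    # Naive XML: each value wrapped in an element, e.g. <v>4999.9</v>.
    xml_size = len("<vec>" + "".join("<v>%r</v>" % v for v in values) + "</vec>")

    # A single 64 kbit/s ISDN channel moves roughly 8,000 bytes per second.
    print("binary: %d bytes (~%.0f s)" % (binary_size, binary_size / 8000))
    print("xml:    %d bytes (~%.0f s)" % (xml_size, xml_size / 8000))
    ```

    Even this back-of-the-envelope version shows the XML encoding costing a multiple of the raw binary size, before any parsing cost is counted.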

  7. Anonymous says:

    Shaun O’Callaghan’s Blog » Binary XML and Web Services

  8. XML isn’t just CSV … unless you limit yourself to thinking that way.