Rant: Xml is Code


I need to rant: If one more person brags about their class library saying “you don’t have to write a line of code to do <insert cool thing here>, you just need to give it 8 pages of machine-generated XML”, I’m going to scream. My perception is that these people believe “code is hard and XML is easy; so if we can transform a problem from code to XML, we make it easy”. There’s certainly some truth to that, but my grievance is that XML is code:



  1. It looks like code.

  2. It acts like code.

  3. It’s provably computationally equivalent to code: they’re both just ASTs; they both have a syntax and semantic structure.

  4. In these cases, the XML is often sufficiently long and complicated that people want to be able to debug it and set breakpoints in it.

So let’s stop pretending that it’s not code. Clemens Vasters said “XML is the assembly language of Web 2.0”, and I think that makes a lot of sense.


Don’t get me wrong: I love XML. It solves the lexing/parsing problem, and writing parsers is annoying. It also paves the way for more useful libraries and lets you start dealing with data at a higher level (serialization, XSL transforms, validation, XPath queries, smart diffs, etc.). And if the choice is between 10 lines of XML and 10,000 lines of C++, then obviously I’m going to choose the XML. But by the time your XML files are so complicated that you need a tool to generate them for you and you’re asking for a debugger to debug them, you’ve got to drop the pretense that the xml is ‘easy’.

Comments (21)

  1. I’d agree that XML is code, and in most of the newer applications it’s not any easier than C#. I think the biggest opportunity that XML provides is to think declaratively rather than imperatively. It’s not any easier, but done right, a declarative XML grammar can be much easier to understand, if not to write.

  2. Mike Stall has decided to vent on Xml and picks up Clemens’ quote which I picked up on at TechEd….

  3. Greg says:

    XML is not code. It’s a markup language that relies on a parser (written in code) to process/transform and present it.

  4. mschaef says:

    "It’s provably computationally equivalent to code: they’re both just ASTs; they both have a syntax and semantic structure. "

    I agree with the syntax part, the semantic part, not so much. Depending on the Schema, XML might be computationally equivalent to C# 3.0, but given the wrong Schema, it might just be computationally equivalent to a list of numbers. Erik Naggum summed it up pretty nicely a few years ago: "Please remember that SGML and XML rely on an _application_ in order to imbue their syntactic verbosity with semantics of any kind. … Structure is nothing if it is all you got. Skeletons spook people if they try to walk around on their own. I really wonder why XML does not." (http://groups.google.com/group/comp.lang.lisp/browse_thread/thread/6812d19d7e252ee1/7d410e0ae791d1cb?lnk=st&rnum=1#7d410e0ae791d1cb)

    From a slightly different angle, I wrote on this myself (much less insightfully than Mr. Naggum, I’m afraid) a couple days ago.

    http://www.mschaef.com/cgi-bin/blosxom.cgi/tech/general/whither_xml.txt

    Anyway, just my two cents worth…

  5. dimkaz says:

    Xml is often worse than code.

    Every XML config file or whatever is a DSL for which we often don’t have compile-time checking, IntelliSense is limited, and every time it’s a new language that one has to learn.

    It’s the opposite from technologies like LINQ. There the goal is to have one language.

  6. jmstall says:

    Greg – XML can be declarative, but so can normal code. "Code" (e.g., C++/C#/Perl/etc.) is just a text file that requires a compiler/interpreter to process/transform it.

    What’s the difference between:

      for(i = 0; i < 10; i++) print(i);

    and:

    <for iterator="i" start="0" end="10">

     <do>

       <print><expression>i</expression></print>

     </do>

     <next>

       <assign value="i">

         <add>

           <expression>i</expression>

           <expression>1</expression>

         </add>

       </assign>

     </next>

    </for>
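A minimal sketch of what it takes to make that XML behave like the one-line loop, written in Python with the standard `xml.etree.ElementTree` module. The element meanings are assumptions made for illustration and are not part of any real schema: `end` is taken as an exclusive bound (set to 10 here to match `i < 10`), `<next>` runs after each pass through `<do>`, and the helper names `evaluate`, `execute` and `run` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The loop from the comment above, with end="10" to match `i < 10`.
PROGRAM = """
<for iterator="i" start="0" end="10">
  <do>
    <print><expression>i</expression></print>
  </do>
  <next>
    <assign value="i">
      <add>
        <expression>i</expression>
        <expression>1</expression>
      </add>
    </assign>
  </next>
</for>
"""


def evaluate(node, env):
    """Evaluate an <expression> or <add> element to an integer."""
    if node.tag == "expression":
        text = node.text.strip()
        # A name bound in the environment is a variable; anything else is a literal.
        return env[text] if text in env else int(text)
    if node.tag == "add":
        return sum(evaluate(child, env) for child in node)
    raise ValueError(f"unknown expression element: {node.tag}")


def execute(node, env, out):
    """Execute a statement element, appending printed values to `out`."""
    if node.tag == "print":
        out.append(evaluate(node[0], env))
    elif node.tag == "assign":
        env[node.get("value")] = evaluate(node[0], env)
    elif node.tag in ("do", "next"):
        for child in node:
            execute(child, env, out)
    elif node.tag == "for":
        var = node.get("iterator")
        env[var] = int(node.get("start"))
        end = int(node.get("end"))  # assumed exclusive, like `i < end`
        while env[var] < end:
            execute(node.find("do"), env, out)
            execute(node.find("next"), env, out)
    else:
        raise ValueError(f"unknown statement element: {node.tag}")


def run(xml_text):
    """Parse the XML program and return the list of values it prints."""
    out = []
    execute(ET.fromstring(xml_text.strip()), {}, out)
    return out


if __name__ == "__main__":
    for value in run(PROGRAM):
        print(value)
```

The XML parser only supplies the tree. Everything that makes the tree act like a loop lives in `evaluate` and `execute`, and that is the part someone ends up debugging.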

  7. Texrat says:

    Man’s got a point.  ; )

  8. mschaef says:

    Strictly speaking, that’s not XML. What it is is XML+some-schema-that-gives-it-the-meaning-of-a-programming-language. The only thing XML defines is the syntax, leaving all the more interesting bits to something else that might not be a formal specification at all.

    I hate to mention the language Lisp, but there are a lot of languages with Lisp-like syntaxes based on s-expressions. Some of these languages are radically different in expressive power from others. Some are standards, some are not. Just having a syntax doesn’t count for much.

  9. James says:

    Well said. If one more talking head brags about how great ASP.Net 2.0 is because you can do so much without writing any code, I’m gonna snap. All they did was move the stuff from C# to the markup. At least I could debug the C# code when it wasn’t working.

  10. jmstall says:

    mschaef – you may be right on the technicalities; but the axe swings both ways:

    Strictly speaking C# files are just a text file+some-schema-that-gives-it-the-meaning-of-a-programming-language. (Where that Schema is enforced / implemented by the C# compiler).

    Imagine if all the talking heads went around saying "our library is great, you don’t need to write any code, you just feed it text files through a processor [ie, a compiler]".

  11. Marc Brooks says:

    XML is the RPG of the web generation.

  12. Greg says:

    I think your example looks a lot like XEXPR. Am I correct?

    If so, that is a scripting language that uses Xml to express its syntax, but still requires a processing engine to mean anything.  And Xml is still just a loosely defined data structure.

    To stay simple, to me, code can process complex logic and work with many data types implicitly.

    I think we can go back and forth on this forever.  But to me, just like HTML is not code, XML is not code.

  13. Jay says:

    XML is not code.

    Code implies logic.

    You can have fully valid XML with no logic included in the text.

    You can not write a program that doesn’t have any logic included in the text of the program.

  14. mschaef says:

    "Strictly speaking C# files are just a text file+some-schema-that-gives-it-the-meaning-of-a-programming-language. (Where that Schema is enforced / implemented by the C# compiler)."

    The difference is that in the case of C#, the ‘schema’ of the language is part of the official definition of the language (in the case of C#, the official ECMA standard) and something you can assume as being part of a conforming C# implementation. With XML, you can’t assume nearly as much: just imagine what C# would be without all that detailed thinking that went into the semantics of the language (e.g.: http://blogs.msdn.com/ericlippert/archive/2006/06/27/648681.aspx). There are so many details apart from the syntax that are so important that the C# standard would be essentially vacuous if it didn’t include the semantics of the language.

    In a sense, this is part of the problem your original blog post addresses (and which I agree with). I’d argue that the cases you refer to here [1] are really just instances where you’re feeling the pain caused by XML’s lack of innate meaning. Since XML doesn’t mean anything in and of itself, you have to reimplement the whole tool chain from scratch: debugger, semantic analyzer, code generator and all. The only thing XML bought you is a lexer and most of a parser. Whoopee: I have an s-expression lexer and parser that are implemented in 800 lines of loose C code.

    [1] "But by the time your XML files are so complicated that you need a tool to generate them for you and you’re asking for a debugger to debug them, you’ve got to drop the pretense that the xml is ‘easy’."

  15. Jonathan Allen says:

    As soon as XML starts containing logic, it becomes code. Real, honest-to-god, machine-controlling code.

    Sure there is plenty of XML that isn’t code. But that doesn’t change the fact that some XML really is code.

    And having worked on a J2EE project, I can say with experience that XML code can be just as buggy as code written in any procedural language.

    What makes things worse is that there isn’t any one XML-based programming language. No, we have hundreds (thousands?) of domain-specific languages all with their own syntax and semantics.

  16. jmstall says:

    Just because something isn’t in the same class as the halting problem doesn’t mean that it’s not code.

    Take SQL queries for example. There’s a whole field of writing and debugging them. Tell somebody who’s written 10 pages of sql queries for their app that they aren’t writing code, and they’ll shoot you.

  17. Greg says:

    Interesting topic and good arguments. BUT, Xml is a Markup Language and nothing more. At best it is a loosely defined database, where Xsd defines the Schema, Constraints, etc…

    Xslt, XPath and XEXPR (which BTW use Xml for syntax) blur the lines a bit, and you could make a more valid point that expressing logic in any of those Languages MIGHT BE coding, but you will never convince me that Xml is code.

  18. Uncle Wiggly says:

    Christ, so many of you nitwits are missing the point so completely.  Dogbert is coming over to slap each of your pointy little heads.

    jmstall isn’t talking about technical definitions or processors or compilers or declarative cf. imperative (extra dodo points for that special nitwit).

    He’s stating the obvious : XML as it is being used in the real world does what code does, is as tough to build and maintain as code is, and therefore should be acknowledged to be code.  

    That’s all.

  19. Anarchy says:

    XML can certainly contain many processing instructions which are used by the parser to affect program flow, so business logic, and therefore the complexity, can be moved from code to XML.

  20. When people are asking for a debugger for language X, practically it means that the usage of language