Open XML links for 10-11-2007

Andy Updegrove's "Meanwhile, Back in Minnesota: Your Chance to Help" explains how to give feedback to the state of Minnesota on its document format legislation. The deadline for feedback is next Monday, so if you'd like to participate in the process, now's the time to do it. You might want to share your views on whether choice is a good thing or not, whether governments should mandate specific formats or set general guidelines, and related topics. The Massachusetts ETRM is a good example of one state's approach in this area. The state of Texas also did a study entitled "Estimated Two-year Net Impact to General Revenue Related Funds" that sheds some light on the costs involved in mandating document formats.

Wouter van Vugt's "Extracting data from a xml-mapped document" includes a handy XSLT for extracting the custom XML data from WordprocessingML. I've previously covered how custom markup works in WordprocessingML. Although it has some unique benefits, one downside relative to custom XML parts is that the business data is interspersed with the Open XML markup. Wouter's sample transformation helps to hide that messy detail by pulling the business data back out on its own.
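
To make the "interspersed" problem concrete, here is a minimal Python sketch of the same idea. It is not Wouter's XSLT, just an illustration under some assumptions: the sample document, the urn:example:invoice namespace, and the function names are all made up for this example, and the sketch ignores attributes stored in w:customXmlPr. It relies only on the fact that WordprocessingML wraps mapped content in w:customXml elements whose w:element and w:uri attributes carry the custom element's name and namespace.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"

# Hypothetical document body: business data wrapped in paragraphs and runs.
SAMPLE = """<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:customXml w:uri="urn:example:invoice" w:element="invoice">
      <w:p>
        <w:customXml w:uri="urn:example:invoice" w:element="customer">
          <w:r><w:t>Contoso</w:t></w:r>
        </w:customXml>
      </w:p>
      <w:p>
        <w:customXml w:uri="urn:example:invoice" w:element="total">
          <w:r><w:t>42.00</w:t></w:r>
        </w:customXml>
      </w:p>
    </w:customXml>
  </w:body>
</w:document>"""


def _append_text(elem, text):
    """Add text to elem, after its last child if it already has children."""
    if len(elem):
        last = elem[-1]
        last.tail = (last.tail or "") + text
    else:
        elem.text = (elem.text or "") + text


def extract_custom_xml(node, parent=None, roots=None):
    """Walk WordprocessingML and rebuild only the custom XML structure."""
    if roots is None:
        roots = []
    current = parent
    if node.tag == W + "customXml":
        # w:element is the custom element's name, w:uri its namespace.
        name = node.get(W + "element")
        uri = node.get(W + "uri")
        tag = "{%s}%s" % (uri, name) if uri else name
        if parent is None:
            current = ET.Element(tag)
            roots.append(current)
        else:
            current = ET.SubElement(parent, tag)
    elif node.tag == W + "t" and parent is not None and node.text:
        # Run text inside a custom element becomes that element's text.
        _append_text(parent, node.text)
    for child in node:
        extract_custom_xml(child, current, roots)
    return roots


roots = extract_custom_xml(ET.fromstring(SAMPLE))
for root in roots:
    print(ET.tostring(root, encoding="unicode"))
```

The paragraphs, runs, and text elements are dropped, and only the invoice, customer, and total elements with their text survive. A real document would also need handling for custom attributes and for block-level versus inline mappings, which is where a tested transformation like Wouter's earns its keep.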

By the way, if you're interested in creating documents with custom schemas attached using Microsoft Word (no programming involved), MSDN has a how-to article entitled "Create an XML document based on a custom Schema" that takes you through the steps involved.

Guy Creese has done some informal "IBM Lotus Symphony Performance Tests" to assess the performance of IBM's recently announced open-source suite ...

I put up a post last week about IBM's new Lotus Notes Symphony office software suite, saying that based on an article in PC World, it seemed to be sloow in loading and a significant consumer of system resources. In short, the free software had some hidden costs. Shazaam, I got a ping from IBM Analyst Relations along the lines of, mmm, a few facts are not correct and how about a briefing on the product?

Fair enough, I thought. I'm still waiting for that to occur. But in the meantime, I figured I'd download the software and try it out myself, so I could ask some intelligent questions during the briefing. At a summary level, here's what I found, when running the software on a Pentium 4 with 2 GB of memory:

  • On average, an IBM Lotus Notes Symphony app (Beta 1) takes three to four times as long to load as the comparable Microsoft Office 2003 product (with some significant outliers: e.g., 15 and 33 times as long).
  • An IBM Lotus Notes Symphony app (Beta 1) consumes more CPU at load time than the comparable Microsoft Office 2003 product.
  • An IBM Lotus Notes Symphony app (Beta 1) consumes three to five times more memory than the comparable Microsoft Office 2003 product.

It will be interesting to hear Guy's thoughts after the analyst briefing.

And finally, speaking of performance, Zeth posted a comparison of file sizes for a few document formats on Command Line Warriors this week. He created a table showing the size of a document that contains only "Hello World" in each format:

Format  Application              File Size (bytes)
.txt    Emacs 21.4.1             11
.abw    Abiword 2.4.6            2517
.odt    OpenOffice Writer 2.20   6674
.doc    Microsoft Word 2003 SP2  24064

Just to extend this research a bit, here are the results with the two editors I use most often:

Format  Application              File Size (bytes)
.docx   Microsoft Word 2007      9870
.txt    Notepad 6.0              11
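
For a rough sense of scale, the sizes can be divided by the 11 bytes of actual text. The short Python sketch below does only that arithmetic, using the figures from the two tables; the names SIZES, BASELINE, and overhead_ratios are just labels chosen for this example.

```python
# File sizes in bytes, copied from the two tables in this post.
SIZES = {
    (".txt", "Emacs 21.4.1"): 11,
    (".abw", "Abiword 2.4.6"): 2517,
    (".odt", "OpenOffice Writer 2.20"): 6674,
    (".doc", "Microsoft Word 2003 SP2"): 24064,
    (".docx", "Microsoft Word 2007"): 9870,
    (".txt", "Notepad 6.0"): 11,
}

# The payload itself: "Hello World" is 11 bytes of ASCII text.
BASELINE = len("Hello World".encode("ascii"))


def overhead_ratios(sizes, baseline):
    """Return each file size as a multiple of the raw text size."""
    return {key: size / baseline for key, size in sizes.items()}


ratios = overhead_ratios(SIZES, BASELINE)
for (ext, app), ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{ext:<6}{app:<25}{ratio:8.1f}x")
```

By this measure the .docx file is roughly 900 times the size of the raw text and the .doc file roughly 2,200 times. With a payload this small, the numbers mostly reflect each format's fixed overhead, not how it scales with real documents.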