Convert OpenXML files to HTML on SharePoint


The SharePoint team recently posted an article on OpenXMLdeveloper.org about how they let you convert files in the Office Open XML format into HTML directly on the server: http://openxmldeveloper.org/articles/MOSSconvert.aspx

When they started investigating this solution, I made sure to point them at the XSLT that we built in Office 2003 to convert the earlier version of WordprocessingML into HTML (http://blogs.msdn.com/brian_jones/archive/2005/09/30/475794.aspx).

The article goes into detail on the challenge they faced with images: OpenXML embeds images in the file by default, while HTML doesn’t allow for embedded images and has to reference them as separate files.
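
To make the image problem concrete, here is a minimal sketch of one way to pull the embedded images out of a package so that the generated HTML can point at them. This is not the SharePoint team's implementation. It relies on the fact that an OpenXML file is a ZIP package, and it assumes the images sit in a "media" folder (such as word/media/), which is the usual Word convention; a robust converter would follow the package relationship parts instead. The names extract_media and img_tag are made up for this illustration.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_media(package_path, out_dir):
    """Copy image parts out of an OpenXML package; return {part name: saved path}."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = {}
    with zipfile.ZipFile(package_path) as package:
        for name in package.namelist():
            part = PurePosixPath(name)
            # Assumption: images live in a "media" folder such as word/media/.
            # Skip directory entries and every part outside such a folder.
            if name.endswith("/") or "media" not in part.parts[:-1]:
                continue
            target = out / part.name
            target.write_bytes(package.read(name))
            saved[name] = target
    return saved


def img_tag(src, alt=""):
    """Build an <img> element that references the extracted file."""
    return '<img src="{}" alt="{}">'.format(Path(src).as_posix(), alt)
```

Calling extract_media("report.docx", "report_files") and then img_tag on each saved file name gives the HTML something to reference. The sketch flattens everything into one folder, so two parts with the same file name in different folders would overwrite each other.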

-Brian

Comments (2)

  1. John says:

    Hello Brian,

    let me first congratulate you on the completion of Office 2007. Great suite, great software. But considering the file formats, nearly all of the doubts that I have written about on this forum (and which resulted in my post becoming a blog entry) came true. Let’s see what the final situation looks like.

    1) Office converters for older versions of Office are not available for Office 97 and lower. Goal achieved – you are again forcing the users of older versions to upgrade to use the new formats – even if they are satisfied. I doubt there are huge technical reasons for this. The cause may be that if you did not support e.g. Office XP, more users would be angry. You are very smart at market decisions – so you know how many people will be harmed by your decisions, and are again using your new version as an "upgrade force tool". Not to mention OneNote 2007, which is not compatible even with its direct predecessor, OneNote 2003. Here the users are more willing to upgrade (and OneNote is not so "seen"), so you force them to upgrade without any public talk.

    2) The ODF converter is available. Its key features – it is slow, not complete, you cannot double-click an ODF file to open it in Word (!), and it does not integrate into Office well. And the users cannot contact Microsoft for support; you were smart enough to delegate responsibility to an external party. It is the necessary compromise between what the users wanted (especially in the EU) and what you had to do. If you intended to support ODF, you would integrate it as you did with PDF, XPS, or e.g. the WordPerfect filter. Microsoft is known for its user friendliness – really. One can only wonder why in this particular case ODF support is so unfriendly 🙂

    3) OpenXML is great … and again, nobody in the world can now edit a complex DOCX document created in Word with 100% (or at least 99%) fidelity. Maybe this will change in the future, but I am in doubt; I think you made sure that this will not be possible. If I am wrong, let me know when it is – e.g. when I can easily send a DOCX file to my friend without Windows and be sure that he can open it. A simple need, still no solution (what about a multi-platform open source SDK – a DLL set which would allow editing of your files :-))

    So, despite all that talk and hype, considering the file format, the only real benefit is that the specification is publicly available. Open XML is great, Office 2007 is the best suite available – without irony. But considering file compatibility and interoperability, it follows, with some exceptions, Microsoft’s traditional approach – not to open itself unless it is absolutely necessary, using the "embrace and extend" strategy, and not to contribute to the software community unless it is absolutely necessary to do so.

  2. jones206@hotmail.com says:

    Hi John, thanks.

    1) I’m sure it doesn’t seem like it, but any application you want to support brings on a large amount of additional work. The converters can actually be run in a standalone mode (i.e. you right-click on a .doc file and convert it to .docx), so if you have Office 97 you can still leverage them. They just won’t be as directly integrated as in other cases. There is also an OS limitation, where they won’t run on Win9x, and that again was because of the development costs. Supporting them on Win9x would have more than doubled the development effort. The reason for that is that the converters were built using the Office 2007 code base, and Office hasn’t been supported on Win9x for a few releases. So we would have had to add all kinds of code to account for differences in Win9x, and at the end of the day it wasn’t worth the cost given the small percentage of customers still running Win9x. The same is true for Office 97 (less than 5% if I remember correctly).

    Everything we do has a cost associated with it, and any work you take on means that you have to cut other planned work. I think the fact that we support the last three versions of Office for free is an outstanding accomplishment (it was something I pushed very hard for). I’m sorry you don’t feel the same way.

    2) The ODF converter is far from complete. I’m sure they’ll add the double-click functionality at some point (and if not you should go suggest it). The solution is 100% open source, so anyone is free to make suggestions, or actually provide the code themselves.

    3) It’s still pretty early, but we’ve already seen that the Novell folks are going to take the code from the ODF -> Open XML converters and apply it to OpenOffice. So, at some point in the next few months you’ll be able to open Open XML files on any platform that OpenOffice supports. Corel has also announced that at some point in early to mid 2007 they’ll have support for Open XML in WordPerfect. Apple worked directly on the Ecma spec, and while they play things pretty close to the vest, I hope we’ll see OpenXML support from Apple at some point next year too.

    I’m sorry you have so much pessimism, but you’re entitled to your own opinions. To be honest, I’m no longer too concerned about changing people’s minds when they have negative views on our work. I’d rather just focus on letting everyone know about the technologies and helping those folks who are interested in using the formats get off to a positive start.

    Happy Holidays

    -Brian