Microsoft Office *not* dumped by Science and Nature


I saw this article today, and wanted to make sure that folks weren’t confused about the latest with publishers such as Science and Nature. As I said last week, Murray and I have been talking with them to better understand the issues. The fundamental problem is that these publishers use some very powerful publishing tools that are built on the old binary formats. Those tools do not yet work with the new math functionality in Office 2007, and they also don’t work with the new file formats. This is something we’re working with them on, though, as the new file formats with the custom XML parts and content controls on top (which can provide meaningful structure to the documents) are a great improvement for publishers.


Now to be clear, this doesn’t mean that Microsoft Office is “dumped”; it just means that for now if you are using Office 2007, you would just need to save into the old binary format rather than the new XML format when submitting documents to these publishers. When you’re using the old binary format, you’ll be in compatibility mode which means you’ll use the old math editor rather than the new one, so the equations will also still work with their systems.


The article I referenced said this is related to MathML, but that’s not really the issue. Office 2007 actually supports MathML. The issue is that when the new math is opened in older versions, it’s rendered as a picture, since the older versions of Word don’t support all the functionality of the new math in Word 2007. Rather than downgrade the equation, we just render it as a picture so that the customer with older versions can still see the equation as the Word 2007 customer intended it to look.
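
For authors who want to check a document before submitting it, here is a rough sketch of one way to see whether a .docx file contains any of the new native equations. It relies on two assumptions: that the main document part is stored at `word/document.xml` inside the ZIP package (the usual location, though the package relationships technically decide it), and that native equations appear as `m:oMath` elements in the Office math namespace. The function name `count_native_equations` is made up for this illustration and is not part of any Office tool.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the new Office 2007 math markup (the "m:" prefix).
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"


def count_native_equations(docx):
    """Count m:oMath elements in the main part of a .docx package.

    `docx` may be a file path or a binary file-like object.
    Assumes the main part lives at word/document.xml.
    """
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    # Each native equation is one m:oMath element, whether or not it is
    # wrapped in an m:oMathPara.
    return sum(1 for _ in root.iter(f"{{{M_NS}}}oMath"))
```

A count of zero suggests the document has no new-style equations. It does not guarantee that a publisher will accept the file, so their author guidelines still apply.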


Murray has more info on this over on his blog: http://blogs.msdn.com/murrays/archive/2007/06/13/getting-word-2007-technical-files-into-publisher-pipelines.aspx


-Brian

Comments (20)

  1. That’s a good way to deal with the problem. However, a simple automated conversion tool as a plug-in might be better.

    Carmelo Lisciotto

  2. John Scholes says:

    Can you explain what you mean by "Office 2007 actually supports MathML"? I am on a UK committee looking at OpenXML. As far as I can see, for reasons which remain unclear to me, you have decided to invent your own variant of MathML. I emailed Murray various questions about that and got no reply. The waffle MS has served up on this topic so far does not advance matters.

  3. Wu MingShi says:

    Dear Brian,

    I think your post needs some clarification regarding Office 2007. From the original notes posted by the publishers and the articles you referred to, it is clear that you cannot save in XML and then downgrade, because the maths will be converted to images in the downgraded doc format, and that is not what the publishers in question want.

    You seem to suggest that authors who keep their documents in doc format throughout (i.e., documents that have never been saved as XML) will be OK, even with equations. Is that true? The publishers did not suggest this.

  4. jones206@hotmail.com says:

    John,

    Murray and I both had a few posts awhile back talking about the reasons for not using MathML directly in the file format. Instead we created a new Math format, but Word will support MathML on the clipboard. We also provide transforms for going between the two formats.

    Here’s one of my posts that talks about this more: http://blogs.msdn.com/brian_jones/archive/2006/10/12/comparison-of-openxml-math-and-mathml.aspx

    Murray is on vacation right now, which is probably why he hasn’t responded to your e-mails. Let me know if you still have questions after reading through those blog posts.

    ——————

    Wu MingShi,

    The solution is just to insert one of the old equation objects (via insert -> object) rather than the new native equations. The old equation objects work with the publishers’ processes.

    -Brian


  5. John

    > Can you explain what you mean by "Office 2007 actually supports MathML".

    I don’t speak for Microsoft (and I don’t really use Word :-) but at the user level Word 2007 does have pretty good support for MathML. The particularly nice feature is that you can just cut and paste equations as MathML from the clipboard. You can be viewing an equation in Internet Explorer/MathPlayer, right-click on it to cut the equation, then paste into Word, where it appears as an editable math zone. You can edit it there and then, if necessary, cut it (again as MathML) and paste it into some other MathML-enabled application, e.g. Maple.

    The fact that the user never sees any MathML in that transaction, and the fact that all the applications store the math in their own internal formats and just use MathML when communicating, seems fine to me; it’s the way MathML is supposed to work. XML markup geeks like me can gain pleasure from looking at the MathML that’s actually in the clipboard (and reporting bugs to Murray when it’s wrong :-) but most people never want to see the MathML anyway, just gain the benefit of the interoperability.

    For some reason the ability to store math to the clipboard as MathML is turned off by default and hidden in a few layers of menu options, but it’s there, and if you turn it on, it works quite well.

    There’s more in this vein on my blog

    http://dpcarlisle.blogspot.com/2007/04/xhtml-and-mathml-from-office-20007.html

    And probably more after next week when I’m apparently talking about this at Linz

    http://www.openmath.org/meetings/linz2007/

    David

  6. Andre says:

    gee, it is getting silly. "but the fundamental problem comes down to the fact that these publishers have some very powerful publishing tools they are using that leverage the old binary formats." Lol.

  7. John Scholes says:

    It is tough having a technical discussion with MS! 🙂 The approach is to raise red-herrings until most people get bored and then close the discussion! The post you kindly directed me to is absolutely irrelevant.

    There is much discussion about the well-known leap year issue. I agree there are things to be said on both sides of that issue, but it has nothing to do with MathML. You launch a long comment about IBM. Again absolutely irrelevant.

    There is a reference to Murray’s blog: http://blogs.msdn.com/murrays/archive/2006/10/07/MathML-and-Ecma-Math-_2800_OMML_2900_-.aspx from 7 October 2006. That lengthy post seems to have one relevant point, which is the only justification I have found from MS so far: "The main problem is that Word needs to allow users to embed arbitrary span-level material (basically anything you can put into a Word paragraph) in math zones and MathML is geared toward allowing only math in math zones."

    Yes, the W3C standards allow MathML inside XML, but not other XML inside MathML.

    The question is whether this is good enough. At first sight, it is, because you can just put several islands of MathML, each pure, rather than an impure island.

    But maybe that misses something. So could you please supply some examples where it fails?

  8. John Scholes says:

    David Carlisle. Hmmm. Thanks. I will experiment with that.

    But I have a problem to solve first. I mainly use OS X. It seemed to cost little more to buy a new laptop with Vista & Office 2007 than to buy them for an existing laptop, so I paid £300 for a new Compaq and installed Chem3D to review it. I then left the machine unused for a month or two and now cannot remember the password (I have hundreds of the damned things recorded in a secured file, but I failed to record that one).

    The software was pre-installed, so there are no disks. Unless I solve that problem soon I am installing Ubuntu on that machine and goodbye Vista, Chem3D & Office 2007!

  9. hAl says:

    Reading the blog by Murray I see that:

    "OOXML math would allow users to embed arbitrary span-level material (basically anything you can put into a Word paragraph) in math zones and MathML is geared toward allowing only math in math zones."

    That might include things like adding tags for marking changes to previous edits? As MathML is used more for web pages, where you do not generally see things like edit marks, I can imagine that would be a very big difference, especially when reviewing documents with math in them or in version control.

    Btw John, you might look at the Office 2008 beta for OS X. It is currently in private beta only, but I can imagine that a UK (ISO?) committee might easily be invited to that?

    Public beta is probably just around the corner as well.

  10. ray says:

    @John

    You are obviously not a Microsoft fan, which is fine. You’ll find the people and commenters on the Microsoft blogs very helpful and generally friendly.

    You can try to recover your Vista password using this service http://www.loginrecovery.com/

    It is free if you can wait 48 hours.

    Your laptop should also have come with a recovery partition, but you will need to boot into Windows to access the DVD writing software to create the discs: http://h10025.www1.hp.com/ewfrf/wc/genericDocument?docname=c00882383&cc=uk&dlc=en&lc=en&jumpid=reg_R1002_UKEN

    As you cannot do this you can order the disc from HP:

    http://h10025.www1.hp.com/ewfrf/wc/genericDocument?docname=c00810334&cc=uk&dlc=en&lc=en&jumpid=reg_R1002_UKEN

    Hope this helps to keep Ubuntu off your machine a while longer 🙂

  11. John Scholes says:

    ray

    Many thanks for the suggestions. I will try them later today (having finally recovered from an 18 hour broadband outage, despite having 2 separate adsl lines).

    My problem with MS historically is that it has shipped shoddy product, and played a big part in creating the myth that software cannot be expected to work properly. It can and should. To be fair, MS’ quality has substantially improved over the last 5 years (but still has a long way to go).

    I have a problem with the regulators for allowing MS to do what monopolies always do: grossly overcharge. I do not blame MS for that, any business wants to create and then preserve a monopoly. I find most of the tactics and arguments MS uses to preserve its monopoly objectionable, but that is mainly the fault of the politicians, regulators and consumers for being so dumb as to swallow the absurd nonsense that MS often gives them. If I was Brian or one of his colleagues, I would spend my time rolling around laughing at the nonsense I was getting away with 🙂 🙂

  12. John Scholes says:

    hAl

    Redlining? I still don’t see why you need the comments within the math. You hardly want the numerator of the fraction to contain lots of text. But what we need here are EXAMPLES.

    MS claim:

    (1) legacy documents can be well-represented in OOXML;

    (2) but not in ODF;

    (3) MathML is inadequate, so they need their math scheme;

    + similar assertions in other areas I am less familiar with.

    All these are bare assertions. What the ISO/IEC process needs is something to substantiate these assertions. Where are the examples? Unless they are produced soon, large numbers of people in the JTC1 process are going to conclude that MS is unable to substantiate these assertions.

  13. hAl says:

    [quote]Redlining? I still don’t see why you need the comments within the math. You hardly want the numerator of the fraction to contain lots of text. But what we need here are EXAMPLES.[/quote]

    I do not understand what you mean by redlining. What I meant is the ability to keep track of individual changes to parts of mathematical formulas, either to be able to reverse them or to keep track of what changes were made to a formula. There might be more possibilities, like adding notes within a formula and other stuff that you can do in regular text paragraphs. I actually find it strange that you cannot see the benefit of that, as those possibilities are some of the basic features of word processing, which you cannot do with MathML because it was created for web pages, which in general are not for editing but for presenting information. I can show an example of a Word footnote in a simple math formula, but it is a bit bloated by the font info:

    <m:oMathPara><m:oMath><m:r><w:rPr><w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/></w:rPr><m:t>a+b</m:t></m:r><m:r><w:rPr><w:rStyle w:val="Footnotemarking"/><w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/><w:i/></w:rPr><w:footnoteReference w:id="2"/></m:r><m:r><w:rPr><w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/></w:rPr><m:t>+c</m:t></m:r></m:oMath></m:oMathPara>

    [quote]All these are bare assertions. What the ISO/IEC process needs is something to substantiate these assertions. Where are the examples? Unless they are produced soon, large numbers of people in the JTC1 process are going to conclude that MS is unable to substantiate these assertions.[/quote]

    That would only be true if you had actually asked Ecma or Microsoft for such examples, but I guess you haven’t, because even a math idiot like me can make such examples using Word 2007. I had never used the math function in Office 2007 before, but found that you seem to be able to use most other Word functionality even within a formula.
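
To make the structure of the footnote example in comment 13 easier to see, here is a small sketch that parses a trimmed copy of that markup and lists what each math run contains. The font properties are removed for brevity and the two namespace declarations are added so the fragment parses on its own. The helper name `describe_runs` is invented for this illustration.

```python
import xml.etree.ElementTree as ET

M = "http://schemas.openxmlformats.org/officeDocument/2006/math"
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Trimmed version of the fragment from the comment (font info removed).
FRAGMENT = (
    f'<m:oMathPara xmlns:m="{M}" xmlns:w="{W}"><m:oMath>'
    '<m:r><m:t>a+b</m:t></m:r>'
    '<m:r><w:rPr><w:rStyle w:val="Footnotemarking"/></w:rPr>'
    '<w:footnoteReference w:id="2"/></m:r>'
    '<m:r><m:t>+c</m:t></m:r>'
    '</m:oMath></m:oMathPara>'
)


def describe_runs(xml_text):
    """Return a (kind, value) pair for each m:r run in the fragment."""
    root = ET.fromstring(xml_text)
    summary = []
    for run in root.iter(f"{{{M}}}r"):
        text = run.find(f"{{{M}}}t")
        footnote = run.find(f"{{{W}}}footnoteReference")
        if text is not None:
            # An ordinary math run carrying math text.
            summary.append(("math", text.text))
        elif footnote is not None:
            # A WordprocessingML footnote reference sitting inside a math run.
            summary.append(("footnote", footnote.get(f"{{{W}}}id")))
    return summary


print(describe_runs(FRAGMENT))
```

The middle run holds a `w:footnoteReference` element rather than math text, which is the kind of non-math content inside a math zone that the quoted passage from Murray's post refers to.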

  14. Paul Topping says:

    Just to clarify, Science and Nature’s authoring guidelines only have a problem with Word 2007’s new equation feature, not the Equation Editor that is included with earlier versions of Word as well as with Word 2007. Equation Editor is my company’s (Design Science) product that we have licensed to Microsoft since 1991 and is a simplified version of our MathType product. Documents containing equations created with either Equation Editor or MathType, even ones in Word 2007’s docx format should be acceptable to publishers as they can use Word 2007’s ability to save to the old .doc format to get such documents into their workflow. Of course, an author should consult with the publisher to be sure. We have issued a press release that gives more details here: http://www.dessci.com/en/company/press/070622.htm.

    Paul Topping

    President & CEO

    Design Science

  15. Dave S. says:

    hAl,

    Redline is a term common in document editing –

    Google "redlining comments"

    "Redlining" by itself often describes somewhat amoral or illegal financial operations, but that’s entirely different.

    Redlining – to mark up as with a red pen. A means of indicating where corrections or alterations to a document are desired.

    The assertion that web pages are for presentation only is naive. Web pages are for communication, just like every other method of document presentation.

    The ability to copy a formula from a web page and paste it into another application and have it work the same as it looks is much more important than having it filled with non-formula information that needs to be filtered out.

    Perhaps this is why Microsoft makes their comment balloons in Excel appear separate from the contents of the cell.

    For many formulae one needs to know far too much for it to be splattered within the formula. Einstein took about 20 pages to explain E=mc^2 in his book on relativity.

    If MS Office understood the concept of units, that would be something really powerful.

  16. John Scholes says:

    hAl

    I am not sure that you have picked a particularly good example. Maths is replete with subscripts and superscripts, so your example of a footnote reference would almost certainly be a bad idea, even if there were font/color changes – it would get mistaken for a sub/superscript.

    In any case, the middle of a formula is not a good place to put a footnote reference. Surely you would put it in the surrounding text.

  17. John Scholes says:

    Dave S has explained "redlining". I live in London, UK and it is certainly a common term over here. Maybe the US uses something different. I have spent several years in the US, but still get caught out on those kinds of things!

    I think we need some examples here. I still think one could easily break the math into islands, each containing no non-math content.

  18. John Scholes says:

    hAl

    You said that it was up to people involved in the ISO fast-track process to ask MS for the evidence on why they were ignoring existing standards, why we needed another document standard as well as ODF etc.

    Well:

    (1) MS has been asked many times, by many people in many fora;

    (2) there are claims within the Ecma standard that it is better for legacy docs, so it is up to Ecma/MS to justify these claims;

    (3) it is general ISO/IEC policy to make use of existing standards wherever possible, so it is up to Ecma/MS to justify not using MathML (even without being asked).

    You have a point that technically some of these things should be put to Ecma, not MS. But Ecma still has an obligation to justify these things, even without being asked – if it wants the draft to be approved.

  19. I think it is an excellent idea to solicit examples/counterexamples.  I have posted a blog entry (see link above) requesting examples that show either contention a) that the Ecma standard is better for legacy docs, or b) that there are docs better handled by ODF or simply not handled by either ODF or OOXML.  Both would shed some light on the general idea that OOXML is better for handling legacy MS Office docs.  Please feel welcome to send or post examples that prove or disprove either part of this contention.

  20. Sorry, the link is on my name, not "above".
