Equation Numbering Prototype

When writing the Math in Office 2010 post back in July, I could just imagine the disappointment various people would have when they discovered no mention of equation numbering. After getting math into PowerPoint, equation numbering had been the most often requested feature. Since PowerPoint 2010 now has the math facility, equation numbering has risen to the top of the wish list. Note that there is a way to display and manage equation numbering in Word as described in the work of Dong Yu. To get a feel for a native numbering facility, I implemented a prototype in RichEdit that interfaces to the underlying Page/TableServices (PTS) math handler’s equation numbering facility. This post describes that approach. It may not be the one we ship someday, but it does work pretty well. First I describe how equation numbers are represented in the file format and in memory and then consider equation number management.

 

The file format is one area a new feature has to consider. It would be nice not to change the file format for equation numbers, since it makes backward compatibility just that much harder to deal with. For example, it would be very convenient if equation numbers from a future version of Word would display reasonably well in Word 2010 or Word 2007, admittedly with small modifications to those versions. So I examined the nooks and crannies of the Office math format, OMML, for a math object that could be used to contain an equation number.

 

Lurking in that object space is a “no-op” phantom. To understand what this object is and why no one is likely to have used it, we need to know what a phantom object is. The phantom is characterized by five Boolean flags: 1) zero ascent, 2) zero descent, 3) zero width, 4) show, and 5) transparent. You can read about it in the post on MathML and Ecma Math (OMML) and in Sec. 3.17 of the linear format paper. The “zero” flags are handy for suppressing any combination of three object dimensions. Such suppression is called “smashing” by Donald Knuth. You can “smash” the descent of a character with a descender like y so that it has the same size as one without a descender like x. To facilitate entering smashes, we have the control words \asmash, \dsmash, and \hsmash to smash the ascent, descent, and width, respectively, of the phantom argument. Smashes have the show flag on, since the idea is to show the argument, but give it one or more zero dimensions. If you have the show flag off, you have a true phantom, in that space is taken up by the argument, but nothing is displayed. The fifth flag, the transparent flag, means that the argument is treated by the surrounding environment as if no object were there. So an equals sign inside a transparent phantom has math spacing appropriate for an equal sign outside a math object, apart from changes imposed by smashing various dimensions.

 

A no-op phantom is one that displays its argument as if the phantom weren’t even there: the argument has its true dimensions and it is transparent. Since the no-op phantom doesn’t serve any purpose, we might as well give it one: house an equation number. That works for the file format. Now let’s see how it works for actual display in a document.

 

The PTS math handler which ships with Office 2007 and Office 2010 has a number of callbacks for equation numbers. We just need to answer the questions appropriately and presto; the equation number will be displayed accordingly. To make sophisticated choices such as placing the numbers on the left hand side of equations instead of the right hand side, one needs to have some document properties. Ignoring such generality for the moment, we tell the PTS math handler to display the equation number flushed to the right on the last line of an equation. An equation may take up several lines and the PTS math handler can center the equation number vertically, if desired. Secondly, we place the no-op phantom as if it were an equation itself directly following the equation to be numbered. As such the no-op phantom is the only thing inside a soft paragraph, separated from its equation by a Shift+Enter (ASCII VT character). The PTS callbacks needed to be generalized slightly to deal with this special kind of “equation”, but it’s straightforward.

 

The last thing to consider is how to manage equation numbers, namely insert, delete, and edit them. Section 3.21 of the linear format paper explains how to type in an equation number, namely enter the equation followed by a # (U+0023) followed by the desired equation number text and type a [Shift]+Enter. The numbers can be edited in place. But underneath one needs renumbering and synchronization with inline equation number references. Such management could be accessed via a context-menu or math-zone acetate drop down menu. Insert and Delete would work with a single click and edit would open a dialog with options such as whether to include chapter and section numbers and whether the resulting entries should be separated by periods or hyphens. Word would likely treat the equation numbers as bookmarks so that links to them would automatically show the current number. Alternatively, the insert/delete/edit command handler could update all inline no-op phantoms to agree with the corresponding no-op phantoms inside math paragraphs.

 

MathML doesn’t have the concept of a math paragraph, but when embedded in a parent format, the math paragraph can be emulated fairly easily. Basically a math paragraph is just one or more math zones separated by soft paragraph breaks. So the equation number for a displayed MathML math zone (<math>…</math>) would follow the math zone. MathML has an <mphantom>, which suppresses the display of its argument, but keeps its size. Since we want to show equations numbers, <mphantom> isn’t useful for housing an equation number. <mpadded> has the other attributes of the OMML <phantom> except for transparent and if it just displays its argument with no changes, it could be used to house an equation number. In addition, it’s legal to add any attributes that one wants to a MathML element provided the attributes are in a private namespace, so one can be quite precise as far as a given family of implementations go. To be interoperable, MathML would have to add its own way of representing equation numbers, but hasn’t done so in MathML 2.0 or 3.0.