Directionality in Math Zones


In most places, mathematical text is written “left to right” (LTR). For example, in the expression x + y the plus is displayed to the right of the x and the y is displayed to the right of the plus. But in some Arabic locales, mathematical text is written right to left (RTL). Instead of E = mc², one would see ²cm = E, although the letters would be Arabic, not Latin.

In such RTL locales, square roots are mirrored, so that the surd symbol is reflected about the vertical axis. Similarly, integral signs are mirrored, although the circular arrows in contour integrals are not mirrored, since they pertain to the 2D complex plane, not the 2D text plane.

The Presentation MathML 3.0 specification provides for RTL math zones. In fact, it allows a dir attribute with the value “ltr” or “rtl” on the top-level <math> element as well as on <mrow>, <mstyle> and token elements like <mi>. Except in rare cases, only the <math> direction need be specified, since all the elements inside have the same directionality (see Section 3.1.5 of the MathML 3.0 specification). The specification has now gone through Last Call, so implementations of the new features are needed. Accordingly, I’m interested in implementing at least part of the RTL functionality, namely RTL math zones.
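To make the dir attribute concrete, here is a minimal Python sketch that builds the markup for x + y with the direction set only on the top-level <math> element. The helper name make_math is hypothetical, and the namespace declaration a standalone XML document would need is left out for brevity.

```python
import xml.etree.ElementTree as ET

def make_math(direction):
    """Build <math dir=...> containing x + y (hypothetical helper)."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("dir must be 'ltr' or 'rtl'")
    # Only the top-level element carries dir; the children inherit it.
    math = ET.Element("math", {"dir": direction})
    row = ET.SubElement(math, "mrow")
    for tag, text in (("mi", "x"), ("mo", "+"), ("mi", "y")):
        ET.SubElement(row, tag).text = text
    return ET.tostring(math, encoding="unicode")

print(make_math("rtl"))
```

The content of the <mrow> stays in logical order. A renderer that honors dir="rtl" would be expected to lay out the x on the right and the y on the left.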

First, consider what an LTR math zone is. This is what Word 2007 and the Office 2010 applications implement. Such a zone does display text RTL wherever Arabic or standard Hebrew characters appear adjacent to one another. But all operators and other “neutral” characters are treated as “strong LTR”, that is, they are displayed to the right of the character that precedes them. This can be quite different from a display that obeys the Unicode Bidirectional Algorithm.

Digits illustrate the difference. The digits within a number are always displayed LTR, both inside math zones and, according to the Unicode bidi algorithm, outside them, regardless of the character that precedes the number. What differs is where the number is placed. Inside an LTR math zone, a sequence of digits is displayed to the right of the character that precedes it even if that character is Arabic. According to the Unicode bidi algorithm, a number following an Arabic character is displayed to the left of the Arabic character in both LTR and RTL paragraphs. Inside normal text embedded in a math zone, the usual rules for bidi text are followed. Note that except for such text, the math-zone bidi rules are much simpler than those of the Unicode bidi algorithm, which gets quite tricky in complicated scenarios.

Perhaps you noticed the term “standard Hebrew characters” above. By this I mean all Hebrew characters except the four Hebrew letter-like math symbols ALEF SYMBOL, BET SYMBOL, GIMEL SYMBOL, and DALET SYMBOL (U+2135..U+2138). These symbols are strong LTR characters, unlike their HEBREW LETTER counterparts located in the Unicode Hebrew block (U+0590..U+05FF).
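The difference between the symbols and the letters can be checked against the Unicode character database. The following Python sketch uses the standard unicodedata module. The names SYMBOLS, LETTERS and bidi_classes are just illustrative.

```python
import unicodedata

# ALEF SYMBOL .. DALET SYMBOL (letter-like math symbols)
SYMBOLS = [chr(cp) for cp in range(0x2135, 0x2139)]
# HEBREW LETTER ALEF .. HEBREW LETTER DALET (Hebrew block)
LETTERS = [chr(cp) for cp in range(0x05D0, 0x05D4)]

def bidi_classes(chars):
    """Return the Unicode bidi class of each character."""
    return [unicodedata.bidirectional(ch) for ch in chars]

for ch in SYMBOLS + LETTERS:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"{unicodedata.bidirectional(ch)}")
```

The symbols should report bidi class L (strong left to right) and the letters class R (strong right to left).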

Analogously, in an RTL math zone and in the absence of directional overrides, operators and other neutrals are treated as strong RTL characters. A sequence of digits is still displayed LTR, but it appears on the left of the character that precedes it even if that character is Latin. Sequences of Arabic and standard Hebrew letters are RTL as usual. At least that’s how I think a typical RTL math zone should be displayed.
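The simplified rules for both kinds of math zone can be sketched in a few lines of Python. This is only an illustration of the description above, not how Word or any MathML renderer is implemented. It ignores embedded normal text, directional overrides, mirroring and two-dimensional layout. The function names are hypothetical, and the sketch assumes that every character other than an RTL letter or a digit (including each Latin letter) is laid out as a separate unit.

```python
import unicodedata

def classify(ch):
    """Map a character to 'rtl', 'digit' or 'other' by its bidi class."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in ("R", "AL"):      # Hebrew and Arabic letters
        return "rtl"
    if bidi in ("EN", "AN"):     # European and Arabic-Indic digits
        return "digit"
    return "other"               # operators, neutrals, Latin letters, ...

def split_units(text):
    """Group adjacent RTL letters and adjacent digits into runs."""
    units = []
    for ch in text:
        kind = classify(ch)
        if units and kind != "other" and units[-1][0] == kind:
            units[-1][1].append(ch)
        else:
            units.append((kind, [ch]))
    return units

def visual_order(text, zone_dir="ltr"):
    """Return the characters in left-to-right display order."""
    units = split_units(text)
    if zone_dir == "rtl":
        units = list(reversed(units))   # units advance right to left
    out = []
    for kind, chars in units:
        # RTL letter runs read right to left; digit runs stay LTR.
        out.extend(reversed(chars) if kind == "rtl" else chars)
    return "".join(out)

print(visual_order("x+12", "ltr"))   # x+12
print(visual_order("x+12", "rtl"))   # 12+x
```

In the RTL case the number 12 keeps its internal LTR order but lands to the left of the plus, as described above. In an LTR zone, an Arabic letter followed by digits comes out with the digits on its right, which is where this model departs from the Unicode bidi algorithm.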

This description of math-zone directionality is somewhat simplified compared to the generality encountered in the real world. To see some of the special cases that can happen, please read the papers by Azzeddine Lazrek:

http://www.ucam.ac.ma/fssm/rydarab/doc/unicode/amassf.doc

http://www.ucam.ac.ma/fssm/rydarab/doc/unicode/amasl.pdf

http://www.ucam.ac.ma/fssm/rydarab/doc/unicode/amdsl.pdf

http://www.ucam.ac.ma/fssm/rydarab/doc/unicode/others.pdf

http://www.ucam.ac.ma/fssm/rydarab/doc/communic/unicodem.pdf

http://www.w3.org/TR/arabic-math/

http://www.ucam.ac.ma/fssm/rydarab/

 

The following review papers are excellent sources for overviews of RTL math:

http://en.wikipedia.org/wiki/Modern_Arabic_mathematical_notation

http://www.ima.umn.edu/2006-2007/SW12.8-9.06/activities/Lazrek-Azzeddine/MathArabIMAe.pdf




Comments (4)

  1. Koby Kahane says:

    Are there any changes with regard to support for RTL text inside math zones in Office 2010?

    In my Probability class, which I type with Word 2007, the professor would often write something like P(<name of some event>)=<some expression>, where P is a probability function and some expression is Western-style LTR Math, but the name of the event is some descriptive text in Hebrew and should be RTL. However, as soon as there’s a space in <name of some event>, the first word would appear to the left and the second to the right instead of being ordered correctly as an RTL sentence.

    A workaround is to go in and out of the math zone when entering RTL phrases, but this is pretty clumsy, may cause all sorts of layout issues and of course isn’t needed when intermixing LTR text with math.

  2. MurrayS says:

    Put the Hebrew text inside double quotes to format it as Normal Text inside a math zone. Or equivalently, select the Hebrew text and click the Normal Text button on the math ribbon. Normal text is laid out using the usual bidi rules instead of LTR math zone rules.

  3. It’s great to see you working to solve the problems with Math in Word. I have a couple of other problems I had hoped were corrected in SP2.

    They are both related to using shift-enter with equations.

    Pressing shift-enter twice followed by an equation makes it impossible to save the document (not nice when you have worked up a long document). It can be saved in Word 2003 format, however. This error can occur in other situations as well.

    When entering multiple equations I use shift-enter to automatically get a new equation on the next line. If I later want to make them into one equation again, I use backspace at the start of an equation to delete the shift-enter. However, if the cursor is outside the equation when pressing backspace, the resulting equation is messed up. It looks OK, but the box around the equation is not right.

    The only solution is to delete the equations and type them again.

    I hope these issues will be addressed in Office 2010.

  4. MurrayS says:

    I tried to reproduce these Shift+Enter problems with Word 2010 Beta 2, and wasn’t able to. So I believe the problems have been fixed. Sorry for the inconvenience with Word 2007, with which I was able to reproduce the problems.