A new version of Unicode Technical Note #28, *UnicodeMath, a Nearly Plain-Text Encoding of Mathematics* is now available. It updates several topics and references and uses the name UnicodeMath instead of Unicode linear format. Since there are several math linear formats, such as Nemeth braille, [La]TeX, and AsciiMath, having the name UnicodeMath clarifies the discussion nicely. The text has been polished in other ways too and some errors have been corrected. No notational constructs have been added, so the version number is only incremented to 3.1.

Here’s a UnicodeMath example in case you don’t want to read the whole spec ☺. The formula

sin θ=(e^iθ-e^-iθ)/2i

displays as the built-up equation, with the right-hand side rendered as a two-dimensional fraction with numerator e^iθ − e^−iθ and denominator 2i.

Operators and operator precedence are used to delimit arguments. A binary minus has lower precedence than the superscript operator ^ and the fraction operator /, but a unary minus has higher precedence than ^. This approach contrasts with LaTeX and AsciiMath, which require that arguments consisting of more than one element be enclosed in {} or (), respectively. In LaTeX, the formula above is given by

\sin\theta=\frac{e^{i\theta}-e^{-i\theta}}{2i}

In AsciiMath, the formula is given by

sin theta=(e^(i theta)-e^(-i theta))/(2i)
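Returning to the precedence point above, a tiny sketch makes it concrete. The numeric values below are illustrative (my own, not from the UnicodeMath spec); higher binds tighter:

```cpp
#include <cassert>

// Illustrative precedence values: higher binds tighter. A unary minus
// outranks ^ so that e^-iθ keeps -iθ in the superscript; a binary
// minus ranks below ^ and /, so e^iθ-e^-iθ parses as (e^iθ)-(e^-iθ)
// and the whole difference becomes the numerator of /.
int precedence(char op, bool unary = false) {
    if (op == '-' && unary) return 4;  // unary minus: tightest here
    if (op == '^') return 3;           // superscript
    if (op == '/') return 2;           // fraction
    return 1;                          // binary + and -
}
```

With this ordering, sin θ=(e^iθ-e^-iθ)/2i needs no {} or () around the exponents.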

In Microsoft Office apps, you can enter Unicode symbols in UnicodeMath using the corresponding [La]TeX control words such as \theta, using names that you choose, or using symbol galleries.

Let’s start with the Nemeth braille approach used by MathSpeak, though MathSpeak itself doesn’t use braille codes. Nemeth braille uses subscript/superscript level shifters. For example,

A sub/sup level stays active until another level shifter is encountered. The level shifter to go back to the baseline is “baseline”. This is the way the blind mathematician Abraham Nemeth liked to have people speak subscripts and superscripts to him. Back in his day, computer math speech wasn’t available and people read math to him. His sub/sup speech is efficient and unambiguous. He didn’t even say *x*^{2} as “x squared”, but as “x sup 2”.

This has the advantage that if *x*^{2} is the second component of the vector **x**, it isn’t misidentified as “x squared”. Superscripts don’t always mean powers. For example, the triple scalar product **a**⋅(**b**×**c**) of the vectors **a**, **b**, and **c** is given by *ε_{ijk} a^{i} b^{j} c^{k}*, where *ε_{ijk}* is the Levi-Civita symbol (+1 for even permutations of *ijk*, −1 for odd permutations, 0 otherwise).

Furthermore, the level of nested subscripts/superscripts is always clear with level shifters. On the other hand, saying “e to the minus x squared” gives the meaning of that expression without any parsing. A more verbose version of the Nemeth approach is to say “superscript” and “subscript” instead of “sup” and “sub”. Saying the complete words is helpful at first. But as you get familiar with it, the three-letter abbreviations are faster and easier to follow. Too much verbiage gets in the way of comprehending math.

Except for Nemeth himself, the references linked to at the start of this post all say *x*^{2} as “x squared” and *x*^{3} as “x cubed”. Superscripts as indices aren’t common and a little AI could recognize them. ClearSpeak says *x ^{n}* as “x to the nth power”, while I prefer “x to the n”. “nth” requires localization, whereas ‘n’ alone does not. In my lectures on physics over the years, I don’t think I ever added the word “power”. Although grammatically correct, it wastes time, and being grammatically correct isn’t necessarily a goal for math speech. Math speech wants to be efficient and unambiguous, but some degree of abbreviation helps convey the semantics more efficiently. In fact, mathematics owes a significant part of its success to its concise notations. A side benefit of using abbreviated speech is that localization is simplified: you don’t have to worry much about word order differences and declensions.

If you don’t use the sub/sup/base level shifters, how do you handle compound subscripts and superscripts unambiguously? The various math linear formats except for Nemeth braille all handle compound scripts using tree structures such as TeX’s a^{b_2} or UnicodeMath’s a^(b_2) for *a ^{b₂}*. One could speak these characters, but it’s better to speak what they represent since {} and () are used for a variety of syntactic purposes and may be nested. Accordingly, one can say “a to the b sub 2 end sup”. Here UnicodeMath’s ‘(‘ is replaced by “to the” and the ‘)’ is replaced by “end sup”. For

Numeric fractions like ¼ are spoken as “one fourth” and simple fractions like a/b are spoken as “a over b”. A fraction is compound if it contains one or more operators with lower precedence than division, such as (a+b)/c. For compound fractions, the beginning and end of the fraction need to be spoken to differentiate between expressions like a/(b+c) and a/b + c. If you say “a over b plus c” it means a/b + c, since we adopt the usual convention that division has higher precedence than addition. It also helps to pause a bit before saying “plus c”.

In the spirit of announcing the start and end of compound entities, one might want to speak a compound numerator as “numerator…end numerator” and a compound denominator as “denominator…end denominator”. But both ClearSpeak and MathSpeak prefer to speak a compound fraction as

“start fraction <numerator> over <denominator> end fraction”.

This is similar to TeX’s notation “{<numerator>\over<denominator>}” and to the Nemeth braille fraction ⠹ <numerator> ⠌ <denominator> ⠼ . This choice is more efficient when both numerator and denominator are compound. Both approaches allow nesting of fractions. Briefer choices include “frac…over…end frac” and “b frac…over…e frac”.
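The tree-style speech above (scripts spoken as “to the … end sup”, compound fractions as “start fraction … over … end fraction”) can be sketched as a small speech generator over a hypothetical expression-node type. The node type and helper names below are my own, not the Office math model:

```cpp
#include <cassert>
#include <string>
#include <vector>

// A minimal expression node for generating math speech.
struct Node {
    enum Kind { Leaf, Sup, Sub, Frac } kind;
    std::string text;        // for Leaf
    std::vector<Node> args;  // base/script or numerator/denominator
};

// True if speech needs explicit end markers: a compound argument.
static bool compound(const Node& n) { return n.kind != Node::Leaf; }

std::string speak(const Node& n) {
    switch (n.kind) {
    case Node::Leaf: return n.text;
    case Node::Sup: {
        std::string s = speak(n.args[0]) + " to the " + speak(n.args[1]);
        if (compound(n.args[1])) s += " end sup";  // close compound script
        return s;
    }
    case Node::Sub:
        return speak(n.args[0]) + " sub " + speak(n.args[1]);
    case Node::Frac:
        if (compound(n.args[0]) || compound(n.args[1]))
            return "start fraction " + speak(n.args[0]) + " over " +
                   speak(n.args[1]) + " end fraction";
        return speak(n.args[0]) + " over " + speak(n.args[1]);  // simple a/b
    }
    return "";
}
```

For a^(b_2) this yields “a to the b sub 2 end sup”, while a simple a/2 stays terse as “a over 2”.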

The last of these is how Abraham Nemeth liked fractions to be spoken. Furthermore, if a fraction contains another fraction, he’d say “b b frac … o o over … e e frac” for the outer fraction and “b frac…over…e frac” for the inner fraction. He’d repeat the ‘b’, ‘o’, and ‘e’ as many times as the deepest fraction’s nesting level, like stuttering. MathSpeak has a similar option that uses “start” for ‘b’, “over” for ‘o’ and “end” for ‘e’. Revealing the nesting levels is similar to the way we speak nested parentheses as “open paren”, “open second paren”, “open third paren”, and so forth as in ClearSpeak, but in opposite nesting order. MathSpeak and Nemeth Braille indicate the nesting level of square roots and other roots, but don’t give a way to indicate the nesting level of parentheses.
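The stuttered markers are easy to generate mechanically. A minimal sketch, assuming depth 1 means an innermost fraction (the function names are hypothetical):

```cpp
#include <cassert>
#include <string>

// Repeat the marker letter once per nesting depth, Nemeth-style:
// depth 2 gives "b b frac", "o o over", "e e frac".
std::string stutter(char c, int depth) {
    std::string s;
    for (int i = 0; i < depth; ++i) { s += c; s += ' '; }
    return s;
}

std::string fracMarker(const std::string& word, char c, int depth) {
    return stutter(c, depth) + word;
}
```

The fraction contents would be spoken between the markers; only the marker generation is shown.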

One ends up with a plethora of choices. Since different folks like different choices, both MathSpeak and ClearSpeak offer several speech options. Some choices can be handled by a verbosity level. But qualitatively different choices might best be handled with settings in a dialog box. Nemeth sub/sup level shifters versus tree speech of compound scripts is an example of the latter.


In addition to being the most readable linear format, UnicodeMath is the most concise. It represents the simple fraction one half by the 3 characters “1/2”, whereas typical MathML takes 62 characters (an mml:mfrac element). This conciseness makes UnicodeMath an attractive format for storing mathematical expressions and equations, as well as for keyboard entry. Another comparison is in the math structures for the Equation Tools tab in the Office ribbon. In Word, the structures are defined in OMML (Office MathML) and built up by Word, while for the other apps, the structures are defined in UnicodeMath and built up by RichEdit. The latter are much faster and the equation data much smaller. A dramatic example is the stacked fraction template (empty numerator over empty denominator). In UnicodeMath, this is given by the single character ‘/’. In OMML, it’s 109 characters! LaTeX is considerably shorter at 9 characters, “\frac{}{}”, but that’s still 9 times as long as UnicodeMath. AsciiMath represents fractions the same way as UnicodeMath, so simple cases are identical. If Greek letters or other characters that require names in AsciiMath are used, UnicodeMath is shorter and more readable.

Another advantage of UnicodeMath over MathML and OMML is that UnicodeMath can be stored anywhere Unicode text is stored. When adding math capabilities to a program, XML formats require redefining the program’s file format and potentially destabilizing backward compatibility, while UnicodeMath does not. If a program is aware of UnicodeMath math zones (see Section 3.20 of UnicodeMath), it can recover the built-up mathematics by passing those zones through the RichEdit UnicodeMath MathBuildUp function. In fact, you can roundtrip RichEdit documents containing math zones through the plain-text editor Notepad: the math zones are preserved!

As its name implies, AsciiMath uses only ASCII characters, although it converts to MathML, which has access to a much larger character set. AsciiMath is relatively simple to parse and can handle many mathematical constructs. It shares some methodology with UnicodeMath, such as eliminating the outer parentheses in fractions like (a+b)/c when converting to built-up format. AsciiMath is designed to work with a MathML renderer, such as MathJax. In Microsoft Office apps, UnicodeMath builds up to the LineServices internal math format, which is represented externally by OMML.

By default, the Office math autocorrect facility contains most [La]TeX math symbol control word definitions such as \beta for β. AsciiMath has a subset of such control words but omits the leading backslash. The user can modify such control words in the Office math autocorrect list or add them explicitly, but it’d probably be worth adding an option to make the leading backslash optional. That would speed up keyboard entry of UnicodeMath via math autocorrect. The RichEdit dll includes the UnicodeMath build up/down facility as well as converters for other math formats, such as MathML and OMML. It would be straightforward to add an option to the RichEdit UnicodeMath facility to accept AsciiMath input in general. Such an option would be handy for people who know AsciiMath.

One C++ oriented autocorrect choice in AsciiMath is that typing != enters ≠. Although I program in C++ almost every day, I think /= is a better choice for entering ≠. For one thing, using != for ≠ complicates typing in an equation like n! = n(n-1)(n-2)…1, which is the main reason we didn’t implement it. But in Office apps this equation can also be entered by typing ! = instead of !=, since math spacing rules insert space between ! and = and the RichEdit UnicodeMath facility automatically deletes a user’s space if typed there (see User Spaces in Math Zones). So, that’s an easy workaround for entering an n! equation if one wants to support != for ≠. The RichEdit UnicodeMath facility supports most Unicode negated operators by sequences of / followed by the corresponding unnegated operator as described in the post Negated Operators.
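The “/ followed by the unnegated operator” convention can be sketched as a small lookup. Only a few cases are shown here; the real RichEdit tables cover most Unicode negated operators and aren’t reproduced:

```cpp
#include <cassert>

// Map an operator to its negated Unicode counterpart, for input
// sequences like /= → ≠. A sketch covering a handful of cases.
char32_t negate(char32_t op) {
    switch (op) {
    case U'=': return U'≠';  // U+2260 not equal
    case U'<': return U'≮';  // U+226E not less-than
    case U'>': return U'≯';  // U+226F not greater-than
    case U'∈': return U'∉';  // U+2209 not an element of
    default:   return 0;     // no negated form in this sketch
    }
}
```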

<gripe> Meanwhile the C++ language should recognize ≠, ≤, ≥, and ≡ as aliases for !=, <=, >=, and ==. It seems primitive that C++ doesn’t do so in this Unicode age of computing. At least the C++ editing/debugging environments should have an option to display !=, <=, >=, and == as ≠, ≤, ≥, and ≡. </gripe>

Here’s a table with various formats for the integral

| Format | Representation |
| --- | --- |
| UnicodeMath | 1/2𝜋 ∫_0^2𝜋▒ⅆ𝜃/(𝑎+𝑏 sin𝜃 )=1/√(𝑎^2−𝑏^2 ) |
| AsciiMath | 1/(2pi) int_0^(2pi) (d theta)/(a+b sin theta)=1/sqrt(a^2-b^2) |
| LaTeX | \frac{1}{2\pi}\int_{0}^{2\pi}\frac{d\theta}{a+b\sin{\theta}}=\frac{1}{\sqrt{a^2-b^2}} |

Note that UnicodeMath binds the integrand to the integral (via the ▒ glyph), whereas AsciiMath and LaTeX don’t delimit the integrand. The Presentation MathML and OMML for this integral are too long to put into this post.

There is a unicode-math conversion package for Unicode enabled XeTeX and LuaLaTeX. The name UnicodeMath seems sufficiently different from unicode-math that there shouldn’t be any confusion between the two. The unicode-math package supports a variety of math fonts including Cambria Math, Minion Math, Latin Modern Math, TeX Gyre Pagella Math, Asana-Math, Neo-Euler, STIX, and XITS Math. Did you know there are so many math fonts?

Enjoy the new name UnicodeMath. I am, and it already appears near the end of my previous blog post, Nemeth Braille Alphanumerics and Unicode Math Alphanumerics. If you’re interested in the origin of UnicodeMath, read the post How I got into technical WP. The forerunner of UnicodeMath originated back in the early microcomputer days and had only 512 characters consisting of upright ASCII, italics, script, Greek and various mathematical symbols used in theoretical physics. Unicode 1.0 didn’t arrive until 10 years later.


For the most part, the mappings are straightforward as illustrated in the table below. But due to its generative use of type-form and alphabetic indicators, Nemeth braille encodes some math alphabets not in Unicode, e.g., Greek Script and Russian Script. Meanwhile, Unicode has math double-struck and monospace English alphanumerics, which don’t exist in Nemeth braille. Unicode also has six alphabets that aren’t mentioned in the Nemeth specification but that can be defined unambiguously with Nemeth indicators, namely bold Fraktur (Nemeth calls Fraktur “German”), bold Script, and Sans Serif bold and/or italic. The table below includes unambiguous prefixes for these alphabets chosen such that the Nemeth bold indicator precedes the italic or script indicators, and the Sans Serif indicator precedes the bold indicator. These choices correspond to the orders in which the Unicode math alphabets are named. Changes in this ordering result in alternative prefixes that are also unambiguous, but it seems simpler for implementations and users to standardize on the Unicode name ordering.

The Nemeth specification has Script Greek (in §22) as well as “alternative” Greek letters (in §23). Some of the latter may be referred to as “script”. Specifically, the Unicode math Greek italic letters 𝜃𝜙𝜖𝜌𝜋𝜅 have the alternative counterparts 𝜗𝜑𝜀𝜚𝜛𝜘, respectively. The symbol 𝜗 can be called “script theta”. Since Unicode doesn’t have a math script Greek alphabet, it makes sense to map Nemeth math script Greek letters to the alternative Greek letters, if they exist, on input and to use the Nemeth alternative notation on output. In addition, in Unicode the upper-case Θ has the alternative ϴ. In TeX and Office math, the alternative letters are identified by control words with a “var” prefix, as in \varepsilon for 𝜀 as contrasted with \epsilon for ϵ. Interestingly, modern Greek uses 𝜑 and 𝜀 instead of 𝜙 and 𝜖, but math considers the script versions to be the alternatives.

Nemeth braille has several Russian alphabets (see §22 of the Nemeth spec). These alphabets map to characters in the Cyrillic range U+0410..U+044F. Unicode has no math Russian alphabets, but italic and bold Russian alphabets can be emulated using the appropriate Cyrillic characters along with the desired italic and bold formatting. The Unicode Technical Committee, which is responsible for the Unicode Standard, has not received any proposals for adding Russian math alphabets. At least in my experience, technical papers in Russian use English and Greek letters in math zones. In Russian technical documents, this has the nice advantage of easily distinguishing mathematical variables from normal text.

Unicode has four predefined Hebrew characters in the Letterlike Symbols range U+2135..U+2138: ℵ, ℶ, ℷ, ℸ, respectively. In math contexts, it makes sense to map those Hebrew letters in Nemeth braille to the Letterlike Symbols and to map the other Nemeth Hebrew letters to characters in the Unicode Hebrew range U+05D0..U+05EA. The Unicode Technical Committee has not received any proposals for adding more Hebrew math letters so they probably won’t appear in math zones, except, perhaps, as embedded normal text.

The majority of Unicode math digits can be represented by the appropriate type-form indicator sequences in the table above followed by the numeric indicator ⠼ (if necessary) and the corresponding ASCII digits. For example, a math bold 2 (𝟐—U+1D7D0) can be represented by ⠸ ⠼ ⠆ or “_#2”. This works for the bold and/or sans-serif digits, but not for the double-struck and monospace digits, which have no Nemeth counterparts. Meanwhile Nemeth notation supports italic and bold italic digits, which aren’t in Unicode.
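Here’s a sketch of building the bold-digit sequence from the example above. The dropped-digit cell table follows the “letters shifted down one row” rule described later in this post, and the function name is my own:

```cpp
#include <cassert>
#include <string>

// Build the Nemeth cell sequence for a math bold digit: bold type-form
// indicator ⠸, numeric indicator ⠼, then the dropped digit cell.
std::u32string nemethBoldDigit(char digit) {
    // Dropped (lower-cell) Nemeth digits 0..9.
    static const char32_t cells[10] = {
        U'⠴', U'⠂', U'⠆', U'⠒', U'⠲', U'⠢', U'⠖', U'⠶', U'⠦', U'⠔'
    };
    std::u32string s;
    s += U'⠸';                // bold type-form indicator
    s += U'⠼';                // numeric indicator
    s += cells[digit - '0'];  // digit cell
    return s;
}
```

For ‘2’ this produces ⠸⠼⠆, matching the example in the text.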

Digits in some math contexts don’t need a numeric indicator, e.g., most digits in fractions, subscripts or superscripts. To optimize common numeric subscript expressions like a_{1}, the numeric indicator *and* the subscript indicator are omitted. In Nemeth ASCII braille, a_{1} is “A1” and in Nemeth braille it’s ⠁ ⠂ . The ASCII braille representation is tantalizing since variables like A1, B2, etc., are used to index spreadsheets and it would be more natural if spreadsheet indices were a_{1}, b_{2}, etc., at least for people with a mathematical background.

In general, Unicode’s math characters are simpler to work with since they can be assigned separate character codes instead of being composed as combinations of 64 braille codes. Unicode has about 2310 math characters (see Math property in DerivedCoreProperties.txt) and to distinguish all of those without indicators would require 12-dot braille! Such a system would be really hard to learn. LaTeX describes characters using control words consisting of a backslash followed by combinations of ASCII letters. That approach has mnemonic value, but it’s not as concise as the Nemeth braille character code sequences. When you get a feel for the Nemeth approach, a character’s Nemeth sequence gives a good idea of what a character is even if you haven’t encountered it before. UnicodeMath and Nemeth braille are intended to be read by human beings, whereas LaTeX and MathML are intended to be read by computer programs, notwithstanding that some TeXies can read LaTeX pretty fluently! Considering that Unicode math alphabets like double-struck and monospace aren’t yet defined in Nemeth braille, it would be worthwhile to choose appropriate type-form indicators for them. Nemeth math alphabets not in Unicode probably don’t have to be considered unless they show up in published documents.


First note that Nemeth Braille can be displayed in 6-dot ASCII Braille as shown in this table

The dots are numbered 1..6 starting from the upper left, going down to 3 and continuing with 4..6 in the second column. The letters and numbers look like themselves as do the / and (). The braille cells for 1..9 are the same as those for the letters A..I, but shifted down one row. The cells for the letters K..T are the same as those for A..J but with a lower-left dot (dot 3). Letters are lowercase unless prefixed by a cap prefix code (solo dot 6) or pair of cap prefixes for a span of uppercase letters.

A simple table lookup converts Nemeth braille codes to 8-dot Unicode Braille in the U+2800 block. The braille cells for 6-dot braille are the first 64 characters of the Unicode braille block. With a little practice you can enter braille codes into Word, OneNote, and WordPad by typing 28xx <alt+x>, where xx is the hex code given by the braille dots. To compute the code, read dots as binary 1’s and missing dots as 0’s, with dot n contributing bit n − 1. So ⠮ is 101110_{2} = 2E_{16} and the corresponding Unicode character is U+282E.
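The dots-to-binary rule is a one-liner in code. A sketch (the function name is my own):

```cpp
#include <cassert>
#include <initializer_list>

// Convert a set of braille dot numbers (1..8) to the corresponding
// Unicode character in the U+2800 block: dot n sets bit n-1.
char32_t brailleFromDots(std::initializer_list<int> dots) {
    char32_t cp = 0x2800;
    for (int d : dots) cp |= 1u << (d - 1);
    return cp;
}
```

Dots 2, 3, 4, 6 give bits 1, 2, 3, 5, i.e. 2E₁₆, so the result is ⠮ (U+282E), as in the example above.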

To get a feel for simple Nemeth braille math, consider the expression 12x^{2}+7xy-10y^{2}. In ASCII Braille it displays as

#12x^2"+7xy-10y^2_4

In Nemeth Braille it displays as

In UnicodeMath and TeX, it displays as 12x^2+7xy-10y^2.

It’s tantalizing that the superscript code ⠘ has the ASCII braille code ‘^’ used by UnicodeMath and [La]TeX. But the subscript code is ⠰, which has the ASCII braille code ‘;’ instead of the ‘_’ used by UnicodeMath and TeX. These braille codes also work differently from the UnicodeMath and TeX superscript/subscript operators in that they are script level shifters that must be “cancelled” instead of being ended. So in the formula above, the Nemeth ‘^’ for the first square is cancelled by the ‘”’, while the ‘+’ terminates the superscript for UnicodeMath and a TeX superscript consists of a single character or an expression of the form {…}. The following table compares how the three formats handle some nested superscripts and subscripts

Here to keep the Nemeth braille code sequences simple, I’ve omitted the Nemeth math italic, English-letter prefix pair ⠨ ⠰ before each math variable. Hopefully there’s a way to make math italic the default, as it is in UnicodeMath, MathML, and TeX, but I didn’t find such a mode in the full specification. A space before literary text terminates the current script level shift, that is, it initiates base level. This is also true for a space that indicates the next column in a matrix, but it’s not true for a function-argument separator as illustrated in the table below. Spaces can also be used for equation-array alignment (you need to think in terms of a fixed-width font).

Simple fractions are written in a fashion similar to TeX’s {<numerator>\over <denominator>}. For example,

or in ASCII braille as ?1/2#. The ⠹ and ⠼ work as the curly braces do in TeX fractions as in {1\over 2}. In UnicodeMath, the fraction is given by 1/2. Fractions can be laid out in a two-dimensional format emulating built-up fractions but using Nemeth braille. Nested fractions require additional prefix codes (solo dot 6). For single-line braille devices it seems worthwhile to use the linear display since the fraction delimiters can be nested to any depth. Stacked, slashed, and linear fractions can be encoded and correspond to those structures in UnicodeMath and in TeX.

The Nemeth alphabets are similar to the Unicode math alphanumerics discussed in Sections 2.1 and 2.2 of Unicode Technical Report #25. One difference is that math script and math italic variants exist for English, Greek, Cyrillic, and German (Fraktur) alphabets, whereas in Unicode math script variants are only available for the English alphabet. We may need to generalize Unicode’s coverage in this area, since TeX also has the ability to represent more math alphabets (see, for example, Unicode Math Calligraphic Alphabets).

At some point, I hope to give a listing of correspondences between UnicodeMath and Nemeth Braille. It’s a long topic, so as a start the following table gives some more examples. Note the spaces needed around the equals sign (and other relational operators), but the lack of a space between the *a* and “sin” in “*a* sin *x*”. The Nemeth notation is ambiguous with respect to using asin for arc sine.

The Unified English Braille code can handle arithmetic up through elementary algebra, but it’s not general enough to deal with Office math zones. It doesn’t get as far as the square root. So technical documents need a combination of UEB and Nemeth Braille. To see how to embed Nemeth math zones in UEB documents, see Guidance for Transcription Using the Nemeth Code within UEB Contexts.

One possible way to reduce the large number of rules governing Nemeth braille would be to use an 8-dot standard in which math operators could be encoded with the aid of bottom row dots. This would work with current technology since Braille displays let you read and enter all possible 8-dot Braille codes. In fact, dot 7 is sometimes used to change lower case into upper case, thereby not needing an upper-case prefix code (solo dot 6) for upper-case letters.
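The dot-7 uppercase trick amounts to setting bit 6 of the U+2800-block code. A minimal sketch:

```cpp
#include <cassert>

// Add dot 7 (bit 6 in the U+2800 block) to a braille cell, turning a
// lowercase letter cell into its uppercase counterpart without a
// solo dot-6 prefix.
char32_t addDot7(char32_t cell) { return cell | 0x40; }
```

So ⠁ (dot 1, letter a) becomes ⡁ (dots 1 and 7), i.e. uppercase A.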

Understand at the outset that two granularities of math speech are needed: coarse-grained, which speaks math expressions fluently in a natural language, and fine-grained, which speaks the content at the insertion point. The coarse-grained granularity is great for scanning through math zones. It doesn’t pretend to be tightly synchronized with the characters in memory and cannot be used directly for editing. It’s relatively independent of the memory math model used in applications.

In contrast, the fine-grained granularity is tightly synchronized with the characters in memory and is ideal for editing. By its very nature, it depends on the built-up memory math model (described below), which is the same for all Microsoft math-aware products, but may differ from the models of other math products. Coarse grained navigation between siblings for a given math nesting level can be done with Ctrl+→ and Ctrl+← or Braille equivalents, while fine-grained navigation is done with → and ← or equivalents. The latter allows the user to traverse every character in the display math tree used for a math zone. The coarse- and fine-grained granularities are discussed further in the post Math Accessibility Trees. In addition to granularity, it’s useful to have levels of verbosity. Especially when new to a system, it’s helpful to have more verbiage describing an equation. But with greater familiarity, one can comprehend an equation more quickly with less verbiage.

To represent mathematics linearly and unambiguously, UnicodeMath may introduce parentheses that are removed in built-up form. Speaking the introduced parentheses can get confusing since it may be hard for the listener to track which parentheses go with which part of the expression. In the simple example above of (a+b)/2, it’s more meaningful to say “start numerator a plus b end numerator over 2” than to speak the parentheses. Or to be less verbose, leave out the “start”. This idea applies to expressions that include square roots, boxed formulas and other “envelopes” that use parentheses to define their arguments unambiguously. For the UnicodeMath square-root √(a^2-b^2), it’s clearer to say “square root of a squared minus b squared, end square root” instead of “square root of open paren a squared minus b squared close paren”. This is particularly true if the square root is nested inside a denominator as in

which has the UnicodeMath 1/(2+√(a^2-b^2)). By saying “end square root” instead of “close paren”, it’s immediately clear where the square root ends. Simple fractions like 2/3 are spoken using ordinals as in “two thirds”. Also when speaking the UnicodeMath text ∑_(n=0)^∞, rather than say “sum from open paren n equal 0 close paren to infinity”, one should say “sum from n equal 0 to infinity”, which is unambiguous without the parentheses since the “from” and “to” act as a pair of open and close delimiters. This and similar enhancements are discussed in the ClearSpeak specification and in Significance of Paralinguistic Cues in the Synthesis of Mathematical Equations. Such clearer start-of-unit, end-of-unit vocabulary mirrors what’s in memory. The parentheses introduced by UnicodeMath are not in memory since the memory version uses special delimiters as explained below. Parentheses inserted by the user are spoken as “open paren” and “close paren” provided they are the outermost parentheses. Nested parentheses are spoken together with their parenthesis nesting level as in “open second paren”, “open third paren”, etc.

Such refinements can be made by processing the UnicodeMath, but some parsing is needed. It’s easier to examine the built-up version of expressions, since that version is already largely parsed. The built-up format is a *display tree* as described in the post Math Accessibility Trees. For example, to know that an exponent in the UnicodeMath equation a^2+b^2=c^2 is, in fact, a 2 and not part of a larger argument, one must check the character following the 2 to make sure that it’s an operator and not part of the exponent. If the letter z follows the 2 as in a^2z, the z is part of the superscript and the expression should be spoken as “a to the power 2z”. In memory one just checks for a single code, here the end-of-object code U+FDEF. If that code follows the 2, the exponent is 2 alone and “squared” is appropriate, unless exponents are indices as in tensor notation.
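The end-of-object check described here can be sketched directly on a simplified built-up buffer. The layout below mimics the RichEdit delimiter codes; the function name is hypothetical:

```cpp
#include <cassert>
#include <string>

// RichEdit end-of-object code.
const char32_t kEndObject = 0xFDEF;

// A superscript argument is exactly "2" (so "squared" is appropriate)
// if the object-end delimiter immediately follows the 2.
bool isSquared(const std::u32string& text, size_t exponentStart) {
    return exponentStart + 1 < text.size() &&
           text[exponentStart] == U'2' &&
           text[exponentStart + 1] == kEndObject;
}
```

For a^2 the exponent is followed by U+FDEF, so “squared” applies; for a^2z the check fails and the whole script is spoken instead.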

The built-up memory format represents mathematical objects like fraction, matrix and superscript by a start delimiter, the first argument, an argument separator if the object has more than one argument, the second argument, etc., with the final argument terminated by the object end delimiter. For example, the UnicodeMath fraction a/2 is represented in the built-up format by {_{frac} *a*|2} where {_{frac} is the start delimiter, | is the argument separator, and } is the end delimiter. Similarly a^2 is represented in the built-up format by {_{sup} *a*|2 }. Here the start delimiter is the same character for all math objects and is the Unicode character U+FDD0 in RichEdit (Word uses a different character). The type of math object is given by a rich-text object-type property associated with the start delimiter as described in ITextRange2::GetInlineObject(). The RichEdit argument separator is U+FDEE and the object end delimiter is U+FDEF. These Unicode codes are in the U+FDD0..U+FDEF “noncharacters” block reserved for internal use only.
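Given those delimiters, pulling an object’s arguments apart is a simple scan. A sketch that handles a single non-nested object (real content can nest, which would need depth tracking):

```cpp
#include <cassert>
#include <string>
#include <vector>

// RichEdit built-up delimiter codes quoted above.
const char32_t kStart = 0xFDD0, kSep = 0xFDEE, kEnd = 0xFDEF;

// Split one flat built-up object {start, arg, sep, arg, ..., end}
// into its argument strings.
std::vector<std::u32string> splitObject(const std::u32string& obj) {
    std::vector<std::u32string> args(1);
    for (size_t i = 1; i + 1 < obj.size(); ++i) {  // skip start/end codes
        if (obj[i] == kSep) args.emplace_back();   // next argument
        else args.back() += obj[i];
    }
    return args;
}
```

For the fraction a/2 stored as {start, ‘a’, separator, ‘2’, end}, this yields the two arguments “a” and “2”.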

Another scenario where the built-up format is very useful for speech is in traversing a math zone character by character, allowing editing along the way. Consider the integral

When the insertion point is at the start of the math zone, “math zone” is spoken followed by the speech for the entire math zone. But at any time the user can enter → (or Braille equivalent), which halts the math-zone speech, enters the numerator of the leading fraction, and speaks “1”. Another → and “end of numerator” is spoken. Another → and “2 pi” is spoken. Another → and “end of denominator” is spoken and so forth. In this way, the user knows exactly where the insertion point is and can edit using the usual input methods.

This approach is quite general. Consider matrices. At the start of a matrix, “*n* × *m* matrix” is spoken, where *n* is the number of rows and *m* is the number of columns. Using →, the user moves into the matrix with one character spoken for each → up until the end of the first element. At that end, “end of element 1 1” is spoken, etc. Up and down arrows can be used to move vertically inside a matrix as elsewhere, in all cases with the target character or end of element being spoken so that the user knows which element the insertion point is in.

Math variables are represented by math alphabetics (see Section 2.2 of Unicode Technical Report #25). This allows variables to be distinguished easily from ordinary text. When converted to speech text, such variables are surrounded by spaces when inserted into the speech text. This causes text-to-speech engines to say the individual letters instead of speaking a span of consecutive letters as a word. In contrast, an equation like rate = distance/time, would be spoken as “rate equals distance over time”. Math italic letters are spoken simply as the corresponding ASCII or Greek letters since in math zones math italic is enabled by default. Other math alphabets need extra words to reveal their differences. For example, ℋ is spoken as “script cap h”. Alternatively, the “cap” can be implied by raising the voice pitch.

Some special cues may be needed to convince text-to-speech engines to say math characters correctly. For example, ‘+’ may need to be given as “plus”, since otherwise it might be spoken as “and”. The letter ‘a’ may need to be enclosed in single quotes, since otherwise it may be spoken as the ‘a’ in “zebra” instead of the ‘a’ in “base”.
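Both preparations (folding math-italic letters back to plain letters surrounded by spaces so they’re spelled out, and substituting cue words for operators) can be sketched together. The cue table and function name are my own, and only the math-italic alphabets are handled:

```cpp
#include <cassert>
#include <string>

// Prepare one math-zone character for a text-to-speech engine.
std::string speechToken(char32_t ch) {
    if (ch >= 0x1D434 && ch <= 0x1D44D)  // math italic A..Z
        return std::string(" ") + char('A' + int(ch - 0x1D434)) + " ";
    if (ch >= 0x1D44E && ch <= 0x1D467)  // math italic a..z
        return std::string(" ") + char('a' + int(ch - 0x1D44E)) + " ";
    if (ch == 0x210E) return " h ";      // italic h lives at U+210E
    if (ch == U'+') return " plus ";     // else TTS may say "and"
    if (ch == U'=') return " equals ";
    return std::string(1, char(ch));     // ASCII passthrough sketch
}
```

The surrounding spaces keep the engine from running consecutive variables together into a word; ordinary text like “rate” would bypass this per-character path.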

Another example of how the two speech granularities differ is in how math text tweaking is revealed. First, let’s define some ways to tweak math text. You can insert extra spaces as described in Sec. 3.15 of the UnicodeMath paper. Coarse-grained speech doesn’t mention such space but fine-grained speech does. More special kinds of tweaking are done by inserting phantom objects. Five Boolean flags characterize a phantom object: 1) zero ascent, 2) zero descent, 3) zero width, 4) show, and 5) transparent. Phantom objects insert or remove precise amounts of space. You can read about them in the post on MathML and Ecma Math (OMML) and in Sec. 3.17 of the UnicodeMath paper. The π in the upper limit of the integral above is inside an “h smash” phantom, which sets the π’s width to 0 (smashes the horizontal dimension). Notice how the integrand starts at the start of the π. Coarse-grained speech doesn’t mention this and other phantom objects and only includes their contents if the “show” flag is set. Fine-grained speech includes the start and end entities as well as the contents. This allows a user to edit phantom objects just like the 22 other math objects in the LineServices math model.

The approaches described here produce automated math speech; the content creator doesn’t need to do anything to enable math speech. But it’s desirable to have override capability, since the heuristics used may not apply or the content author may prefer an alternate phrasing.

]]>As explained in the post Flyweight RichEdit Controls, an important design criterion was to make plain-text editing fast and small. Accordingly, Christian Fortini’s original model for the text pointers into the RichEdit 2.0 backing store gave priority to plain-text controls. Since RichEdit would also be used for rich-text controls, the design had to accommodate rich text as well. The first attempt was the double-diamond, multiple-inheritance, text pointer hierarchy

CTxtSelection    →    CTxtRange    →    CTxtPtr
       ↑                   ↑                ↑
CRchTxtSelection → CRchTxtRange → CRchTxtPtr

Here the CTxtPtr class manipulates the Unicode plain text in the memory backing store, the CTxtRange class manipulates ranges of plain text, and CTxtSelection is a CTxtRange with added user-interface functionality such as keyboard and mouse handling. The rich-text row in the hierarchy adds the ability to manipulate text runs with different character and paragraph formatting. I implemented this hierarchy back in 1995 partly as an exercise in learning C++. Up to then, the only major C++ feature not in C that I had used was operator overloading, for handling complex arithmetic elegantly and efficiently in quantum optics calculations.

The double-diamond hierarchy worked. Nevertheless, it seemed overly complex, so one weekend Alex Gounares simplified it to the simple single-inheritance model

CTxtSelection → CTxtRange → CRchTxtPtr

in which CRchTxtPtr contains a CTxtPtr text-run pointer along with similar run pointers for character formatting and paragraph formatting. The resulting riched20.dll went from 145KB down to 90KB! (Now it’s 2.5 MB!) There was a bunch of hidden overhead in the multiple-inheritance hierarchy. For sufficiently simple text, the single-inheritance model didn’t instantiate any formatting runs, which boosted performance for plain text, a goal of the original model. Ironically, the double-diamond inheritance hierarchy turned out to be a bad approach from a functional point of view as well, since a multilingual plain-text editor needs multiple fonts to handle multiple scripts, and proofing tools need some text-run character formatting. As such, any international plain-text editor must have at least some degree of richness.
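The containment idea can be sketched as follows; the class names echo RichEdit’s, but the members and the lazy allocation shown here are a simplified illustration:

```cpp
#include <memory>
#include <string>
#include <vector>

// A rich text pointer *contains* a plain-text run pointer plus optional
// formatting run pointers that aren't allocated until rich formatting is
// actually used, so sufficiently simple text stays cheap.
class CTxtPtr {                 // walks the plain-text backing store
public:
    explicit CTxtPtr(std::string text) : _text(std::move(text)) {}
    size_t Length() const { return _text.size(); }
private:
    std::string _text;
};

struct FormatRun { size_t cch; int iFormat; };

class CRchTxtPtr {              // contains, rather than inherits from, CTxtPtr
public:
    explicit CRchTxtPtr(std::string text) : _rpTX(std::move(text)) {}
    bool HasCharFormatRuns() const { return _rpCF != nullptr; }
    void ApplyCharFormat(int iFormat)
    {   // formatting runs materialize only on first use
        if (!_rpCF)
            _rpCF = std::make_unique<std::vector<FormatRun>>();
        _rpCF->push_back({_rpTX.Length(), iFormat});
    }
private:
    CTxtPtr _rpTX;
    std::unique_ptr<std::vector<FormatRun>> _rpCF; // char-format runs, lazy
};
```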

RichEdit 2.0 also shipped with Version 1 of the Text Object Model (TOM). This object model includes the ITextSelection and ITextRange interfaces. CTxtSelection inherits from CTxtRange, since it’s adding UI functionality to a range. Meanwhile ITextSelection inherits from ITextRange. So how can CTxtSelection inherit from ITextSelection without another diamond? For RichEdit up through version 5, we would have had

ITextSelection  →  ITextRange
       ↑               ↑
CTxtSelection → CTxtRange → CRchTxtPtr

The single-inheritance solution for the ranges was to have CTxtRange inherit from ITextSelection and return E_NOTIMPL for the selection-specific UI methods. This gives the simplified inheritance

CTxtSelection → CTxtRange → ITextSelection, CRchTxtPtr

RichEdit 6.0 added several more TOM interfaces including ITextRange2 and ITextSelection2. To avoid diamond inheritance, ITextRange2 inherits from ITextSelection and ITextSelection2 inherits from ITextRange2. Unlike ITextSelection, ITextSelection2 doesn’t add any methods to ITextRange2. Starting with RichEdit 6.0, CTxtRange inherits from ITextSelection2 and CTxtSelection continues to inherit from CTxtRange. CTxtRange also inherits from CRchTxtPtr, which has some virtual methods, but the overhead for switching “this” pointers is substantially less than it would be with diamond inheritance.
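Here’s a stripped-down sketch of the E_NOTIMPL pattern, with simplified stand-in interfaces and methods rather than the actual TOM ones:

```cpp
#include <cstdint>

using HRESULT = int32_t;
constexpr HRESULT S_OK      = 0;
constexpr HRESULT E_NOTIMPL = static_cast<HRESULT>(0x80004001);

// Stand-in for a combined range/selection interface like ITextSelection.
struct ITextSelectionLike {
    virtual HRESULT GetText()  = 0;  // range-level method
    virtual HRESULT TypeText() = 0;  // selection-only UI method
    virtual ~ITextSelectionLike() = default;
};

// The range implements the selection interface but stubs out the UI methods.
class CTxtRange : public ITextSelectionLike {
public:
    HRESULT GetText()  override { return S_OK; }
    HRESULT TypeText() override { return E_NOTIMPL; } // no UI on a plain range
};

// The selection overrides the UI methods with real behavior.
class CTxtSelection : public CTxtRange {
public:
    HRESULT TypeText() override { return S_OK; }
};
```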

There are other C++ areas with surprise overhead. Smart pointers have become popular since they don’t need explicit cleanup, even when exceptions are thrown. But despite clever operator overloading, smart pointers involve a new language to learn and result in extra steps in debugging. Smart pointers are built into C++/CX for Windows Universal apps, which uses ^ to indicate a smart pointer. Meanwhile, in spite of a plethora of new notation, templates, and classes, C++ operators are still mired in the ASCII world, using <= for ≤ and != for ≠. Why not accept these well-defined operators as aliases for the original ASCII operator sequences?

Some habits don’t result in code bloat but can slow down reading and code maintenance. Some people treat C++ like Lisp, sticking in quantities of unnecessary parentheses and curly braces. When there’s more syntactic sugar to read, you have to wade through it. Mathematics is successful in part due to its conciseness (although one shouldn’t be concise to the point of inscrutability). One good technique in writing functions is to return as soon as the result or an error is found. Often you find code where the only return is at the end of a function, which may force the code to be deeply nested in hard-to-follow curly braces.
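For instance, the early-return style looks like this; the validation rules are made up for the example:

```cpp
#include <cctype>
#include <string>

// Return as soon as an error (or the answer) is known, instead of nesting
// the happy path in deeper and deeper braces.
bool IsValidIdentifier(const std::string& name)
{
    if (name.empty())
        return false;                    // error found: bail out immediately
    if (!std::isalpha(static_cast<unsigned char>(name[0])) && name[0] != '_')
        return false;                    // must start with a letter or '_'
    for (char ch : name)
        if (!std::isalnum(static_cast<unsigned char>(ch)) && ch != '_')
            return false;                // only letters, digits, '_'
    return true;                         // reached only on success
}
```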

There’s a moral to be learned from the RichEdit text pointer design: keep things simple and easy to read. Avoid multiple inheritance (except for interfaces) unless it dramatically improves your model. And in any event, avoid diamond inheritance!

]]>

More than one kind of tree is possible, and this post compares two kinds using the equation

We label each tree node with its math text in the linear format along with the type of node. The linear format lends itself to being spoken, especially if processed a bit to say things like “*a*^2” as “a squared” in the current natural language. The first kind of tree corresponds to the traditional math layout used in documents, while the second kind corresponds to the mathematical semantics. Accordingly, we call the first kind a *display tree* and the second a *semantic tree*.

More specifically, the first kind of tree represents the way TeX and Microsoft Office applications display mathematical text. Mathematical layout entities such as fractions, integrals, roots, subscripts and superscripts are represented by nodes in trees. But binary and relational operators that don’t require special typography other than appropriate spacing are included in text nodes. The display tree for the equation above is

Note that the invisible times between the leading fraction and the integral isn’t displayed and the expression *a*+*b* sin*θ* is displayed as a text node *a*+*b* followed by a function-apply node sin*θ*, without explicit nodes for the + and the invisible times.

To navigate through the *a*+*b* and into the fractions and integral, one can use the usual text left and right arrows or their braille equivalents. One can navigate through the whole equation with these arrow keys, but it’s helpful also to have tree navigation keys to go between sibling nodes and up to parent nodes. For the sake of discussion, let’s suppose the tree navigation hot keys are those defined in the table

Ctrl+→ | Go to next sibling
Ctrl+← | Go to previous sibling
Home | Go to parent position ahead of current child
End | Go to parent position after current child

For example, starting at the beginning of the equation, Ctrl+→ moves past the leading fraction to the integral, whereas → moves into the numerator of the leading fraction. Starting at the beginning of the upper limit, Home goes to the insertion point between the leading fraction and the integral, while End goes to the insertion point in front of the equal sign. Ctrl+→ and Ctrl+← allow a user to scan an equation rapidly at any level in the hierarchy. After one of these hot keys is pressed, the linear format for the object at the new position can be spoken in a fashion quite similar to ClearSpeak. When the user finds a position of interest, s/he can use the usual input methods to delete and/or insert new math text.
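A display-tree node with this kind of sibling/parent navigation can be sketched as follows; the node layout is illustrative, not an actual math layout engine’s:

```cpp
#include <memory>
#include <string>
#include <vector>

// Each node carries its linear-format text; Ctrl+Right maps to NextSibling()
// and Home/End map to moving up to Parent().
struct Node {
    std::string text;                  // linear-format text of this node
    Node* parent = nullptr;
    std::vector<std::unique_ptr<Node>> children;

    Node* AddChild(std::string t)
    {
        children.push_back(std::make_unique<Node>());
        children.back()->text = std::move(t);
        children.back()->parent = this;
        return children.back().get();
    }
    Node* NextSibling() const          // Ctrl+Right
    {
        if (!parent) return nullptr;
        auto& sibs = parent->children;
        for (size_t i = 0; i + 1 < sibs.size(); i++)
            if (sibs[i].get() == this)
                return sibs[i + 1].get();
        return nullptr;                // last sibling: nowhere to go
    }
    Node* Parent() const { return parent; } // Home/End target level
};
```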

Now consider the semantic tree, which allocates nodes to all binary and relational operators as well as to fractions, integrals, etc.

The semantic tree has two drawbacks: 1) it’s bigger and requires more keystrokes to navigate, and 2) it requires a Polish-prefix mentality. Some people have such a mentality, perhaps from having used HP calculators, and prefer it. But it’s definitely an acquired taste, and it doesn’t correspond to the way that mathematics is conventionally displayed and edited. Accordingly, the display tree seems significantly better for blind reading and editing, as well as for sighted editing.

Both kinds of trees include nodes defined by the OMML entities listed in the following table along with the corresponding MathML entities

Built-up Office Math Object | OMML tag | MathML
Accent | acc | mover/munder
Bar | bar | mover/munder
Box | box | menclose (approx)
BoxedFormula | borderBox | menclose
Delimiters | d | mfenced
EquationArray | eqArr | mtable (with alignment groups)
Fraction | f | mfrac
FunctionApply | func | &ApplyFunction; (binary operator)
LeftSubSup | sPre | mmultiscripts (special case of)
LowerLimit | limLow | munder
Matrix | m | mtable
Nary | nary | mrow followed by n-ary mo
Phantom | phant | mphantom and/or mpadded
Radical | rad | msqrt/mroot
GroupChar | groupChr | mover/munder
Subscript | sSub | msub
SubSup | sSubSup | msubsup
Superscript | sSup | msup
UpperLimit | limUpp | mover
Ordinary text | r | mrow

MathML has additional nodes, some of which involve infix parsing to recognize, e.g., integrals. The OMML entities were defined for typographic reasons, since they require special display handling. Interestingly, the OMML entities also include useful semantics, such as identifying integrals and trigonometric functions without special parsing.
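A few rows of the table can be captured in a simple lookup; this is hypothetical code for illustration only (and for acc/bar, mover vs munder depends on the accent position):

```cpp
#include <map>
#include <string>

// Map a subset of OMML tags to their closest MathML equivalents,
// following the table above.
const std::map<std::string, std::string>& OmmlToMathML()
{
    static const std::map<std::string, std::string> m = {
        {"f",       "mfrac"},
        {"rad",     "msqrt/mroot"},
        {"sSub",    "msub"},
        {"sSup",    "msup"},
        {"sSubSup", "msubsup"},
        {"limLow",  "munder"},
        {"limUpp",  "mover"},
        {"m",       "mtable"},
        {"r",       "mrow"},
    };
    return m;
}
```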

In summary, math zones can be made accessible using display trees for which the node contents are spoken in the localized linear format and navigation is accomplished using simple arrow keys, Ctrl+arrow keys, and the Home and End keys, or their braille equivalents. Arriving at any particular insertion point, the user can hear or feel the math text and can edit the text in standard ways.

I’m indebted to many colleagues who helped me understand various accessibility issues and I benefitted a lot from attending the Benetech Math Code Sprint.

]]>This post deals with a problem I’ve had that doesn’t occur with RichEdit font binding but does happen in Word and Outlook. Often I need to document a particular Unicode character, such as ⬚ (U+2B1A), which is used as a placeholder in empty math objects, or <the new blog editor can’t handle U+20000>, U+20000, which is the first Unicode plane-2 character. To enter such characters, I type the Unicode hex value followed by alt+x, as described in the post Entering Unicode Characters. If you do this in WordPad (which uses RichEdit) and continue typing, the font changes from Calibri to Cambria Math for ⬚ and to SimSun-ExtB for U+20000, and then switches back to Calibri for the subsequent text.

But in Outlook and Word, the font switches to these other fonts and then continues to use them as long as the new font has the characters you type. The problem is that ASCII letters are supported in the vast majority of fonts, so the rule “stick with the current font as long as it supports the characters” is insufficient for proper font binding. You can work around this error by using the handy Format Painter tool on the Home tab to restore the original font to subsequent text, or more easily by typing the text on both sides of the character’s hex code first and only then typing the alt+x hot key after the hex code.

Interestingly, if you paste a plain-text string containing such characters into Word, e.g., from NotePad, only the fonts for the special characters change. But relying on NotePad for entering such mixed text isn’t practical since NotePad doesn’t support alt+x. PowerPoint doesn’t have the problem since it doesn’t support alt+x either, sigh. (We might add the alt+x hot key to PowerPoint someday…)

There are a couple of ways to avoid this pitfall. RichEdit has the CHARFORMAT2 attribute CFE_FONTBOUND, which marks a run as being font bound when a different font is used to display a character. As such, the font-bound font has lower priority for subsequent font binding than the previous font. Also, if the font fix-up occurs just as the text is input into the RichEdit backing store, it doesn’t change the selection’s current font. Both of these choices result in the font being restored to the previous font after a special character is font bound.
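The lower-priority rule can be sketched like this; the structures and selection logic are illustrative, not RichEdit’s implementation:

```cpp
#include <string>
#include <vector>

// A run that was font bound (the CFE_FONTBOUND case) has lower priority
// than the previous font when choosing a font for newly typed text.
struct FontRun {
    std::string font;
    bool fontBound;    // set when a fallback font was substituted
};

// Prefer the most recent run that was *not* font bound; fall back to the
// last run (or a default) only if every run was font bound.
std::string FontForNewText(const std::vector<FontRun>& runs)
{
    for (auto it = runs.rbegin(); it != runs.rend(); ++it)
        if (!it->fontBound)
            return it->font;
    return runs.empty() ? std::string("Calibri") : runs.back().font;
}
```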

Another problem with Word’s font binding is that it switches to SimSun if you enter a right arrow like → (U+2192). This is annoying, especially since Calibri and most other Latin fonts support the simple arrows ←↑→↓, so no font binding is needed. This font switch occurs for both alt+x entry and pasting. But at least the font switches back to the Latin font after the arrow symbol is stored. Hopefully we’ll fix these problems before too long.

RichEdit font binding is overruled in the XAML edit controls, TextBox and RichEditBox, partly to maintain consistency with the companion TextBlock and RichTextBlock controls. A similar consistency is desired in Excel spreadsheets. A future post will describe how these approaches work.

]]>

First, here’s an example of script and calligraphic F’s being used in the same document:

And here are examples featuring P’s and C’s in which script letters denote infinity categories

Accordingly the need for both script and calligraphic alphabets is attested.

Let’s turn now to the unfortunate fact that the current math script alphabets may be fancy script in one font and calligraphic in another. Cambria Math, the first widely used Unicode math font, has calligraphic letters at the math script code points, while STIX and the Unicode Standard have fancy script letters at those code points. For example, here’s the upper-case math script H (U+210B) in Cambria Math followed by the one in STIX:

We really can’t change Cambria Math’s math script alphabet choice at this late stage in computing history; too many documents use it. Consequently, it isn’t sufficient to add only bold and regular calligraphic alphabets and expect the current bold and regular script alphabets to fulfill the need for bold and regular math script alphabets: the latter are deliberately ambiguous with respect to calligraphic versus script.

There are two unambiguous ways to allow math script and math calligraphic symbols to appear in the same plain text document:

1) Follow a character in the current math script alphabets with one of two variation selectors, similar to the way variation selectors (U+FE0E, U+FE0F) are used with emoji to force text and emoji glyphs, respectively. Specifically, to ensure use of the math calligraphic alphabet, follow the current math script letter with U+FE00. To ensure use of the math fancy script alphabet, follow the current math script letter with U+FE01.

2) Add four new unambiguous math alphabets: bold and regular, fancy script and calligraphic, leaving the current math script alphabets as ambiguous.
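Option 1 amounts to appending a variation selector to the script letter’s code point. A small sketch (the helper name is made up):

```cpp
#include <string>

// Build a variation sequence: append U+FE00 (force calligraphic) or
// U+FE01 (force fancy script) to a math script letter such as
// U+210B (script capital H).
std::u32string MathScriptVariant(char32_t scriptLetter, bool calligraphic)
{
    std::u32string s(1, scriptLetter);
    s += calligraphic ? char32_t(0xFE00) : char32_t(0xFE01);
    return s;
}
```

A font that doesn’t support the sequence simply renders the base letter, which is the graceful fallback listed as advantage e) above.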

The variation selector choice has the advantages

a) Contemporary software supports variation sequences for East Asia and emoji, so adding new variation sequences shouldn’t be much of a burden

b) The variation selector U+FE00 is already used with a number of math operators

c) No new code points need to be allocated

d) Typical documents can continue to do what they have been doing: ignore the distinction

e) If a math font doesn’t support the variation sequences, it falls back naturally to the current script/calligraphic letters instead of displaying the missing-glyph box

These advantages, together with the fact that the majority of documents don’t require a script/calligraphic distinction, seem to make the variation selector approach preferable. Adding two variation selectors for the math script letters may make people ask why the math alphabets weren’t implemented with variation selectors in the first place. They were considered, but the Unicode Technical Committee was concerned that people might misuse them to encode rich-text properties, which are not the domain of plain text. Adding two variation selectors seems to solve the present calligraphic quandary quite well, although variation selectors are generally a poor fit for situations where symbol shapes need to be used contrastively. This case should therefore not serve as a general precedent but should be seen as an exception, tailored to fit this specific case.

In fact, LaTeX has the \mathsf{} and \mathsfit{} control words for math sans-serif upright and italic characters, respectively, and they work with Greek letters. Unlike the calligraphic/script distinction, which is seldom used contrastively, upright and italic are usually used contrastively in mathematics. Unicode has normal-weight upright and italic sans-serif math alphabets corresponding to the ASCII letters, but not for the Greek letters. Accordingly, these two math Greek alphabets will probably be added, perhaps in the range U+1D380..U+1D3FF. This range has been reserved for math alphanumeric symbols and immediately precedes the Mathematical Alphanumeric Symbols block at U+1D400..U+1D7FF.

It might also be worthwhile for programs like Word to have a math document-level property that specifies which script/calligraphic alphabet to use for the whole document. Then a user who wants the fancy script glyphs could get them without making any changes except for choosing the desired document property setting. A similar setting could be used for choosing sans serif alphabets as the default. It appears such alphabets are often used in chemical formulas.

The choice of calligraphic glyphs for the math script letters in Cambria Math is partly my fault. I had expected to see fancy script letters in Cambria Math as in the Unicode code charts. In my physics career I used math script letters a lot, starting with my PhD thesis on laser theory (1967) and followed by many published papers in the Physical Review and elsewhere and in my three books on lasers and quantum optics. Occasionally in a review article, calligraphic letters were substituted for the fancy script letters because the publishers didn’t have the latter. And in the early days, the IBM Selectric Script ball and the script daisy wheels only had calligraphic letters. So I kind of got used to this substitution.

In addition, Cambria Math was designed partly to look really good on screens, which didn’t have the resolution to display the narrow stem widths of Times New Roman and fancy script letters well. ClearType rendering certainly helped, but it seemed like a good idea to use less resolution-demanding calligraphic letters. (Later, Word 2013 disabled ClearType for various reasons, and many readers of this blog have complained passionately ever since! With high-resolution screens as on my Samsung laptop and the Surface Book, even Times New Roman looks crisp and nice with only gray-scale antialiasing, so hopefully this problem will diminish in time.) In contrast, it’s appropriate that the STIX font, based on Times Roman with its narrow glyph stems, has the fancy script glyphs. With the mechanism described here, people could use calligraphic and script letters contrastively in the same document (assuming the fonts add the missing glyphs).

]]>