Math Accessibility Trees


This post discusses some aspects of making mathematical equations accessible to blind people. Presumably equations that are simple typographically, such as E = mc², are accessible with the use of standard left and right arrow key navigation and with each variable and two-dimensional construct being spoken when the insertion point is moved to them. At any particular insertion point, the user can edit the equation using the regular input methods, perhaps based on the linear format and Nemeth Braille or Unified English Braille keyboards. But it can be hard to follow a more typographically complex equation, let alone edit it. Instead, the user needs to be able to navigate such an equation using a mathematical tree of the equation.

More than one kind of tree is possible and this post compares two possible kinds using the equation

integral

We label each tree node with its math text in the linear format along with the type of node. The linear format lends itself to being spoken especially if processed a bit to say things like “a^2” as “a squared” in the current natural language. The first kind of tree corresponds to the traditional math layout used in documents, while the second kind corresponds to the mathematical semantics. Accordingly we call the first kind a display tree and the second a semantic tree.

More specifically, the first kind of tree represents the way TeX and Microsoft Office applications display mathematical text. Mathematical layout entities such as fractions, integrals, roots, subscripts and superscripts are represented by nodes in trees. But binary and relational operators that don’t require special typography other than appropriate spacing are included in text nodes. The display tree for the equation above is

DisplayTree1

Note that the invisible times between the leading fraction and the integral isn’t displayed and the expression a+b sinθ is displayed as a text node a+b followed by a function-apply node sinθ, without explicit nodes for the + and the invisible times.

To navigate through the a+b and into the fractions and integral, one can use the usual text left and right arrows or their braille equivalents. One can navigate through the whole equation with these arrow keys, but it’s helpful also to have tree navigation keys to go between sibling nodes and up to parent nodes. For the sake of discussion, let’s suppose the tree navigation hot keys are those defined in the table

Ctrl+→ Go to next sibling
Ctrl+← Go to previous sibling
Home Go to parent position ahead of current child
End Go to parent position after current child

For example starting at the beginning of the equation, Ctrl+→ moves past the leading fraction to the integral, whereas → moves into the numerator of the leading fraction. Starting at the beginning of the upper limit, Home goes to the insertion point between the leading fraction and the integral, while End goes to the insertion point in front of the equal sign. Ctrl+→ and Ctrl+← allow a user to scan an equation rapidly at any level in the hierarchy. After one of these hot keys is pressed, the linear format for the object at the new position can be spoken in a fashion quite similar to ClearSpeak. When the user finds a position of interest, s/he can use the usual input methods to delete and/or insert new math text.

Now consider the semantic tree, which allocates nodes to all binary and relational operators as well as to fractions, integrals, etc.

SemanticTree1

The semantic tree has two drawbacks: 1) it’s bigger and requires more key strokes to navigate and 2) it requires a Polish-prefix mentality. Some people have such a mentality, perhaps having used HP calculators, and prefer it. But it’s definitely an acquired taste and it doesn’t correspond to the way that mathematics is conventionally displayed and edited. Accordingly the display tree seems significantly better for blind reading and editing, as well as for sighted editing.

Both kinds of trees include nodes defined by the OMML entities listed in the following table along with the corresponding MathML entities

Built-up Office Math Object OMML tag MathMl
Accent         acc mover/munder
Bar bar mover/munder
Box         box menclose (approx)
BoxedFormula     borderBox menclose
Delimiters d mfenced
EquationArray eqArr mtable (with alignment groups)
Fraction f mfrac
FunctionApply func &FunctionApply; (binary operator)
LeftSubSup sPre mmultiscripts (special case of)
LowerLimit limLow munder
Matrix m mtable
Nary nary mrow followed by n-ary mo
Phantom phant mphantom and/or mpadded
Radical rad msqrt/mroot
GroupChar groupChr mover/munder
Subscript sSub msub
SubSup sSubSup msubsup
Superscript sSup msup
UpperLimit limUpp mover
Ordinary text r mrow

 

MathML has additional nodes, some of which involve infix parsing to recognize, e.g., integrals. The OMML entities were defined for typographic reasons since they require special display handling. Interestingly the OMML entities also include useful semantics, such as identifying integrals and trigonometric functions without special parsing.

In summary, math zones can be made accessible using display trees for which the node contents are spoken using in the localized linear format and navigation is accomplished using simple arrow keys, Ctrl arrow keys, and the Home and End keys, or their Braille equivalents. Arriving at any particular insertion point, the user can hear or feel the math text and can edit the text in standard ways.

I’m indebted to many colleagues who helped me understand various accessibility issues and I benefitted a lot from attending the Benetech Math Code Sprint.


Comments (0)

Skip to main content