Microsoft Office Math Speech


Microsoft Office math-aware applications can now speak math in over 18 different languages! Try it out with native math zones in Word by enabling Narrator (type CapsLock + Enter) and navigate a math zone as described in the post Speaking of math… There are two math-speech granularities: coarse-grained (navigate by words), which speaks math expressions fluently in a natural language, and fine-grained (navigate by characters), which explains the content at the insertion point (IP) in sufficient detail to enable editing. I can turn off the computer screen and use a keyboard to edit complicated equations accurately by listening to the math speech. Math speech works for all math zones and doesn’t need extra editing by the document author(s). As of this post, Office math speech has been shipping for over a month on Windows, and Word’s math speech, in particular, has already gotten a lot of use. Note that this math facility is built into Office applications (type Alt+= to insert a math zone) and differs from MathType, which can also be used with Office applications.

Coarse-grained speech isn’t tightly synchronized with the characters in memory and cannot be used directly for editing. It’s relatively independent of the in-memory math model. In contrast, fine-grained speech is tightly synchronized with the characters in memory and is ideal for editing. It depends on the built-up math model (“Presentation Math”), which is the same for all Microsoft math-aware products but may differ from the models of other math products.

Coarse-grained navigation between siblings at a given math nesting level is done with Ctrl+→ and Ctrl+← or Braille equivalents, while fine-grained navigation is done with → and ← or equivalents. The latter lets the user traverse every character in a math zone.

Two special cases are 1) when the IP is directly before the math zone being queried by UIA and 2) when the IP is still in the range’s math zone, but at the end. For 1), the user needs to know that anything typed won’t go into the math zone. Typing then puts the IP into the math zone and typing enters characters inside the math zone. For 2), the user needs to know that the IP is at the end of the math zone and still in the math zone. Case 1) returns “equation” followed by the speech for the math zone. Case 2) returns “end equation”. (Since many math zones aren’t equations, this choice of words might be a little misleading sometimes, but hopefully not too much so.)
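The two special cases can be summarized in a small sketch. It is not the Office implementation: the `MathZone` struct, the function `DescribeInsertionPoint`, and the use of plain character positions are assumptions made for illustration. Only the spoken prefixes “equation” and “end equation” come from the description above.

```cpp
#include <string>

// Hypothetical model of a math zone, for illustration only.
struct MathZone {
    long start;          // position directly before the zone's first character
    long end;            // position at the end of the zone, still inside it
    std::string speech;  // speech already produced for the whole zone
};

// Returns the text announced for an insertion point at position ip,
// covering only the two special cases described in the text.
// Any other position yields an empty string (handled elsewhere).
std::string DescribeInsertionPoint(const MathZone& zone, long ip) {
    if (ip == zone.start) return "equation " + zone.speech;  // case 1
    if (ip == zone.end) return "end equation";               // case 2
    return std::string();
}
```

In this model, an IP at `zone.start` is announced with the whole zone’s speech so the user knows a math zone follows, while an IP at `zone.end` is announced only as the end of the zone.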

The languages with math speech support include Danish (da-DK), German (de-DE), English (en-US), Spanish (es-ES), Finnish (fi-FI), French (fr-FR), Italian (it-IT), Japanese (ja-JP), Korean (ko-KR), Norwegian (nb-NO), Dutch (nl-NL), Polish (pl-PL), Brazilian Portuguese (pt-BR), Portugal Portuguese (pt-PT), Russian (ru-RU), Swedish (sv-SE), Turkish (tr-TR), PRC Chinese (zh-CN), and Taiwan Chinese (zh-TW).

Producing Math Speech

Math speech is produced by “building down to speech”, sharing the code and concepts of building down “Presentation Math” to UnicodeMath. This approach creates math speech just as fast as it creates UnicodeMath and is faster than representing math zones in other math formats like MathML. A string of language tokens is created and then converted to the active natural language.
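As a rough illustration of the last step, the sketch below turns a string of language-neutral tokens into words for one language. The token names, the two tiny word tables (including their wording), and the helpers `Tok`, `Lit`, and `TokensToSpeech` are all hypothetical; the actual tokens and localized strings used by Office are not described here.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical language-neutral tokens; the real token set is not shown in this post.
enum class Token { StartFraction, Over, EndFraction, Plus, Equals };

// One element of the token string: either a token or a literal such as "a".
struct Piece {
    bool isLiteral;
    Token token;
    std::string literal;
};
Piece Tok(Token t) { return Piece{false, t, std::string()}; }
Piece Lit(const std::string& s) { return Piece{true, Token::Plus, s}; }

using WordTable = std::map<Token, std::string>;

// Illustrative word tables keyed by locale tag (wording made up for this example).
const std::map<std::string, WordTable>& Tables() {
    static const std::map<std::string, WordTable> tables = {
        {"en-US", {{Token::StartFraction, "start fraction"},
                   {Token::Over, "over"},
                   {Token::EndFraction, "end fraction"},
                   {Token::Plus, "plus"},
                   {Token::Equals, "equals"}}},
        {"de-DE", {{Token::StartFraction, "Anfang Bruch"},
                   {Token::Over, "durch"},
                   {Token::EndFraction, "Ende Bruch"},
                   {Token::Plus, "plus"},
                   {Token::Equals, "gleich"}}},
    };
    return tables;
}

// Converts the token string to words in the given language.
// Returns an empty string if this sketch has no table for the locale.
std::string TokensToSpeech(const std::vector<Piece>& pieces,
                           const std::string& locale) {
    const auto& tables = Tables();
    auto it = tables.find(locale);
    if (it == tables.end()) return std::string();
    std::string out;
    for (const Piece& p : pieces) {
        if (!out.empty()) out += ' ';
        out += p.isLiteral ? p.literal : it->second.at(p.token);
    }
    return out;
}
```

The point of the design is that the token string is built once from the math zone, and only the final table lookup depends on the active language.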

On a technical level, math speech is implemented in the RichEdit DLL (Office’s riched20.dll) by the GetMathSpeechText function, which has the prototype

HRESULT GetMathSpeechText (ITextRange2 *prg, BSTR *pbstr, LONG Flags)

Coarse-grained math speech is returned in *pbstr if the range prg selects more than one character, while fine-grained speech is returned if prg references an insertion point or selects only one character. GetMathSpeechText() uses the same subset of ITextRange2 methods used by MathBuildDown() and hence can be used by all Microsoft Office math-aware applications on all major platforms (Windows, iOS, Mac, and Android). Key methods include ITextRange2::GetChar2() to fetch individual characters from memory and ITextRange2::GetInlineObject() to find out what kinds of math objects are in memory.
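The granularity rule can be expressed compactly. The sketch below is a portable stand-in, not the RichEdit code: `Granularity`, `ChooseGranularity`, and the use of plain start and end character positions in place of an ITextRange2 are assumptions for illustration.

```cpp
enum class Granularity { Fine, Coarse };

// cpStart and cpEnd are the character positions bounding the range,
// with cpStart <= cpEnd. An insertion point has cpStart == cpEnd.
Granularity ChooseGranularity(long cpStart, long cpEnd) {
    const long count = cpEnd - cpStart;  // number of selected characters
    // More than one selected character: fluent, coarse-grained speech.
    // Insertion point or a single character: fine-grained speech for editing.
    return count > 1 ? Granularity::Coarse : Granularity::Fine;
}
```

A caller would therefore select the whole math zone to hear it read fluently, and collapse the range to the IP to hear editing-level detail.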

Exposing Math Speech to Assistive Technologies

Math speech is exposed to UI Automation clients via methods of the UIA interface ITextRangeProvider. So, in principle, any AT that uses these methods automatically gets math speech for math zones. Nevertheless, it’s desirable for ATs to know if math zones are involved. One approach is to identify math zones by a new, explicit UIA math-zone object or by a custom object with a localized name like “math zone”. But a more efficient approach that mirrors what’s in memory is to have a math-zone format attribute. Specifically, TextUnit_Format is one of the units supported by ITextRangeProvider::ExpandToEnclosingUnit and ITextRangeProvider::MoveEndpointByUnit. To find out an attribute, such as UIA_IsItalicAttributeId, of a TextUnit_Format instance, a client calls ITextRangeProvider::GetAttributeValue. ATs could know if a math zone is active if a new attribute ID, UIA_IsMathZoneAttributeId, were added to identify math zones. In fact, there is a new UIA math annotation type, AnnotationType_Mathematics, which can be retrieved by calling ITextRangeProvider::GetAttributeValue().


Comments (6)

  1. Mick P. _ says:

    Off-topic (feel free to moderate away)

    Murray, I don’t want to waste my time going through the regular channels. EM_SETWORDBREAKPROC in the new Windows 10 update introduces a “not responding” (infinite loop) bug if the callback simply returns the character-index it is given. For example, for Asian word-break processing, which wants to just disable it, if the callback doesn’t advance the cursor, (and process the codes) then the msftedit.dll module never returns from WM_CHAR. Versions past must have checked the returned value to see if it’s in bounds.

    1. Mick P. _ says:

      Update: Since the Creators Update (2017) or so. I discovered today that word-wrap ceased working with the callback I wrote to fix the problem around the Anniversary Update.

      It seems like Microsoft discovered the problem only after the full public service pack release, and has since fixed it. But I don’t understand how the callback I wrote as an alternative to simply returning the character position (which is a hack, but would work for Chinese style languages) has the effect of disabling word-wrap. (Maybe I was only happy that the program was not unresponsive and didn’t try to go as far as to trigger word-wrap, but I doubt it.)

      Below is the word-break procedure (set with EM_SETWORDBREAKPROC) I was using, which today disables word-wrap. There aren’t examples online anymore. Some discussions suggest there is a KB article 109551 with an example. Should the new RichEdit DLL disable word-wrap with the following code? I tried to write a procedure that didn’t return the current position, because that triggered the infinite loop. I hope that Microsoft has somehow back-ported this fix. If the Windows 10 updates are mandatory, hopefully people will no longer experience the Anniversary edition problem. Still, now there is possibly a new bug where this procedure (returning break on every character) can disable word-wrap altogether.

      switch(code)
      {
      //case WB_LEFT: return i;

      //want Japanese style wrapping (disable in other words)
      case WB_ISDELIMITER: return lpch[i]!='\r'; //paranoia

      //// rich edit codes ////

      //MSDN: How to Use Word and Line Break Information

      case 3/*WB_CLASSIFY*/: return 0x40; //WBF_BREAKAFTER

      //Windows 10 update is going “non responsive” without this.
      case 4: //WB_MOVEWORDLEFT
      case 6: //WB_LEFTBREAK
      case WB_LEFT: return i+1; //Left-to-Right

      case 5: //WB_MOVEWORDRIGHT
      case 7: //WB_RIGHTBREAK
      case WB_RIGHT: return i-1; //Right-to-Left
      }

      1. Mick P. _ says:

        I think that the code should have used WBF_BREAKLINE instead of WBF_BREAKAFTER.

        I might have got the WBF_BREAKAFTER from an example. And did not realize there were other break-modes.

        The difference in the current DLL (late 2017) would seem to be that if the text is all WBF_BREAKAFTER it will never wrap lines. Although I have seen it wrap after doing a paste operation, going back and changing lines above the pasted line, to then make them wrap.

        Perhaps “all WBF_BREAKAFTER” will only appear in procedures that want to always break and don’t know WBF_BREAKLINE is more appropriate. That could be a wide net of breaking programs.

      2. Mick P. _ says:

        PLEASE DISREGARD/DELETE MY “WBF_BREAKLINE” SUGGESTION. ALSO the nonresponsive bug is still with us.

        The Anniversary bug kicks in on word-wrap. Not input. Still stuck 🙁

        (I misspoke because I built a test and thought I observed the results, but it was a stale build or something else.)

        1. Mick P. _ says:

          WHAT WORKS. (I apologize for so many comments. There’s no edit function.)

          After Windows 10 Creators Update…

          I changed WB_ISDELIMITER to always return 0, or FALSE. This is counter-intuitive for making all characters break. It’s probably a new bug. I’m confident it is, because the program would freeze at the end of the line, so I would’ve had to enter text to the end of the line to confirm it wasn’t freezing, and if it wasn’t wrapping then, my work would’ve been unfinished.

          And for good measure, I made WB_CLASSIFY return all 3 break mode flags. That’s probably not necessary. But it’s confusing what’s what.

          (My goal is to get a box that treats all characters equally. Because it’s a fixed width box, and breaks give users unexpected results.)

          1. MurrayS3 says:

            Thanks for the reports. We’ll check into the problems.
