What happened to the Arial Unicode MS font?


A customer wanted to know what happened to the Arial Unicode MS font. It used to be installed with Microsoft Office, but was dropped starting with Microsoft Office 2016. Various conspiracy theories have developed as to why.

Wikipedia cites a Microsoft forum answer that says:

“When Microsoft included Arial Unicode MS with earlier versions of Office, Microsoft paid a licensing fee to The Monotype Corporation, which is the copyright holder for the font. Someone at Microsoft decided it was no longer worthwhile to continue paying that fee, so it was removed from the Office package. The price you see now is what Fonts.com charges for a single copy.”

The fonts team provided the correct answer: Arial Unicode MS was included with Office at a time when applications did not handle font fallback properly. It was essentially a last-resort font that someone could specify and get relatively reasonable results. As Unicode grew in size, it became clear that the font could not hold every defined character, and it also became difficult to maintain because its size pushed various tables to their maximum. Adding new characters became impractical. Consequently, it was deprecated, since it could no longer perform the job it was originally created for.

As for the font ownership: The forum answer has it backward. The font is wholly owned by Microsoft, which licenses it to Monotype.

Comments (17)
  1. DWalker07 says:

    Even though the linked thread is locked, it’s a Microsoft forum page. It might be helpful if someone at MS could post the actually-correct answer there, since there are still people new to Office 2016 who might find that incorrect answer.

  2. D.R. says:

    Have you updated the Wikipedia article? :-)

  3. Martin says:

    >…it became clear that the font could not hold every defined character…
    Is there any font that really defines all Unicode characters?

    1. Jim says:

      Not that I’m aware of, but check out Google’s Noto: https://www.google.com/get/noto/

      1. In fairness, that’s a font family, not a single font file. That’s pretty similar to Microsoft’s approach, though perhaps in a bit more unified fashion at the expense of style diversity.

        1. kode54 says:

          A single font file may only contain 65,535 glyphs anyway, no matter their code points.

  4. Brian says:

    I’m tempted to update Wikipedia, using this blog as a source. However, everyone knows that Raymond is the authoritative expert only on Windows XP, and then, only in France. :)

  5. Sam says:

    What’s impressive is that nobody on the forum suggested sfc /scannow to restore the font.

  6. Ken Hagan says:

    “Arial Unicode MS was included with Office at a time when applications did not handle font fallback properly.”

    So do (nearly) all applications now handle it properly? What are they doing now that they didn’t then, and when did they change? I ask because the way that I render text hasn’t changed in 25 years and I’m worried that Mr Petzold might not have taught me the modern way.

    1. Joshua says:

      The system libraries do it now.

  7. Yuhong Bao says:

    I wonder how much of the Unicode BMP it covered at least.

  8. Mark Ransom says:

    At least we now know who to blame for the ugliness that is Arial.

    I’m glad Microsoft didn’t stop there and kept creating new fonts.

  9. richardguk says:

    ARIALUNI.TTF as of version 1.01 (22,731 KB, 2002, RIP) reports 50,377 glyphs (of the theoretical maximum of 65,535), with some individual stylistic alternate glyphs (OpenType feature ‘salt’).

    arial.ttf as of version 6.98 (1,004 KB, 2017) reports 4,453 glyphs and, as well as individual stylistic alternates, includes 3 alternative stylistic sets (accessible in CSS as “{font-family: ‘Arial’; font-feature-settings: ‘ss01’}” etc.):

    ss01, featuring a seriffed uppercase letter “I” and curve-tailed lowercase “l”

    ss02, featuring unicase letters: lowercase-shaped uppercase letters “A”, “E”, “M”, “N” and “U” (but all with uppercase height), with the lowercase letters “a” to “z” all identical in shape and size to their uppercase equivalents

    ss03, featuring that seriffed uppercase “I” again, a symmetrical “K”, a straight-tailed “Q”, a single-storey “a”, a curveless-descender “j”, a more symmetrical “k”, curve-tailed lowercase letters “l” (again) and “q”, a tailless “t”, a curved “y” (to match the style of the “g”) and digits “6” and “9” with their tails uncurved

    Who needs easter eggs when you have Arial?

    Arial Black similarly has 3 alternative stylistic sets. Other Windows fonts with alternative stylistic sets include Comic Sans MS (3 sets), Trebuchet MS (4 sets), Verdana, Segoe UI and Neue Haas Grotesk Text Pro (which is the Helvetica-equivalent included with Pan-European Supplemental Fonts under Windows 10 optional features).
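
For anyone who wants to check glyph counts like these, the number is stored in the font's 'maxp' table as an unsigned 16-bit field, which is also where the 65,535 ceiling comes from. Below is a minimal Python sketch that reads it using only the standard library. It assumes a single-font .ttf/.otf file (not a .ttc collection, which has a different header) and does no validation; the function name is just for illustration.

```python
import struct

def num_glyphs(font_bytes):
    """Return numGlyphs from the 'maxp' table of a single TrueType/OpenType font."""
    # Offset table: sfnt version (4 bytes), numTables (uint16),
    # then three more uint16 fields, for 12 bytes in total.
    num_tables = struct.unpack_from(">H", font_bytes, 4)[0]
    # The table directory follows; each 16-byte record holds
    # a 4-byte tag, a checksum, an offset and a length.
    for i in range(num_tables):
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sIII", font_bytes, 12 + 16 * i
        )
        if tag == b"maxp":
            # maxp starts with a 4-byte version, then numGlyphs (uint16).
            return struct.unpack_from(">H", font_bytes, offset + 4)[0]
    raise ValueError("no maxp table found")

# Usage, with a path of your choosing:
#   with open(path_to_font, "rb") as f:
#       print(num_glyphs(f.read()))
```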

    1. Lawrence says:

      The theoretical maximum number of Unicode glyphs is actually 1,114,112

      1. OP was referring to the theoretical maximum number of glyphs in a single TrueType or OpenType font file, which is 65535.
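
The two numbers in this exchange measure different things, and the arithmetic is short enough to sketch in Python. The 1,114,112 figure is the size of the Unicode code space (17 planes of 65,536 code points, U+0000 through U+10FFFF), while 65,535 is the most glyphs one font file can declare. The last calculation is only a rough lower bound under the unrealistic assumption that every code point needs exactly one glyph; many code points are unassigned or are not characters at all.

```python
import sys

PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000  # 65,536

# Size of the Unicode code space, U+0000 through U+10FFFF.
total_code_points = PLANES * CODE_POINTS_PER_PLANE

# The glyph count in a font file is an unsigned 16-bit field.
max_glyphs_per_font = 0xFFFF  # 65,535

def fonts_needed(n_glyphs, per_font=max_glyphs_per_font):
    """Minimum number of font files needed to hold n_glyphs (ceiling division)."""
    return -(-n_glyphs // per_font)

print(total_code_points)                        # 1114112
print(total_code_points == sys.maxunicode + 1)  # True
print(fonts_needed(total_code_points))          # 18
```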

  10. Do any other fonts exist in Windows/Office that cover a reasonably large part of Unicode?

    1. richardguk says:

      Arial Unicode MS grew so large because, as well as including many early Unicode symbols, it included over 30,000 ideographs for Chinese (Simplified or Traditional), Japanese and Korean characters. Unicode opted to encode these four CJK scripts as one combined group (the “Han unification” controversy) even though the characters would generally appear somewhat differently in each language. Ideographic characters also tend to require displaying in larger sizes than simpler symbols and alphabetical characters and may need vertical layout information that general fonts lack. So more modern fonts do not usually try to encode extensive language-dependent CJK character sets alongside more neutral symbols and alphabetical characters. Similarly, right-to-left scripts with complex ligatures such as Arabic are hard for non-specialist typographers to support. So many recent fonts have good Unicode coverage for general and technical purposes even if they lack the tens of thousands of ideographs and layout information that are essential for specific languages.

      Lucida Sans Unicode (l_10646.ttf) was another early font to include many non-CJK characters (1,779 glyphs as of version 5.00, 2006), bundled with Windows since NT 3.5 and Windows 98. Nowadays, many common Windows and Office fonts have a similar number of characters, since accent combinations, Greek and Cyrillic characters and numerous symbols are often included, along with hidden features such as old-style and proportional numerals.

      More recent versions of Windows have included many separate fonts designed to work well with particular CJK or other non-Latin-based scripts, taking account of local layout rules and optimal sizing. But aside from those, Windows 10 fonts that contain large numbers of characters include Segoe UI (5,256 glyphs), Segoe UI Emoji (12,553 glyphs) and Segoe UI Historic (4,599 glyphs). MS Office includes a combined .ttc font collection file for Cambria (5,271 glyphs) and Cambria Math (6,892 glyphs, with extra info for mathematical layout).

